AWS Batch Jobs

What is batch computing?

Batch computing means running jobs asynchronously and automatically, across one or more computers.

What is AWS Batch?

AWS Batch enables developers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as Amazon EC2 and Spot Instances.

Why use AWS Batch?

  • Fully managed infrastructure – No software to install or servers to manage. AWS Batch provisions, manages, and scales your infrastructure.
  • Integrated with AWS – Natively integrated with the AWS platform, AWS Batch jobs can easily and securely interact with services such as Amazon S3, DynamoDB, and Rekognition.
  • Cost-optimized Resource Provisioning – AWS Batch automatically provisions compute resources tailored to the needs of your jobs using Amazon EC2 and EC2 Spot.

AWS Batch Concepts

  • Jobs
  • Job Definitions
  • Job Queue
  • Compute Environments

Jobs

Jobs are the unit of work executed by AWS Batch as containerized applications running on Amazon EC2. Containerized jobs can reference a container image, command, and parameters, or users can simply provide a .zip file containing their application and AWS Batch will run it on a default Amazon Linux container.

$ aws batch submit-job --job-name poller --job-definition poller-def --job-queue poller-queue

Job Dependencies

Jobs can express a dependency on the successful completion of other jobs or specific elements of an array job.

Use your preferred workflow engine and language to submit jobs. Flow-based systems simply submit jobs serially, while DAG-based systems submit many jobs at once, identifying inter-job dependencies.

Jobs run in approximately the same order in which they are submitted as long as all dependencies on other jobs have been met.

$ aws batch submit-job --depends-on 606b3ad1-aa31-48d8-92ec-f154bfc8215f …
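The same submission flow is available programmatically. Below is a minimal sketch using the AWS SDK for Java (v1, aws-java-sdk-batch); the "processor" job definition is a hypothetical second definition, added only to show dependsOn in action.

[code language="java"]
import com.amazonaws.services.batch.AWSBatch;
import com.amazonaws.services.batch.AWSBatchClientBuilder;
import com.amazonaws.services.batch.model.JobDependency;
import com.amazonaws.services.batch.model.SubmitJobRequest;
import com.amazonaws.services.batch.model.SubmitJobResult;

public class SubmitPollerJob {
    public static void main(String[] args) {
        AWSBatch batch = AWSBatchClientBuilder.defaultClient();

        // Submit the poller job (names match the CLI example above).
        SubmitJobResult poller = batch.submitJob(new SubmitJobRequest()
                .withJobName("poller")
                .withJobQueue("poller-queue")
                .withJobDefinition("poller-def"));

        // Submit a second job that is only released once the poller job succeeds.
        SubmitJobResult processor = batch.submitJob(new SubmitJobRequest()
                .withJobName("processor")
                .withJobQueue("poller-queue")
                .withJobDefinition("processor-def") // hypothetical job definition
                .withDependsOn(new JobDependency().withJobId(poller.getJobId())));

        System.out.println("poller jobId=" + poller.getJobId()
                + ", processor jobId=" + processor.getJobId());
    }
}
[/code]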

Job Definitions

Similar to ECS Task Definitions, AWS Batch Job Definitions specify how jobs are to be run. While each job must reference a job definition, many parameters can be overridden.

Some of the attributes specified in a job definition are:

  • IAM role associated with the job
  • vCPU and memory requirements
  • Mount points
  • Container properties
  • Environment variables

$ aws batch register-job-definition --job-definition-name gatk --container-properties …
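For reference, the same registration can be done with the AWS SDK for Java (v1). This is only a sketch; the ECR image, IAM role ARN and command below are placeholders.

[code language="java"]
import com.amazonaws.services.batch.AWSBatch;
import com.amazonaws.services.batch.AWSBatchClientBuilder;
import com.amazonaws.services.batch.model.ContainerProperties;
import com.amazonaws.services.batch.model.KeyValuePair;
import com.amazonaws.services.batch.model.RegisterJobDefinitionRequest;
import com.amazonaws.services.batch.model.RegisterJobDefinitionResult;

public class RegisterGatkJobDefinition {
    public static void main(String[] args) {
        AWSBatch batch = AWSBatchClientBuilder.defaultClient();

        RegisterJobDefinitionResult result = batch.registerJobDefinition(
                new RegisterJobDefinitionRequest()
                        .withJobDefinitionName("gatk")
                        .withType("container")
                        .withContainerProperties(new ContainerProperties()
                                .withImage("123456789012.dkr.ecr.us-east-1.amazonaws.com/gatk:latest") // placeholder ECR image
                                .withVcpus(4)
                                .withMemory(8192)                                                      // MiB
                                .withJobRoleArn("arn:aws:iam::123456789012:role/batch-job-role")       // placeholder IAM role
                                .withCommand("run-pipeline.sh", "Ref::inputFile")                      // Ref:: values come from job parameters
                                .withEnvironment(new KeyValuePair()
                                        .withName("STAGE").withValue("dev"))));

        System.out.println("Registered " + result.getJobDefinitionArn()
                + " revision " + result.getRevision());
    }
}
[/code]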

Job Queues

Jobs are submitted to a Job Queue, where they reside until they are able to be scheduled to a compute resource. Information related to completed jobs persists in the queue for 24 hours.

$ aws batch create-job-queue --job-queue-name genomics --priority 500 --compute-environment-order …
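The equivalent call through the AWS SDK for Java (v1) is sketched below; the compute environment name is a placeholder for one created as described in the next section.

[code language="java"]
import com.amazonaws.services.batch.AWSBatch;
import com.amazonaws.services.batch.AWSBatchClientBuilder;
import com.amazonaws.services.batch.model.ComputeEnvironmentOrder;
import com.amazonaws.services.batch.model.CreateJobQueueRequest;

public class CreateGenomicsQueue {
    public static void main(String[] args) {
        AWSBatch batch = AWSBatchClientBuilder.defaultClient();

        // Jobs from this queue are placed on the listed compute environments in order.
        batch.createJobQueue(new CreateJobQueueRequest()
                .withJobQueueName("genomics")
                .withPriority(500)
                .withState("ENABLED")
                .withComputeEnvironmentOrder(new ComputeEnvironmentOrder()
                        .withOrder(1)
                        .withComputeEnvironment("managedce"))); // placeholder compute environment name
    }
}
[/code]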

 

Compute Environments

Job queues are mapped to one or more Compute Environments containing the EC2 instances that are used to run containerized batch jobs.

Managed (Recommended) compute environments enable you to describe your business requirements (instance types, min/max/desired vCPUs, and EC2 Spot bid as x% of On-Demand), and AWS launches and scales resources on your behalf.

We can choose specific instance types (e.g. c4.8xlarge), instance families (e.g. C4, M4, R3), or simply choose “optimal”, and AWS Batch will launch appropriately sized instances from AWS’s more modern instance families.
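As a sketch of what such a managed compute environment looks like through the AWS SDK for Java (v1); all ARNs, subnets and security group IDs below are placeholders.

[code language="java"]
import com.amazonaws.services.batch.AWSBatch;
import com.amazonaws.services.batch.AWSBatchClientBuilder;
import com.amazonaws.services.batch.model.ComputeResource;
import com.amazonaws.services.batch.model.CreateComputeEnvironmentRequest;

public class CreateManagedComputeEnvironment {
    public static void main(String[] args) {
        AWSBatch batch = AWSBatchClientBuilder.defaultClient();

        batch.createComputeEnvironment(new CreateComputeEnvironmentRequest()
                .withComputeEnvironmentName("managedce")
                .withType("MANAGED")
                .withState("ENABLED")
                .withServiceRole("arn:aws:iam::123456789012:role/AWSBatchServiceRole")               // placeholder
                .withComputeResources(new ComputeResource()
                        .withType("SPOT")                                                            // or "EC2" for On-Demand only
                        .withBidPercentage(60)                                                       // Spot bid as % of On-Demand
                        .withSpotIamFleetRole("arn:aws:iam::123456789012:role/AmazonEC2SpotFleetRole") // placeholder
                        .withInstanceTypes("optimal")                                                // or "c4.8xlarge", "m4", ...
                        .withMinvCpus(0)
                        .withDesiredvCpus(0)
                        .withMaxvCpus(256)
                        .withInstanceRole("ecsInstanceRole")                                         // placeholder instance profile
                        .withSubnets("subnet-aaaa1111", "subnet-bbbb2222")                           // placeholders
                        .withSecurityGroupIds("sg-01234567")));                                      // placeholder
    }
}
[/code]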

Alternatively, we can launch and manage our own resources within an Unmanaged compute environment. Your instances need to include the ECS agent and run supported versions of Linux and Docker.

$ aws batch create-compute-environment --compute-environment-name unmanagedce --type UNMANAGED …

AWS Batch will then create an Amazon ECS cluster which can accept the instances we launch. Jobs can be scheduled to your Compute Environment as soon as the instances are healthy and register with the ECS Agent.

Job States

Jobs submitted to a queue can have the following states:

  • SUBMITTED: Accepted into the queue, but not yet evaluated for execution
  • PENDING: The job has dependencies on other jobs which have not yet completed
  • RUNNABLE: The job has been evaluated by the scheduler and is ready to run
  • STARTING: The job is in the process of being scheduled to a compute resource
  • RUNNING: The job is currently running
  • SUCCEEDED: The job has finished with exit code 0
  • FAILED: The job finished with a non-zero exit code or was cancelled or terminated.
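To observe these states from code (rather than the console), one option is to poll DescribeJobs with the AWS SDK for Java. Event-based monitoring via CloudWatch Events, covered later, is usually preferable, but a minimal polling sketch looks like this:

[code language="java"]
import com.amazonaws.services.batch.AWSBatch;
import com.amazonaws.services.batch.AWSBatchClientBuilder;
import com.amazonaws.services.batch.model.DescribeJobsRequest;
import com.amazonaws.services.batch.model.JobDetail;

public class WaitForJob {
    public static void main(String[] args) throws InterruptedException {
        String jobId = args[0];                       // ID returned by submitJob
        AWSBatch batch = AWSBatchClientBuilder.defaultClient();

        while (true) {
            JobDetail job = batch.describeJobs(new DescribeJobsRequest().withJobs(jobId))
                    .getJobs().get(0);
            System.out.println(job.getJobName() + " is " + job.getStatus());

            // SUCCEEDED and FAILED are the two terminal states.
            if ("SUCCEEDED".equals(job.getStatus()) || "FAILED".equals(job.getStatus())) {
                break;
            }
            Thread.sleep(30_000);                     // poll every 30 seconds
        }
    }
}
[/code]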

AWS Batch Actions

  • Jobs: SubmitJob, ListJobs, DescribeJobs, CancelJob, TerminateJob
  • Job Definitions: RegisterJobDefinition, DescribeJobDefinitions, DeregisterJobDefinition
  • Job Queues: CreateJobQueue, DescribeJobQueues, UpdateJobQueue, DeleteJobQueue
  • Compute Environments: CreateComputeEnvironment, DescribeComputeEnvironments, UpdateComputeEnvironment, DeleteComputeEnvironment

AWS Batch Pricing

There is no charge for AWS Batch. We only pay for the underlying resources we have consumed.

Use Case

Poller and Processor Service

Purpose

The poller service needs to run every hour, like a cron job, and submit one or more requests to a processor service, which has to launch the required number of EC2 instances, process files in parallel, and terminate the instances when done.

Solution

We plan to go with a serverless architecture instead of the traditional Beanstalk/EC2 approach, as we don’t want to maintain and keep an EC2 server instance running 24/7.

This approach will reduce our AWS bill, as the EC2 instances launch when a job is submitted to AWS Batch and terminate when the job execution is completed.

Poller Service Architecture Diagram

Processor Service Architecture Diagram

First time release

For Poller and Processor Service:

  • Create Compute environment
  • Create Job queue
  • Create Job definition

To automate the above resource creation process, we use batchbeagle (for installation and configuration, please refer to the batch-deployment repository).

Command to Create/Update Batch Job Resources of a Stack (Creates all Job Definitions, Job Queues and Compute Environments)

beagle -f stack/stackname/servicename.yml assemble

To start Poller service:

  • Enable a scheduler using an AWS CloudWatch Events rule to trigger the poller service batch job.

Incremental release

We must create a new revision of the existing job definition that points to the ECR image tagged with the new release version to be deployed.

Command to deploy new release version of Docker image to Batch Job (Creates a new revision of an existing Job Definition)

 

beagle -f stack/stackname/servicename.yml job update job-definition-name

Monitoring

Cloudwatch Events

We will use the AWS Batch event stream for CloudWatch Events to receive near real-time notifications about the current state of jobs submitted to our job queues.

AWS Batch tracks the state of your jobs and sends job status change events to CloudWatch Events. If a previously submitted job’s status changes, an event is triggered; for example, when a job moves from RUNNING to FAILED.

We will configure an Amazon SNS topic as the event target. The topic notifies a Lambda function, which filters the relevant content from the SNS message (JSON), formats it, and posts it to the respective environment’s Slack channel (a sketch of such a function follows the field list below).

CloudWatch Event Rule → SNS Topic → Lambda Function → Slack Channel

Batch Job Status Notification in Slack

Slack notification provides the following details:

  • Job name
  • Job Status
  • Job ID
  • Job Queue Name
  • Log Stream Name
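A minimal sketch of such a Lambda function in Java is shown below. It assumes the aws-lambda-java-events and Jackson libraries are on the classpath, and that the Slack incoming-webhook URL is supplied through a hypothetical SLACK_WEBHOOK_URL environment variable; the field names are taken from the AWS Batch job state change event delivered via SNS.

[code language="java"]
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.SNSEvent;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class BatchSlackNotifier implements RequestHandler<SNSEvent, Void> {

    private static final String WEBHOOK_URL = System.getenv("SLACK_WEBHOOK_URL"); // hypothetical env var
    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    public Void handleRequest(SNSEvent event, Context context) {
        for (SNSEvent.SNSRecord record : event.getRecords()) {
            try {
                // The SNS message body is the CloudWatch event describing the Batch job state change.
                JsonNode detail = MAPPER.readTree(record.getSNS().getMessage()).path("detail");
                String text = String.format("Job *%s* is now *%s*\\njobId: %s\\nqueue: %s\\nlog stream: %s",
                        detail.path("jobName").asText(),
                        detail.path("status").asText(),
                        detail.path("jobId").asText(),
                        detail.path("jobQueue").asText(),
                        detail.path("container").path("logStreamName").asText("n/a"));
                // Crude JSON escaping: replace double quotes so the payload stays valid.
                postToSlack("{\"text\":\"" + text.replace("\"", "'") + "\"}");
            } catch (Exception e) {
                context.getLogger().log("Failed to process record: " + e.getMessage());
            }
        }
        return null;
    }

    private void postToSlack(String payload) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(WEBHOOK_URL).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(payload.getBytes(StandardCharsets.UTF_8));
        }
        conn.getResponseCode(); // force the request; the response body is ignored
    }
}
[/code]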

Explanation of Git design principles through Git internals

In this blog we’ll explore the internals of how Git works. Having some behind-the-scenes knowledge of Git will help you understand why Git is so much faster than traditional version control systems, and it will also help you recover data in unexpected crash or delete scenarios.

For a developer, it is quite useful to understand the design principles of Git and see how it achieves both speed of access (traversing to previous commits) and a small disk footprint for the repository.

In this blog, we will cover the following topics:

  • Initializing a new repository
  • Working directory and local repository
  • Git objects
  • Blob
  • Tree
  • Commit
  • Tag
  • Packs

I’m using Ubuntu 16.04 LTS, Zsh terminal and Git v2.7.4 for this blog, but you can use any operating system and terminal to follow along.

Initializing a new repository

  1. Initialize a new Git repository with the ‘git init’ command.

 

  2. Create a couple of files by running the following commands.

$ echo "First file" >> first.txt

$ echo "Second file" >> second.txt

[Screenshot: creating first.txt and second.txt]

  3. Run ‘ls -la’ to display all the contents of the folder.
    It should show the .git directory and the two files we created.

[Screenshot: ‘ls -la’ output showing the .git directory and the two files]

Working directory and local repository

The working directory comprises the files and folders we want to manage with the Git version control system (VCS). In this case the ~/testRepo folder constitutes our working directory, except for the ‘.git’ folder. The ‘.git’ folder forms the local repository; it contains everything Git stores to manage our data.

 

Git objects

Git doesn’t store diffs of a file’s contents. It stores snapshots of each file; that is, each version of the file is stored exactly as it is at the point it is staged and committed. This is done for faster access, one of the core principles behind Git’s design.

The latest version of the file is always stored intact in Git, since it is the one most likely to be used; storing it this way also makes retrieval much faster.

As further versions of the file are added, Git automatically creates pack files. We’ll discuss more on pack files later.

There are four types of Git objects:

  • Blob
  • Tree
  • Commit
  • Tag

Let’s understand these with an example. First let’s check the contents of .git folder.

[Screenshot: contents of the .git folder]

There are 5 directories and 3 files. Let’s start with objects.

[Screenshot: find returning no files under .git/objects]

The find command returned no results, i.e. there are currently no files in the objects folder. Let’s stage our two files and then check again.

[Screenshot: .git/objects now contains two object files]

We can see that there are two files. To explore these objects, we need to use the ‘git cat-file’ command.

From the man page of git cat-file: “Provides content or type and size information for repository objects.”

[Screenshot: ‘git cat-file’ -p and -t output for the two objects]

The -p argument pretty-prints the object contents and -t returns the type.

The object ID (SHA-1 hash) is the subpath of the file under the .git/objects folder, without the slashes.

For example, object Id of .git/objects/20/d5b672a347112783818b3fc8cc7cd66ade3008 is 20d5b672a347112783818b3fc8cc7cd66ade3008.

The type that was returned for both the objects is blob.

So blobs are objects used for storing the contents of a file. Blobs store only the content, with no other information such as the file name.
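As an aside, the object ID itself is just the SHA-1 hash of a small header (“blob <size>\0”) followed by the raw file content. A minimal sketch in Java, which should reproduce the output of ‘git hash-object first.txt’:

[code language="java"]
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class GitBlobId {
    public static void main(String[] args) throws Exception {
        // Same content that `echo "First file" >> first.txt` produces (echo appends a newline).
        byte[] content = "First file\n".getBytes(StandardCharsets.UTF_8);

        // Git hashes the header "blob <size>\0" followed by the raw content.
        byte[] header = ("blob " + content.length + "\0").getBytes(StandardCharsets.UTF_8);

        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        sha1.update(header);
        sha1.update(content);

        StringBuilder hex = new StringBuilder();
        for (byte b : sha1.digest()) {
            hex.append(String.format("%02x", b));
        }
        System.out.println(hex); // the 40-character object ID
    }
}
[/code]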

Let’s commit our code now with ‘git commit’ command.

[Screenshot: ‘git commit’ output]

Next, run ‘git log’ to retrieve the commit ID.

[Screenshot: ‘git log’ output showing the commit ID]

Copy the commit ID and then run cat-file commands on it.

[Screenshot: ‘git cat-file’ output for the commit object]

The object type returned is ‘commit’. It contains a reference to a tree object, the author name, the committer name and the commit message.

Now let’s check the tree object.

[Screenshot: ‘git cat-file’ output for the tree object]

The tree object contains references to the blobs we saw earlier, along with their file names. We can summarize our Git repository state at this point with the following object diagram:

[Diagram: repository objects after the first commit]

 

Let’s add a folder to our repository.

Run the following commands. We’ll use ‘git add’ to add folder from working directory to the local repository.

$ mkdir fol1

$ echo "Third File" >> fol1/third.txt

$ git add fol1

$ git commit -m "Second commit"

Inspect the second commit object.

[Screenshot: ‘git cat-file’ output for the second commit object]

Notice that the tree reference has changed. Also, there is a parent property: the commit object also stores its parent commit’s ID. Let’s inspect the tree.

[Screenshot: ‘git cat-file’ output for the second commit’s tree]

The folder we just added, fol1, is stored as a tree, and it contains a reference to a blob for the third.txt file. So tree objects reference blobs or other subtree objects.

[Diagram: repository objects after the second commit]

Now let’s discuss tags. Tags are used to provide an alias to a commit SHA ID for future reference/use.  There are two types of tags: lightweight tags and annotated tags.

  • Lightweight tags contain only the SHA-ID information.

[Screenshot: creating a lightweight tag]

The command ‘git tag light’ creates a file under .git/refs/tags/light which contains the commit Id on which the tag was created. No separate tag object is created. This is mostly used for development purposes, to easily remember and traverse back to a commit.

  • Annotated tags are usually used for releases. They contain extra information such as a message and the tagger name along with the commit ID. Annotated tags can be created with the ‘git tag -a <tagname> -m "<message>"’ command.

[Screenshot: creating an annotated tag and inspecting the tag object]

A separate tag object is created for an annotated tag. You can list the tags you have created with the ‘git tag’ command.

Packs

  1. Although Git stores the contents of the latest version of an object intact, older versions are stored as deltas in pack files. Let’s understand this with an example. We need a slightly larger file to see the difference in size between the delta and the original file, so we’ll download the GNU license web page.
  2. Run the following command to download the license HTML file.

$ curl -L -O -C - https://www.gnu.org/licenses/gpl-3.0.en.html

You should now have the gpl-3.0.en.html file in your working directory.

  3. Add and commit the file.

$ git add gpl-3.0.en.html

$ git commit -m "Added gpl file"

Inspect the commit and get the blob info of the added file.

git cat-file -p 53550a1c9325753eb44b1428a280bfb2cd5b90ef

[Screenshot: ‘git cat-file’ output showing the blob size]

The last command returns the size of the blob; that is, the blob containing the content of gpl-3.0.en.html is 49641 bytes.

4. Edit the file and commit it again.

[Screenshot: committing the edited file and the new, slightly larger blob]

A new blob is created with a slightly larger size. Now let’s check the original blob.

[Screenshot: the original blob still present in .git/objects]

The original blob still exists. Thus, for each change, a snapshot of the file is stored with its contents intact.

5. Let’s pack the files with the ‘git gc’ command.

As you can see, all of our existing blob and commit objects are gone and are now replaced with pack files.

[Screenshot: .git/objects after ‘git gc’, showing pack files]

6. We can run ‘git verify-pack’ command to inspect the pack file.

[Screenshot: ‘git verify-pack’ output]

Now let’s check the highlighted entries.

The blob with ID ‘6be03f’ is the second version of gpl-3.0.en.html and the one with ID ‘0f6718’ is the first version.

The third column in the output represents the blob size. As you can see, the first blob is now reduced to 9 bytes and references the second blob, which keeps its size of 49652 bytes. Thus the first blob is stored as a delta of the second blob, even though the first one is older. This is because the newer version is the one most likely to be used. Git automatically packs objects when pushing to a remote repository.

Conclusion

In this blog we explored how Git stores files internally and looked at the various types of Git objects, i.e. blob, tree, commit and tag, and how they are linked to each other. We also looked at packs, which explain how Git compresses older versions of a file and saves storage space.

Why Angular 2

Early this year, I began a self-righteous (but approved, of course 😉 ) journey to make things right for my project — to break all the shackles and limitations the team faces in working with older technologies, methodologies and guidelines. When I started off, my aim was to make as minimal-but-essential changes to the system as possible, keeping in mind that my project is a live one, with at least 100 megabytes of code and around 16 developers having contributed to it — in batches, of course — over the past 8 years, each with their own signature style of coding. Needless to say, there were more than a few tasks for research in this journey of mine.

One such important analysis was to decide which UI methodology the team should adopt going forward. This blog post will focus on the analysis and decision-making process that made it happen and finally helped me decide on Angular 2.


Build a Custom Solr Filter to Handle Unit Conversions

Recently, I came across a use case where it was required to handle units of weight in the index. For instance, 2kg and 2000g, when searched should return the same set of results.

So, to achieve the above, I wrote a custom Solr filter that works along with KeywordTokenizer to convert all units of weight in the incoming request to a single unit (g), so that every measurement is saved as a plain number; at the same time, it keeps units like kg/g/mg intact while returning the docs.

First, we need to write a custom TokenFilter and TokenFilterFactory.

UnitConversionFilter.java

[code language="java"]

package com.solr.custom.filter.test;

import java.io.IOException;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

/**
 * @author SumeetS
 */
public class UnitConversionFilter extends TokenFilter {

    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);

    /**
     * @param input the upstream token stream (the KeywordTokenizer output)
     */
    public UnitConversionFilter(TokenStream input) {
        super(input);
    }

    /* (non-Javadoc)
     * @see org.apache.lucene.analysis.TokenStream#incrementToken()
     */
    @Override
    public boolean incrementToken() throws IOException {
        if (input.incrementToken()) {
            String inputWt = termAtt.toString();   // assuming the format to be 1kg / 1g / 1mg
            float valInGrams = convertUnit(inputWt);
            String storeFormat = String.valueOf(valInGrams);
            termAtt.setEmpty();
            termAtt.copyBuffer(storeFormat.toCharArray(), 0, storeFormat.length());
            return true;
        } else {
            return false;
        }
    }

    // Converts the incoming weight (kg/g/mg) to grams.
    private float convertUnit(String field) {
        String[] tmp = field.split("(k|m)?g");     // e.g. "2kg" -> ["2"]
        float weight = Float.parseFloat(tmp[0]);   // parse the numeric part as a float
        String[] tmp2 = field.split(tmp[0]);       // e.g. "2kg" -> ["", "kg"]
        String unit = tmp2[1];
        float convWt = 0;
        switch (unit) {
        case "kg":
            convWt = weight * 1000;
            break;
        case "mg":
            convWt = weight / 1000;
            break;
        case "g":
            convWt = weight;
            break;
        }
        return convWt;
    }
}

[/code]

UnitConversionTokenFilterFactory.java

[code language="java"]

package com.solr.custom.filter.test;

import java.util.Map;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.util.TokenFilterFactory;

/**
 * @author SumeetS
 */
public class UnitConversionTokenFilterFactory extends TokenFilterFactory {

    /**
     * @param args arguments configured on the filter in schema.xml
     */
    public UnitConversionTokenFilterFactory(Map<String, String> args) {
        super(args);
        if (!args.isEmpty()) {
            throw new IllegalArgumentException("Unknown parameters: " + args);
        }
    }

    /* (non-Javadoc)
     * @see org.apache.lucene.analysis.util.TokenFilterFactory#create(org.apache.lucene.analysis.TokenStream)
     */
    @Override
    public TokenStream create(TokenStream input) {
        return new UnitConversionFilter(input);
    }

}

[/code]

NOTE: When you extend TokenFilter and TokenFilterFactory, make sure to change the protected constructors to public, otherwise Solr will throw a NoSuchMethodException during plugin initialization.

Now, compile and export the above classes into a jar, say customUnitConversionFilterFactory.jar.

Steps to Deploy Your Jar Into Solr

1. Place your jar file under /lib

2. Make an entry in the solrconfig.xml file to help Solr identify your custom jar.

[code language="xml"]

<lib dir="../../../lib/" regex=".*\.jar" />

[/code]

3. Add custom fieldType and field in your schema.xml

[code language="xml"]

<field name="unitConversion" type="unitConversion" indexed="true" stored="true"/>
<fieldType name="unitConversion" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="com.solr.custom.filter.test.UnitConversionTokenFilterFactory"/>
  </analyzer>
</fieldType>
[/code]

4. Now restart Solr and browse to the Documents page of your core in the Solr admin console.

5. Add documents in your index like below:

{"id":"tmp1","unitConversion":"1000g"}
{"id":"tmp2","unitConversion":"2kg"}
{"id":"tmp3","unitConversion":"1kg"}

6. Query your index.

Query1 : querying for documents with 1kg

http://localhost:8983/solr/core1/select?q=*%3A*&fq=unitConversion%3A1kg&wt=json&indent=true

Result:

{
 "responseHeader":{
 "status":0,
 "QTime":0,
 "params":{
 "q":"*:*",
 "indent":"true",
 "fq":"unitConversion:1kg",
 "wt":"json"}},
 "response":{"numFound":2,"start":0,"docs":[
 {
 "id":"tmp1",
 "unitConversion":"1000g",
 "_version_":1524411029806645248},
 {
 "id":"tmp3",
 "unitConversion":"1kg",
 "_version_":1524411081738420224}]
 }}

Query2: querying for documents with 2kg

http://localhost:8983/solr/core1/select?q=*%3A*&fq=unitConversion%3A2kg&wt=json&indent=true

Result:

{
 "responseHeader":{
 "status":0,
 "QTime":0,
 "params":{
 "q":"*:*",
 "indent":"true",
 "fq":"unitConversion:2kg",
 "wt":"json"}},
 "response":{"numFound":1,"start":0,"docs":[
 {
 "id":"tmp2",
 "unitConversion":"2kg",
 "_version_":1524411089834475520}]
 }}

Query3: let’s try faceting

http://localhost:8983/solr/core1/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=unitConversion

{
 "responseHeader":{
 "status":0,
 "QTime":1,
 "params":{
 "q":"*:*",
 "facet.field":"unitConversion",
 "indent":"true",
 "rows":"0",
 "wt":"json",
 "facet":"true"}},
 "response":{"numFound":335,"start":0,"docs":[]
 },
 "facet_counts":{
 "facet_queries":{},
 "facet_fields":{
 "unitConversion":[
 "1000.0",2,
 "2000.0",1]},
 "facet_dates":{},
 "facet_ranges":{},
 "facet_intervals":{},
 "facet_heatmaps":{}}}

This is just a basic implementation. One can add additional fields to identify the type of unit and then decide the conversion based on that.

Further improvements include handling of range queries along with the units.


Edge Side Includes (ESI)

Traditionally, web applications were meant to serve static content: the server generated an HTML response to the client’s request (typically HTTP) and sent it back to the client. The response was then rendered on the user’s screen (browser window) by the client (browser). In order to increase user-perceived performance, such static content was cached at the edge of the Internet so that it could be served faster. Content Delivery Networks (CDNs) have been used commercially for such purposes.

Using Revealing Module Pattern with Web Interface Design

I recently joined a project that required me to develop a mobile website for the client. We already had a fully functional desktop version of the website. There were around 10000+ lines of JavaScript code for the existing desktop website and most of it was not at all reusable for the mobile website. The purpose of this blog post is to identify what went wrong with the original design of the JavaScript code and to provide a solution that could have saved around two to three weeks of (re)work.

The Problems

Following are some issues that made us re-implement the complete functionality from scratch. Later the approach and guidelines that we implemented to avoid the same problem again will be discussed.

JQuery (mis)usage
Many methods implementing business logic used $(‘#id’) directly to access values. To reuse this code, I’d have had to use exactly the same ids/classes for elements.

Too many document.ready()
Each script included on the page had its own document ready handler. Multiple entry points made it hard to determine the flow of code.

Modular approach
The whole JavaScript code was organized as global functions. Need I say more?

Code Repetition
At some places, very similar or even the same functionality was implemented in different functions. Validation and error handling code was also repeated at a number of places.

Repeated element ids

The same element ids were used in a number of places.

What Affects What?
It was hard to determine which piece of code was intended for which part of the page. Some JavaScript files were used on multiple pages. Moreover, each page was broken into several jspx files that may or may not be included on the page depending on some business logic. Also, some of those jspx files had conditions of their own to add or skip some elements.

The Required Solution

Now that we have looked at the major problems with the existing JavaScript code, let’s take each of the problems and see what we can do to avoid them.
JQuery (mis)usage
To fix this, we needed to separate the business logic from the UI logic. E.g. the code to update user information on the server should not be defined in the click handler of the submit button. Instead, we should put the ajax call in a separate function that takes the user details as parameters. The click handler should gather the user details from form elements and pass them to that function. Similarly, callbacks should be used to update the UI when a response is received from the server.

In the UI code, we should be able to define the ids of the various form fields in one place instead of hard-coding field ids everywhere.

Too many document.ready()
There should be only one script with a document ready handler, and it should be responsible for initializing the various modules used on the page.
Modular approach & What affects what?
We needed an object oriented approach with business and UI logic in separate modules. Also, we needed some sort of mapping between the UI modules and jspx files. This would make it easier to determine exactly what code needs to be updated if something changes on the UI.
Code repetition
We needed to organize modules neatly. The business logic modules should have some sort of hierarchy based on functionality. The UI modules should maintain the same hierarchy as the back-end jspx files.
The validation and error handling code should be moved outside business and UI logic, in separate modules of their own.
Repeated element ids
It was simple to use unique ids in a single jspx file. However, on a completely rendered page we could still have repeated ids, especially when a jspx file was included more than once on a page. So, while inserting a jspx file, we also needed to wrap it inside a div container with a unique id. We also needed some way to identify each element without specifying the complete id hierarchy.

Coding/Design Guidelines

To implement the changes suggested in the previous section, I had to write some utility modules to handle things such as validation, error handling, and a JQuery wrapper to transparently access DOM elements by their full id.

Following are some design and coding guidelines that I followed while implementing the mobile website.

Jspx Design Guidelines

  • Every id in a jspx file should be unique.
  • Every jspx file should be included inside a div container of its own; the container should be given a unique id in the jspx file.
  • All form fields should use a common markup style at all places e.g. each field should be placed inside a div container with proper classes such as ‘textbox’, ‘disabled’, ‘read-only’ etc

JavaScript coding guidelines

  • Organize the code in some meaningful namespaces, clearly separating business logic, UI logic and form validation and error handling code.
  • None of the business logic code must interact with the UI. Business logic functions must accept the required data and callback functions as arguments.
  • The business logic modules should be singletons unless it is required to create multiple objects of the module.
  • A UI module should be created for each jspx file.
  • For each jspx file included in its own jspx, a module must create an object of the UI module corresponding to that included jspx file.
  • Any HTML element should be accessed only by its respective module.
  • While creating a child UI module, the parent module must pass the full id of its child jspx’s container div to the child module.
  • Define all form fields and HTML elements of the respective jspx accessed in the module.
  • While accessing any field by id in a UI module, the field id should be appended to the full id of the container div passed by the parent module. (We implemented a utility module to transparently handle this).
  • Define callbacks in each UI module for any event that might be interesting to the parent module.  e.g. submitButtonClicked, someCheckBoxChecked etc.
  • In case a child module needs to update parent module for some event, it must do so via callbacks. The parent module needs to set these callbacks at the time of creating the child module object.
  • Define public functions for actions requested by parent module. e.g. enableSubmitButton(), showSomeHiddenElement() etc.
  • In case a parent module needs to update the UI of a child module, it must do so via the public functions exposed by the child module.
  • Don’t access HTML element attributes directly. For example, to hide elements don’t change the display style. Create CSS classes for each required state of HTML elements and add/remove these classes to update the element’s state.
  • Avoid creating id based CSS rules and using classes in JavaScript.

Summing up everything, the following is what we achieved by using the new approach for developing the mobile website:

  • The business logic is now completely separate from the UI code, and can be used independent of UI code.
  • The UI code has a direct mapping with the jspx files. If a jspx file is modified, only its respective UI module needs to be modified.
  • Since all fields and HTML elements are defined in each module, changing ids of fields in jspx file requires just a single update in its module.
  • All JQuery dependent code has been moved to the ModuleHelper module. Although it might still require a lot of other changes, replacing JQuery with any other library requires minimal UI code changes.
  • Each HTML element is now uniquely identifiable by its module, even if the jspx file gets included multiple times on the same page.
  • Form validations and error handling are now completely removed from the UI code.

Further Reading:

Following are a few links for further reading on some basic JavaScript features and design patterns:
Functions in JavaScript are First Class Objects
JavaScript Closures
JavaScript Design Patterns

 

Building Web Services with JAX-WS : An Introduction

Developing SOAP-based web services seems difficult because the message formats are complex. The JAX-WS API makes it easy by hiding this complexity from the application developer. JAX-WS stands for Java API for XML Web Services; it is a technology for building web services and clients that communicate using XML.

JAX-WS allows developers to write message-oriented as well as RPC-oriented web services.

A web service operation invocation in JAX-WS is represented by an XML-based protocol like SOAP. The envelope structure, encoding rules and conventions for representing web service invocations and responses are defined by the SOAP specification. These calls and responses are transmitted as SOAP messages over HTTP.

Though SOAP messages appear complex, the JAX-WS API hides this complexity from the application developer. On the server side, you specify the web service operations by defining methods in an interface written in the Java programming language, and you code one or more classes that implement these methods.

It is equally easy to write client code. A client creates a proxy (a local object representing the service) and then invokes methods on the proxy. With JAX-WS you do not generate or parse SOAP messages yourself; the JAX-WS runtime system converts the API calls and responses to and from SOAP messages.
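As a minimal sketch (assuming Java 8, where the javax.jws and javax.xml.ws APIs ship with the JDK, or the JAX-WS reference implementation on the classpath), a service endpoint and its publisher can be as small as this; a client would then run wsimport against the published WSDL and call greet() on the generated port/proxy.

[code language="java"]
import javax.jws.WebMethod;
import javax.jws.WebService;
import javax.xml.ws.Endpoint;

// A minimal JAX-WS endpoint: each public method annotated with @WebMethod
// becomes a web service operation described in the generated WSDL.
@WebService
public class HelloService {

    @WebMethod
    public String greet(String name) {
        return "Hello, " + name;
    }

    public static void main(String[] args) {
        // Publishes the service; the WSDL is served at http://localhost:8080/hello?wsdl
        Endpoint.publish("http://localhost:8080/hello", new HelloService());
    }
}
[/code]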

With JAX-WS, web services and clients have a big advantage: the platform independence of the Java programming language. Additionally, JAX-WS is not restrictive: a JAX-WS client can access a web service that is not running on the Java platform, and vice versa. This flexibility is possible because JAX-WS uses technologies defined by the World Wide Web Consortium: HTTP, SOAP and the Web Service Description Language (WSDL). WSDL specifies an XML format for describing a service as a set of endpoints operating on messages.

Three most popular implementations of JAX-WS are:

Metro: It is developed and open sourced by Sun Microsystems. Metro incorporates the reference implementations of the JAXB 2.x data-binding and JAX-WS 2.x web services standards along with other XML-related Java standards. Here is the project link: http://jax-ws.java.net/. You can go through the documentation and download the implementation.

Axis: The original Apache Axis was based on the first Java standard for Web services which was JAX-RPC. This did not turn out to be a great approach because JAX-RPC constrained the internal design of the Axis code and caused performance issues and lack of flexibility. JAX-RPC also made some assumptions about the direction of Web services development, which turned out to be wrong!

By the time Axis2 development started, a replacement for JAX-RPC had already come into the picture. So Axis2 was designed to be flexible enough to support the replacement web services standard on top of the base framework. Recent versions of Axis2 have implemented support for both the JAXB 2.x Java XML data-binding standard and the JAX-WS 2.x Java web services standard that replaced JAX-RPC. (JAX-RPC: Java API for XML-based Remote Procedure Calls. Since the RPC mechanism enables clients to execute procedures on other systems, it is often used in a distributed client-server model. RPC in JAX-RPC is represented by an XML-based protocol such as SOAP.) Here is the project link: http://axis.apache.org/axis2/java/core/

CXF: Another web services stack by the Apache. Though both Axis2 and CXF originated at Apache, they take very different approaches about how web services are configured and delivered. CXF is very well documented and has much more flexibility and additional functionality if you’re willing to go beyond the JAX-WS specification. It also supports Spring. Here is the project link: http://cxf.apache.org/

At Talentica we have used all the three implementations mentioned above in various projects.

Why CXF is my choice?
Every framework has its strengths, but CXF is my choice. CXF has great integration with the Spring framework and I am a huge fan of Spring projects. It’s modular and easy to use. It has great community support and you can find lots of tutorials and resources online. It also supports both the JAX-WS and JAX-RS specifications, but that should not be a deciding factor while choosing CXF. Performance-wise it is better than Axis and it gives almost the same performance as Metro.

So CXF is my choice, what is yours?

Multi-server Applications on the Wireless Web

Here we will discuss how we can build Web applications that can serve wireless clients according to client capabilities.

What are the challenges?
Development of mobile applications is often highly dependent on the target platform. When developing any mobile content portal, we generally think about the accessibility of that portal through mobile browsers (like Nokia, Openwave and i-mode browsers, AvantGo on PDAs, etc.), which generally use markup languages like WML, HDML, cHTML, XHTML, etc. We want to ensure that the browser gets a compatible markup language and can present the portal content in the correct format. In short, hand-crafting a wireless application that works on as many devices as possible is not so much difficult as futile: if you invest a huge amount of resources today, chances are that a new device will be shipped tomorrow and you’ll need to tweak your application again.

What is the solution?
Wireless Universal Resource File (WURFL) is an open source project that uses XML to describe the capabilities of wireless devices. It is a database (some call it a “repository”) of wireless device capabilities. With WURFL, figuring out which phone works with which technology is a whole lot easier. We can use the WURFL to figure out device capabilities programmatically and to serve different content to different devices dynamically, depending on the device accessing the content.

Here are some of the things WURFL can help you know about a device:

  • Screen size of the device
  • Supported image, audio, video, ringtone, wallpaper, and screensaver formats
  • Whether the device supports Unicode
  • Whether it is a wireless device? What markup does it support?
  • What XHTML MP/WML/cHTML features does it support? Does it work with tables? Can it work with standard HTML?
  • Does it have a pointing device? Can it use CSS?
  • Does it have Flash Lite/J2ME support? What features?
  • Can images be used as links on this device? Can it display image and text on the same line?
  • If it is an iMode phone, what region is it from? Japan, US or Europe?
  • Does the device auto-expand a select drop down? Does it have inline input for text fields?
  • What SMS/MMS features are supported?

The WURFL framework also contains tools, utilities and libraries to parse and query the data stored in WURFL. The WURFL API is available in many programming languages, including Java, PHP, .NET, Ruby, and Python. Various open source tools are built around WURFL: HAWHAW (PHP), WALL (Java), HAWHAW.NET (.NET framework), HawTag (a JSP custom tag library), etc.

How does WURFL work?
When a mobile or non-mobile web browser visits your site, it sends a User-Agent header along with the request for your page. The user agent contains information about the type of device and browser being used. Unfortunately, this information is very limited and at times not representative of the actual device. Using the WURFL API, the framework then extracts the capabilities associated with that device. Based on the device capabilities, the framework creates the dynamic content: WML, HTML, XHTML, etc.
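The flow can be sketched as a servlet that reads the User-Agent header, looks up the device capabilities, and picks the markup to render. The CapabilityRepository and DeviceCapabilities types below are hypothetical stand-ins for whichever WURFL API or wrapper you use; the capabilities they expose (is_wireless_device, preferred_markup, resolution_width) are standard WURFL capability names.

[code language="java"]
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Sketch of the WURFL flow: read the User-Agent, look up device capabilities,
// then choose the markup to render.
public class MarkupSelectorServlet extends HttpServlet {

    // Stand-in that, in a real implementation, would load and query wurfl.xml.
    private final CapabilityRepository repository = new CapabilityRepository();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String userAgent = req.getHeader("User-Agent");
        DeviceCapabilities device = repository.lookup(userAgent);

        if (!device.isWirelessDevice()) {
            resp.setContentType("text/html");
            // render the full desktop HTML page
        } else if ("wml_1_1".equals(device.preferredMarkup())) {
            resp.setContentType("text/vnd.wap.wml");
            // render a WML deck for older WAP browsers
        } else {
            resp.setContentType("application/xhtml+xml");
            // render XHTML MP, scaled to device.resolutionWidth()
        }
    }

    // Hypothetical wrapper types, shown only to keep the sketch self-contained.
    static class CapabilityRepository {
        DeviceCapabilities lookup(String userAgent) { return new DeviceCapabilities(); }
    }

    static class DeviceCapabilities {
        boolean isWirelessDevice() { return true; }
        String preferredMarkup() { return "xhtml_mp"; }
        int resolutionWidth() { return 240; }
    }
}
[/code]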

Though there is some concern about the extra latency of the user-agent lookup, it is worth using given its advantages. One of the biggest advantages is that if and when a new device enters the market, we will not need to change our application; we just update the WURFL data to keep the application optimized. It is very simple and the architecture is sound. Go for it!!!