The Rise of Service Mesh: How Can Businesses Use It

Service Mesh

Docker containers, Microservice, Kubernetes are already old news — or at least, they’ve become so mainstream that they are no longer cutting-edge technology. I was curious to know what comes after those. If you ever pay close attention to all the latest and greatest technology news, I am sure you will hear a lot about the new “Service Mesh“. I spent some time understanding what the service mesh is all about and it begins with a very basic question “Why do we need service mesh?”. Through this article, I’ll also try to answer some other important questions which might be helpful for you to decide if service mesh is the next mainstream technology, or is it worth introducing service mesh to your application tech stack?

Why do we need Service Mesh? What problem it is trying to solve?

In a microservice architecture, handling service to service communication is challenging and most of the time we depend upon third-party libraries or components to provide functionalities like service discovery, load balancing, circuit breaker, metrics, telemetry, and more.

However, these components need to be configured inside your application code, and based on the language you are using, the implementation will vary a bit. Anytime these external components are upgraded, you need to update your application, verify it, and deploy the changes. This also creates an issue where now your application code is a mixture of business functionalities and these additional configurations. This tight coupling increases the overall application complexity.

This is where Service Mesh comes to the rescue. It decouples this complexity from your application and puts it in a service proxy called sidecar. Instead of calling services directly over the network, services call their local sidecar proxy, which in turn manages the request on the service’s behalf, thus encapsulating the complexities of the service-to-service exchange. These sidecars can provide you with a bunch of functionalities like

  • traffic management,
  • circuit breaking,
  • service discovery,
  • authentication,
  • monitoring,
  • rate limit
  • security,
  • and much more

According to Bilgin Ibryam, product manager at RedHat, there is a shift from “smart endpoints, dumb pipes” to “smart sidecars, dumb pipes;” all the intelligence is not in the endpoint but in the sidecar. You have your business logic pod forming the core of the application and the sidecar pod that offers out-of- the-box distributed primitives like handling all network communication challenges.

(A typical Service Mesh Architecture. Business pod talking to local sidecar proxy for service to service communication, service mesh providing various cross- cutting concerns)

Features Provided by Service Mesh

A service mesh implementation will typically offer you one or more following features:

  • Normalizes naming and adds logical routing, (e.g., maps the code-level name “user-service” to the platform specific location “AWS-us-east- 1a/prod/users/v4”)
  • Provides traffic shaping and traffic shifting
  • Maintains load balancing, typically with configurable algorithms
  • Provides service release control (e.g., canary releasing and A/B type experimentation)
  • Offers per-request routing (e.g., traffic shadowing, fault injection, and debug re-routing)
  • Adds baseline reliability, such as health checks, timeouts/deadlines, circuit breaking, and retry (budgets)
  • Increases security, via transparent mutual Transport Level Security (TLS) and policies such as Access Control Lists (ACLs)
  • Provides additional observability and monitoring, such as top-line metrics (request volume, success rates, and latencies), support for distributed tracing, and the ability to “tap” and inspect real-time service-to-service communication
  • Enables platform teams to configure “sane defaults” to protect the system from bad communication

Things to be aware of while introducing service mesh to your application tech stack

A service mesh provides a whole bunch of features to facilitate service to service communication in a microservice architecture. But there is no such thing as “silver bullet” with IT. There is an operational cost for deploying and running a service mesh. Without running any business application of our own, Isto runs about 45 containers to use all its service mesh features which introduce an on-going running cost.

There is a good write up from Michael Kippers here. They looked at using Istio for their main applications. They did a bunch of proof of concept and they deployed it to a production-like environment on what they found was that the cost of running Istio just the extra compute power to run the control plane was doing cost him an extra $50,000 a month Just have the CPU and memory for all the feature of Istio

What Does the Future Hold?

Using Service mesh for Event-Driven Messaging

Popular implementations of service meshes (Istio/Envoy, Linkerd, etc.) currently only cater to the request/response style of synchronous communication between microservices. However, interservice communication takes place over a diverse set of patterns, such as request/ response (HTTP, gRPC, Graph- QL) and event- driven messaging (NATS, Kafka, AMQP) in most pragmatic microservices use cases.

There are two main emerging architectural patterns for implementing messaging support within a service mesh:

The protocol proxy sidecar

The protocol-proxy pattern is built around the concept that all the event-driven communication channels should go through the service-mesh data plane (i.e., the sidecar proxy). To support event-driven messaging protocols such as NATS, Kafka, or AMQP, you need to build a protocol handler/filter specific to the communication protocol and add that to the sidecar proxy. The following Figure shows the typical communication pattern for event-driven messaging with a service mesh.

The Envoy team is currently working on implementing Kafka support for the Envoy proxy based on the above pattern.

The HTTP bridge sidecar

Rather than using a proxy for the event-driven messaging protocol, we can build an HTTP bridge that can translate messages to/from the required messaging protocol. The sidecar proxy is primarily responsible for receiving HTTP requests and translating them into Kafka/ NATS/AMQP/etc. messages and vice versa As shown below in figure.

A Unified, Standard API for Consolidating Service Meshes (SMI)

The Service Mesh Interface (SMI) defines a set of common and portable APIs that aims to provide developers with interoperability across different service mesh technologies including Istio, Linkerd, and Consul Connect.

First SMI reference implementation is built by call Service MeshHub

Getting Started with Service Mesh

There is an amazing Pluralsight course, Managing App on Kubernetes with Istio, which not only introduces you to the concept of service mesh using Istio, it also talks about cost and how to run Istio in production. For those who are starting to explore service Mesh the course is highly recommended.

Top Service Mesh Technologies

At the heart of many service meshes is Envoy, a general-purpose, open-source proxy used as a sidecar to intercept traffic. Other service meshes utilize a different proxy.

When it comes to service mesh adoption, Istio and Linkerd are more established. Yet many other options exist, including Consul Connect, Kuma, AWS App Mesh, and OpenShift. Below, here are the key features from top service mesh offerings.


Istio is an extensible open-source service mesh built on Envoy, allowing teams to connect, secure, control, and observe services. Istio was used internally within IBM then open-sourced in 2017, now is framed as a collaboration among Google, IBM, and Lyft.


Linkerd is an “ultralight, security-first service mesh for Kubernetes,” according to the website. It’s a developer favorite, with incredibly easy setup (purportedly 60 seconds to install to a Kubernetes cluster). Instead of Envoy, Linkerd uses a fast and lean Rust proxy called linkerd2-proxy, which was built explicitly for Linkerd.

Consul Connect

Consul Connect, the service mesh from HashiCorp, focuses on routing and segmentation, providing service-to-service networking features through an application-level sidecar proxy.

Something unique about Consul Connect is that you have two proxy options. Connect offers its own built-in layer proxy for testing, but also supports Envoy. Connect emphasizes observability, providing integration with tools to monitor data from sidecar proxies, such as Prometheus. Consul Connect also is flexible for developer needs. For example, it offers many options for registering services: from an orchestrator, with configuration files, via API, or via command-line interface (CLI).


Kuma, from Kong, is a platform-agnostic control plane built on Envoy. Kuma provides networking features to secure, observe, route, and enhance connectivity between services. Kuma supports Kubernetes in addition to virtual machines.

What’s interesting about Kuma is that an enterprise can operate and control multiple isolated meshes from a unified control plane. This ability could be beneficial to high-security use cases that require segmentation and centralized control


Maesh, the container-native service mesh by Containous, built itself as lightweight and more straightforward to use than other service meshes on the market. While other meshes build on top of Envoy, Maesh adopts Traefik, an open-source reverse proxy, and load balancer.

Instead of adopting a sidecar container format, Maesh uses proxy endpoints for each node. Doing so makes Maesh more non-invasive than other meshes, since it does not edit Kubernetes objects or modify traffic without opt-in. Maesh supports a couple of configuration options: annotations on user service objects as well as Service Mesh Interface SMI objects.

Indeed, its support for SMI, a new standard service mesh specification format, is one unique trait of Maesh. If SMI adoption increases throughout the industry, it could offer extensibility benefits and potentially reduce vendor lock-in concerns.

AWS App Mesh

Amazon Web Services’ App Mesh provides “application-level networking for all your services.” It manages all network traffic for services and uses the open- source Envoy proxy to control traffic into and out of a service’s containers. AWS App Mesh supports HTTP/2 gRPC services.

AWS App Mesh could be a good service mesh option for companies already married to the AWS infrastructure for their container platforms.

OpenShift Service Mesh by Red Hat

OpenShift Service Mesh builds on top of open-source Istio, bringing the Istio control and data plane features. OpenShift enhances Istio with tracing and visibility features powered by two open-source tools. OpenShift uses Jaeger for distributed tracing, permitting better tracking of how requests are handled between services.

OpenShift also uses Kiali for added observability into microservices configuration, traffic monitoring, and tracing analysis.


Looking at the operation and running cost of the service mesh, it makes sense to use service mesh where you are running tons of services in production. Also, use service mesh if your application already has or planning to implement features provided by service mesh.

References: ticles/the-rise-of-service-mesh-architecture compared istio-linkerd

AWS Batch Jobs

What is batch computing?

Batch computing means running jobs asynchronously and automatically, across one or more computers.

What is AWS Batch Job?

AWS Batch enables developers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as Amazon EC2 and Spot Instances.

Why use AWS Batch Job ?

  • Fully managed infrastructure – No software to install or servers to manage. AWS Batch provisions, manages, and scales your infrastructure.
  • Integrated with AWS – Natively integrated with the AWS Platform, AWS Batch jobs can easily and securely interact with services such as Amazon S3, DynamoDB, and Recognition.
  • Cost-optimized Resource Provisioning – AWS Batch automatically provisions compute resources tailored to the needs of your jobs using Amazon EC2 and EC2 Spot.

AWS Batch Concepts

  • Jobs
  • Job Definitions
  • Job Queue
  • Compute Environments


Jobs are the unit of work executed by AWS Batch as containerized applications running on Amazon EC2. Containerized jobs can reference a container image, command, and parameters or users can simply provide a .zip containing their application and AWS will run it on a default Amazon Linux container.

$ aws batch submit-job –job-name poller –job-definition poller-def –job-queue poller-queue

Job Dependencies

Jobs can express a dependency on the successful completion of other jobs or specific elements of an array job.

Use your preferred workflow engine and language to submit jobs. Flow-based systems simply submit jobs serially, while DAG-based systems submit many jobs at once, identifying inter-job dependencies.

Jobs run in approximately the same order in which they are submitted as long as all dependencies on other jobs have been met.

$ aws batch submit-job –depends-on 606b3ad1-aa31-48d8-92ec-f154bfc8215f …

Job Definitions

Similar to ECS Task Definitions, AWS Batch Job Definitions specify how jobs are to be run. While each job must reference a job definition, many parameters can be overridden.

Some of the attributes specified in a job definition are:

  • IAM role associated with the job
  • vCPU and memory requirements
  • Mount points
  • Container properties
  • Environment variables
$ aws batch register-job-definition –job-definition-name gatk –container-properties …

Job Queues

Jobs are submitted to a Job Queue, where they reside until they are able to be scheduled to a compute resource. Information related to completed jobs persists in the queue for 24 hours.

$ aws batch create-job-queue –job-queue-name genomics –priority 500 –compute-environment-order …


Compute Environments

Job queues are mapped to one or more Compute Environments containing the EC2 instances that are used to run containerized batch jobs.

Managed (Recommended) compute environments enable you to describe your business requirements (instance types, min/max/desired vCPUs, and EC2 Spot bid as x % of On-Demand) and AWS launches and scale resources on your behalf.

We can choose specific instance types (e.g. c4.8xlarge), instance families (e.g. C4, M4, R3), or simply choose “optimal” and AWS Batch will launch appropriately sized instances from AWS more-modern instance families.

Alternatively, we can launch and manage our own resources within an Unmanaged compute environment. Your instances need to include the ECS agent and run supported versions of Linux and Docker.

$ aws batch create-compute-environment –compute- environment-name unmanagedce –type UNMANAGED …

AWS Batch will then create an Amazon ECS cluster which can accept the instances we launch. Jobs can be scheduled to your Compute Environment as soon as the instances are healthy and register with the ECS Agent.

Job States

Jobs submitted to a queue can have the following states:

  • SUBMITTED: Accepted into the queue, but not yet evaluated for execution
  • PENDING: The job has dependencies on other jobs which have not yet completed
  • RUNNABLE: The job has been evaluated by the scheduler and is ready to run
  • STARTING: The job is in the process of being scheduled to a compute resource
  • RUNNING: The job is currently running
  • SUCCEEDED: The job has finished with exit code 0
  • FAILED: The job finished with a non-zero exit code or was cancelled or terminated.

AWS Batch Actions

  • Jobs: SubmitJob, ListJobs, DescribeJobs, CancelJob, TerminateJob
  • Job Definitions: RegisterJobDefinition, DescribeJobDefinitions, DeregisterJobDefinition
  • Job Queues: CreateJobQueue, DescribeJobQueues, UpdateJobQueue, DeleteJobQueue
  • Compute Environments: CreateComputeEnvironment, DescribeComputeEnvironments, UpdateComputeEnvironment, DeleteComputeEnvironment

AWS Batch Pricing

There is no charge for AWS Batch. We only pay for the underlying resources we have consumed.

Use Case

Poller and Processor Service


Poller service needs to be run every hour like a cron job which submits one or more requests to a processor service which has to launch the required number of EC2 resource, process files in parallel and terminate them when done.


We plan to go with Serverless Architecture approach instead of using the traditional beanstalk/EC2 instance, as we don’t want to maintain and keep running EC2 server instance 24/7.

This approach will reduce our AWS billing cost as the EC2 instance launches when the job is submitted to Batch Job and terminates when the job execution is completed.

Poller Service Architecture Diagram

Processor Service Architecture Diagram

First time release

For Poller and Processor Service:

  • Create Compute environment
  • Create Job queue
  • Create Job definition

To automate above resource creation process, we use batchbeagle (for Installaion and configuration, please refer batch-deploymnent repository)

Command to Create/Update Batch Job Resources of a Stack (Creates all Job Descriptions, Job Queues and Compute Environments)

beagle -f stack/stackname/servicename.yml assemble

To start Poller service:

  • Enable a Scheduler using AWS CloudWatch rule to trigger poller service batch job.

Incremental release

We must create a new revision of existing Job definition environment which will point to the new release version tagged ECR image to be deployed.

Command to deploy new release version of Docker image to Batch Job (Creates a new revision of an existing Job Definition)


beagle -f stack/stackname/servicename.yml job update job-definition-name


Cloudwatch Events

We will use AWS Batch event stream for CloudWatch Events to receive near real-time notifications regarding the current state of jobs that have been submitted to your job queues.

AWS Batch sends job status change events to CloudWatch Events. AWS Batch tracks the state of your jobs. If a previously submitted job’s status changes, an event is triggered. For example, if a job in the RUNNING status moves to the FAILED status.

We will configure an Amazon SNS topic to serve as an event target which sends notification to lambda function which will then filter out relevant content from the SNS message (json) content and beautify it and send to the respective Environment slack channel .

CloudWatch Event Rule → SNS Topic → Lambda Function → Slack Channel

Batch Job Status Notification in Slack

Slack notification provides the following details:

  • Job name
  • Job Status
  • Job ID
  • Job Queue Name
  • Log Stream Name

Explanation of Git design principles through Git internals

In this blog we’ll explore the internals of how Git works. Having some behind-the-scenes working knowledge of Git will help you understand why Git is so much faster than traditional version control systems. Git also helps you recover data from unexpected crash/ delete scenarios.

For a developer, it is quite useful to understand the design principles of Git and see how it manages both speed of access (traversing to previous commits) and small disk space for repository.

In this blog, we will cover the following topics:

  • Initializing a new repository
  • Working directory and local repository
  • Git objects
  • Blob
  • Tree
  • Commit
  • Tag
  • Packs

I’m using Ubuntu 16.04 LTS, Zsh terminal and Git v2.7.4 for this blog, but you can use any operating system and terminal to follow along.

Initializing a new repository

  1. Initialize a new Git repository with ‘git init’ command.


  1. Create a couple of files by running the following commands.

$ echo “First file”>>first.txt

$ echo “Second file”>>second.txt


  1. Run ls –la to display all the contents of the folder.
    It should show .git directory and the two files we created.


Working directory and local repository

Working directory comprises of the files and folder we want to manage with Git Version Control System (VCS). In this case ~/testRepo folder constitutes our working directory except for ‘.git’ folder. .git folder forms the local repository. ‘.git’ folders contain everything that Git stores to manage our data.


Git objects

Git doesn’t store diff of the contents of the file. It stores snapshots of each file, that is  each version of the file is stored exactly as it is at the point it is staged and committed. This is done for faster access, one of the core principles for development of Git.

The latest version of the file is always stored as is in Git as it is most likely the one to be used. Also,  storing it makes it much faster for retrieve operations.

As further versions of the file are added, Git automatically creates pack files. We’ll discuss more on pack files later.

There are 4 types of git objects.

  • Blob
  • Tree
  • Commit
  • Tag

Let’s understand these with an example. First let’s check the contents of .git folder.


There are 5 directories and 3 files. Let’s start with objects.


Find command returned no results, i.e. there are currently no files in objects folder. Let’s stage our two files and then check.


We can see that there are two files. To explore these objects, we would need to use the command ’git cat-file

From the main pages of the Git cat-file, “Provides content or type and size information for repository objects”


-p argument pretty-prints the object contents and –t returns the type.

The objectId(SHA-1 Id) is subpath of the file from .git/objects folder without the slashes.

For example, object Id of .git/objects/20/d5b672a347112783818b3fc8cc7cd66ade3008 is 20d5b672a347112783818b3fc8cc7cd66ade3008.

The type that was returned for both the objects is blob.

So blobs are objects used for storing content of the file. Blobs just store the content, no other information like file name.

Let’s commit our code now with ‘git commit’ command.


Next, run ‘git log’ to retrieve the commit ID.


Copy the commit ID and then run cat-file commands on it.


The object type that is returned is ‘commit’. It contains reference to a tree object, author name, committer name and the commit message.

Now let’s check the tree object.



The tree object contains reference to the blob files we saw earlier, and also the reference to file names. We can summarize our Git repository state at this point with the following object diagram:



Let’s add a folder to our repository.

Run the following commands. We’ll use ‘git add’ to add folder from working directory to the local repository.

$ mkdir fol1

$ echo “Third File”>> fol1/third.txt

$ git add fol1

$ git commit -m “Second commit”

Inspect the second commit object.


Notice that the tree reference has changed. Also, there is a parent object property. So commit object also stores the parent commit’s id. Let’s inspect the tree.


The folder we just added, fol1, is stored as a tree and it contains reference to a blob referencing third.txt file. So tree objects reference blobs or other sub tree objects.


Now let’s discuss tags. Tags are used to provide an alias to a commit SHA ID for future reference/use.  There are two types of tags: lightweight tags and annotated tags.

  • Lightweight tags contain only the SHA-ID information.


The command ‘git tag light’ creates a file under .git/refs/tags/light which contains the commit Id on which the tag was created. No separate tag object is created. This is mostly used for development purposes, to easily remember and traverse back to a commit.

  • Annotated tags are usually used for release. They contain extra information like message and the tagger name along with the commit ID. Annotated tags can be created with ‘ git tag –a –m “<message>” ‘ command.


A separate tag object is created for annotated tag. You can list the tags created with ‘git tag’ command


  1. Although Git stores the contents of the latest versions of object intact, the older versions are stored as deltas in pack files. Let’s understand with an example. We would need a slightly larger file to see the difference in size of the delta file and original file. Download GNU license web page.
  2. Run the following command to download the license HTML file.

$ curl -L -O -C –

You should now have gpl-3.0.en.H file in your working directory.

  1. Add and commit the file.

$ git add gpl-3.0.en.html

$ git commit -m “Added gpl file”

Inspect the commit and get the blob info of the added file.

git cat-file -p 53550a1c9325753eb44b1428a280bfb2cd5b90ef


The last command returns the size of the blob. That is the blob containing content of gpl-3.0.en.html is 49641 bytes.

4. Edit the file and commit it again.


A new blob is created with a slightly larger size. Now let’s check the original blob.


The original blog still exists. Thus for each change, a snapshot of the file is stored with its contents intact.

5. Let’s pack the files, with ‘git gc’ command.

As you can see, all of our existing blob and commit objects are gone and are now replaced with pack files.


6. We can run ‘git verify-pack’ command to inspect the pack file.


Now let’s check the highlighted entries.

The blob with commit ‘6be03f’ is the second version of gpl-3.0.en.HTML and the one with commit ‘0f6718’ is the first version.

The third column in output represents the blob size. As you can see, the first blob is now reduced to 9 bytes and references the second blob which maintains its size of 49652. Thus the first blob is stored as a delta of second blob although the first one is older. This is because the newer version is most likely to be the one to be used. Git automatically calls pack when pushing to remote repository.


In this blog we explored how Git stores the files internally and looked at the various types of Git objects, i.e, blob, tree and commit and how they are linked to each other. Then we also looked at packs, which explained how Git compresses older versions of a file and saves storage space.

Why Angular 2

Early this year, I began on a self-righteous (but approved, of course 😉 ) journey to make things right for my project — to break all shackles and limitations that the team faces in working with older technologies, methodologies and guidelines. When I started off, my aim was to make as minimal-but-essential changes to the system as possible, keeping in mind that my project is a live one, having at least 100 MegaBytes of code, with around 16 developers contributed to this application — in batches, of course — in the past 8 years, each having their own signature style of coding. Needless to say, there were more than a few tasks for research in this journey of mine.

One such, important analysis was to decide what UI methodology should the team adopt going forward. This blog post will focus on the analyses and decision-making process that made it happen and finally helped me decide on Angular 2.

Continue reading Why Angular 2

Build a Custom Solr Filter to Handle Unit Conversions

Recently, I came across a use case where it was required to handle units of weight in the index. For instance, 2kg and 2000g, when searched should return the same set of results.

So, for achieving the above, I wrote a custom Solr filter that will work along with KeywordTokenizer to convert all units of weight in the incoming request to a single unit (g) and hence every measurement will be saved in the form of a number; at the same time, it will also keep units like kg/g/mg intact while returning the docs. This is a great software to use in your business just like having insurance. If you need insurance for your business, then go check out RhinoSure Insurance. Another thing that you should do is go to so you can get more customers on your company website. Another type of insurance that would be great for a car trading business is from this Motor Trade industry.

Firstly, we need to write custom tokenfilter and tokenfilterfactory .

[code language=”java”]

package com.solr.custom.filter.test;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

* @author SumeetS
public class UnitConversionFilter extends TokenFilter{

private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);

* @param input
public UnitConversionFilter(TokenStream input) {

/* (non-Javadoc)
* @see org.apache.lucene.analysis.TokenStream#incrementToken()
public boolean incrementToken() throws IOException {
if (input.incrementToken()) {
// charUtils.toLowerCase(termAtt.buffer(), 0, termAtt.length());
int length = termAtt.length();
String inputWt = termAtt.toString(); //assuming format to be 1kg/mg
float valInGrams = convertUnit(inputWt);
String storeFormat = valInGrams+””;
termAtt.copyBuffer(storeFormat.toCharArray(), 0, storeFormat.length());
return true;
} else
return false;

private float convertUnit(String field){
String [] tmp = field.split(“(k|m)?g”);
float weight = Integer.parseInt(tmp[0]);
String[] tmp2 = field.split(tmp[0]);
String unit = tmp2[1];
float convWt = 0;
switch(unit) {
case “kg”:
convWt = weight * 1000;
case “mg”:
convWt = weight /1000;
case “g”:
convWt = weight;
return convWt;


[code language=”java”]

package com.solr.custom.filter.test;
import java.util.Map;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.util.TokenFilterFactory;

* @author SumeetS
public class UnitConversionTokenFilterFactory extends TokenFilterFactory {

* @param args
public UnitConversionTokenFilterFactory(Map<String, String> args) {
if (!args.isEmpty()) {
throw new IllegalArgumentException(“Unknown parameters: ” + args);

/* (non-Javadoc)
* @see org.apache.lucene.analysis.util.TokenFilterFactory#create(org.apache.lucene.analysis.TokenStream)
public TokenStream create(TokenStream input) {
return new UnitConversionFilter(input);



NOTE: When you override the TokenFilter and TokenFilterFactory, make sure to edit the protected constructors to public, otherwise it will throw NoSuchMethodException during plugin init.

Now, compile and export your above classes into a jar say customUnitConversionFilterFactory.jar

Steps to Deploy Your Jar Into Solr

1. Place your jar file under /lib

2. Make an entry in solrConfig.xml file to help it identify your custom jar.

[code language=”xml”]

<lib dir=”../../../lib/” regex=”.*\.jar” />


3. Add custom fieldType and field in your schema.xml

[code language=”xml”]

<field name=”unitConversion” type=”unitConversion” indexed=”true” stored=”true”/>
<fieldType name=”unitConversion” class=”solr.TextField” positionIncrementGap=”100″>
<tokenizer class=”solr.KeywordTokenizerFactory”/>
<filter class=”com.solr.custom.filter.test.UnitConversionTokenFilterFactory” />

4. Now restart Solr and browse to the Solr console//documents

5. Add documents in your index like below:


6. Query your index.

Query1 : querying for documents with 1kg




Query2: querying for documents with 2kg




Query3: let’s try faceting



This is just a basic implementation. One can add additional fields to identify the type of unit and then based on that decide the conversion.

Further improvements include handling of range queries along with the units.

For more info check us out in Social Media, we were recently able to Buy Instagram likes to improve our account.

Edge Side Includes (ESI)

Traditionally web applications were meant to serve static content, wherein the server generated an HTML response to the client’s request (typically HTTP) and sent it back to the client. The response was then rendered on user’s screen (browser window) by the client (browser). In order to increase user-perceived performance, such static content was cached at the edge of Internet so that it could be served faster. Content Delivery Networks (CDNs) have been used commercially for such purposes.

Using Revealing Module Pattern with Web Interface Design

I recently joined a project that required me to develop a mobile website for the client. We already had a fully functional desktop version of the website. There were around 10000+ lines of JavaScript code for the existing desktop website and most of it was not at all reusable for the mobile website. The purpose of this blog post is to identify what went wrong with the original design of the JavaScript code and to provide a solution that could have saved around two to three weeks of (re)work.

The Problems

Following are some issues that made us re-implement the complete functionality from scratch. Later the approach and guidelines that we implemented to avoid the same problem again will be discussed.

JQuery (mis)usage
Many methods implementing some business logic were used $(‘#id’) to access values. To reuse this code, I’d have had to use exactly same id/classes for elements.

Too many document.ready()
Each script included on the page had its own document ready handler. Multiple entry points made it hard to determine the flow of code.

Modular approach
The whole JavaScript code was organized as global functions. Need I say more?

Code Repetition
At some places, very similar or even the same functionality was implemented in different functions. Validation and error handling code was also repeated at a number of places.

Repeated element ids

Same elements id were used at a number of places.

What Affects What?
It was hard to determine which piece of code was intended for which part of the page. Some JavaScript files were being used on multiple pages. There each page was broken into several jspx files that may or may not be included on the page depending on some business logic. Also, some of those jspx had conditions of their own to add or skip some elements.

The Required Solution

Now that we have looked at the major problems with the existing JavaScript code, let’s take each of the problems and see what we can do to avoid them.
JQuery (mis)usage
To fix this, we needed to separate the business logic from UI logic. E.g. code to update user information on server should not be defined in the click handler of the submit button. We should, instead, put the ajax call in a separate function, taking the user details as parameters. The click handler should gather user details from form elements and pass them to the function. Similarly, use callbacks to update UI when a response is received from the server.

In UI code, we should be able to somewhere define ids of various form fields instead of hard coding field ids everywhere.

Too many document.ready()
There should be only one script with document ready handler, and it should be responsible to initialize various modules used on the page.
Modular approach & What affects what?
We needed an object oriented approach with business and UI logic in separate modules. Also, we needed some sort of mapping between the UI modules and jspx files. This would make it easier to determine exactly what code needs to be updated if something changes on the UI.
Code repetition
We needed to organize modules neatly. The business logic modules should have some sort of hierarchy based on functionality. The UI modules should maintain same hierarchy as the back end jspx files.
The validation and error handling code should be moved outside business and UI logic, in separate modules of their own.
Repeated element ids
It was simple to use unique ids in a single jspx file. However, on a completely rendered page we could still have repeated ids, especially when a jspx file was included more than once on a page. So, while inserting a jspx file, we also needed to wrap it inside a div container with unique id. We also needed some approach to be able to identify each element without specifying complete id-hierarchy.

Coding/Design Guidelines

To implement the changes suggested in the previous section, I had to write some utility modules to handle things such as validation, error handling, a JQuery wrapper to transparently access DOM elements with full id etc.

Following are some design and coding guidelines that I followed while implementing the mobile website.

Jspx Design Guidelines

  • Every id in a jspx file should be unique.
  • Every jspx file should be included inside a div container of its own; the container should be given a unique id in the jspx file.
  • All form fields should use a common markup style at all places e.g. each field should be placed inside a div container with proper classes such as ‘textbox’, ‘disabled’, ‘read-only’ etc

JavaScript coding guidelines

  • Organize the code in some meaningful namespaces, clearly separating business logic, UI logic and form validation and error handling code.
  • None of the code in business logic must interact with UI. Any of the business logic functions must accept required data and callback functions as arguments.
  • The business logic modules should be singletons unless it is required to create multiple objects of the module.
  • A UI module should be created for each jspx file.
  • For each jspx file included, a module must create objects of each of the module respective to included jspx files.
  • Any HTML element should be accessed only by its respective module.
  • While creating a child UI module, the parent module must pass the full id of its child jspx’s container div to the child module.
  • Define all form fields and HTML elements of the respective jspx accessed in the module.
  • While accessing any field by id in a UI module, the field id should be appended to the full id of the container div passed by the parent module. (We implemented a utility module to transparently handle this).
  • Define callbacks in each UI module for any event that might be interesting to the parent module.  e.g. submitButtonClicked, someCheckBoxChecked etc.
  • In case a child module needs to update parent module for some event, it must do so via callbacks. The parent module needs to set these callbacks at the time of creating the child module object.
  • Define public functions for actions requested by parent module. e.g. enableSubmitButton(), showSomeHiddenElement() etc.
  • In case a parent module needs to update UI for a child module, it must do via public functions exposed by the child module.
  • Don’t access HTML element attributes directly. For example, to hide elements don’t change display style. Create CSS classes for each required state of HTML elements and add/remove these classes to update any
  • Avoid creating id based CSS rules and using classes in JavaScript.
Summing up everything, the following is what we achieved by using the new approach for developing the mobile website:
  • The business logic is now completely separate from the UI code, and can be used independent of UI code.
  • The UI code has direct mapping with the jspx files. If a jspx file is modified, only its respective is modified.
  • Since all fields and HTML elements are defined in each module, changing ids of fields in jspx file requires just a single update in its module.
  • All JQuery dependent code has been moved to the ModuleHelper module. Although it might still require a lot of other changes, replacing JQuery with any other library requires minimal UI code changes.
  • Each HTML element is now uniquely identifiable by its module, even if the jspx file gets included multiple times on the same page.
  • Form validations and error handling are now completely removed from the UI code.

Further Reading:

Following are a few links for further reading on some basic JavaScript features and design patterns:
Functions in JavaScript are First Class Objects
JavaScript Closures
JavaScript Design Patterns


Building Web Services with JAX-WS : An Introduction

Developing SOAP based web services seem difficult because the message formats are complex.  JAX-WS API  makes it easy by hiding the complexity from the application developer. JAX-WS is the abbreviation for Java API for XML Web Services. JAX-WS is a technology used for building web services and clients that communicate using XML.

JAX-WS allows developers to write message-oriented as well as RPC-oriented web services.

A web service operation invocation in JAX-WS is represented by an XML-based protocol like SOAP. The envelope structure, encoding rules and conventions for representing web service invocations and responses are defined by the SOAP specification. These calls and responses are transmitted as SOAP messages over HTTP.

Though SOAP messages appear complex, the JAX-WS API hides this complexity from the application developer.  On the server-side, you can specify the web service operations by defining methods in an interface written in the Java programming Language. You need to code one or more classes that implement these methods.

It is equally easy to write client code. A client creates a proxy, a local object representing the service and then invokes methods on the proxy. With JAX-WS you do not generate or parse SOAP messages. Thanks to JAX-WS runtime system that converts the API calls and responses to and from SOAP messages!

With JAX-WS, web services and clients have a big advantage which is the platform independence of the Java programming language. Additionally JAX-WS is not restrictive: a JAX-WS client can access a web service that is not running on the Java platform and vice versa. This flexibility is possible because JAX-WS uses technologies defined by the World Wide Web Consortium – HTTP SOAP and the Web Service Description Language or WSDL. WSDL specifies an XML format to describe a service as a set of endpoints operating on messages.

Three most popular implementations of JAX-WS are:

Metro: It is developed and open sourced by Sun Microsystems. Metro incorporates the reference implementations of the JAXB 2.x data-binding and JAX-WS 2.x web services standards along with other XML-related Java standards. Here is the project link: You can go through the documentation and download the implementation.

Axis: The original Apache Axis was based on the first Java standard for Web services which was JAX-RPC. This did not turn out to be a great approach because JAX-RPC constrained the internal design of the Axis code and caused performance issues and lack of flexibility. JAX-RPC also made some assumptions about the direction of Web services development, which turned out to be wrong!

By the time the Axis2 development started, replacement for JAX-RPC had already come into picture. So Axis2 was designed to be flexible enough to support the replacement web services standard on top of the base framework. Recent versions of Axis2 have implemented support for both the JAXB 2.x Java XML data-binding standard and the JAX-WS 2.x Java web services standard that replaced JAX-RPC(JAX-RPC: Java API for XML-based Remote Procedure Calls. Since RPC mechanism enables clients also to execute procedures on other systems, it is often used in a distributed client-server model. RPC in JAX-RPC is represented by an XML-based protocol such as SOAP). Here is the project link:

CXF: Another web services stack by the Apache. Though both Axis2 and CXF originated at Apache, they take very different approaches about how web services are configured and delivered. CXF is very well documented and has much more flexibility and additional functionality if you’re willing to go beyond the JAX-WS specification. It also supports Spring. Here is the project link:

At Talentica we have used all the three implementations mentioned above in various projects.

Why CXF is my choice?
Every framework has its strengths but CXF is my choice. CXF has great integration with the Spring framework and I am a huge fan of Spring projects. It’s modular and easy to use. It has great community support and you can find lots of tutorials/resources online. It also supports both JAX-WS and JAX-RS specification but that should not be a considering factor while choosing CXF. Performance-wise it is better than AXIS and it gives almost similar performance when compared to Metro.

So CXF is my choice, what is yours?