VPC Sharing Using AWS RAM (Resource Access Manager)

Over the years AWS has made managing multi-account AWS environments easier. They have introduced consolidated billing, AWS Organizations, cross-account IAM role delegation, and various ways to share resources such as snapshots and AMIs.

In this blog post, I will discuss cross-account VPC sharing using AWS RAM, a new service launched by AWS in November 2018. AWS RAM enables us to share resources with an individual AWS account or across an AWS Organization. If you have multiple AWS accounts, you can create resources centrally and use AWS RAM to share those resources with other accounts.

VPC sharing is a very powerful concept with many benefits:

  • Separation of duties: centrally controlled VPC structure, routing, IP address allocation.
  • Application owners continue to own resources, accounts, and security groups.
  • VPC sharing participants can reference security group IDs of each other.
  • Efficiencies: higher density in subnets, efficient use of VPNs and AWS Direct Connect.
  • Hard limits can be avoided through a simplified network architecture, for example the limit of 50 VIFs per AWS Direct Connect connection.
  • Costs can be optimized through reuse of NAT gateways and VPC interface endpoints, and by keeping traffic within an Availability Zone.

As of this writing, AWS RAM lets us share the following resource types:

  • Subnets
  • Transit Gateways
  • Resolver Rules
  • License Configurations


When you share a resource with another account, that account is granted access to the resource. Any policies and permissions in that account apply to the shared resource.

I will now share subnets from account A (the owner account) with account B (the participant account).

Setting up AWS organization:

Create an AWS Organization in account A and add the participant account B to the Organization.

Invite account B to the AWS Organization by sending a request from the console.



Create a custom VPC and a few subnets in the owner account; these will be shared with the participant account.



Next, enable the resource sharing for your organization from the AWS Resource Access Manager settings in account A.


Now let's start resource sharing by creating a resource share in the "Shared by me" tab.


After providing a description for the resource share, select "Subnets" in the resources tab, then select the subnets you wish to share with the participant account.



The principal is the destination account or the AWS Organization with which the subnets will be shared. I will go with the AWS Organization and select account B in it.



After creating the resource share in owner account A, go to the participant account B and check whether the resource share is visible in the AWS RAM dashboard under the "Shared with me" tab.



The shared subnets will now appear in the participant account B along with the VPC.
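For repeatable setups, the same console flow can be driven from the AWS CLI. Below is a minimal sketch: the subnet ARN and account ID are placeholders, and the commands are composed as strings (a dry run) so they can be reviewed before being executed for real.

```shell
# Placeholders -- substitute your own subnet ARN and participant account ID.
SUBNET_ARN="arn:aws:ec2:us-east-1:111111111111:subnet/subnet-0abc123"
PARTICIPANT="222222222222"

# Step 1: enable sharing with your AWS Organization (run once, in account A).
ENABLE_CMD="aws ram enable-sharing-with-aws-organization"

# Step 2: create the resource share for the subnet, with account B as principal.
SHARE_CMD="aws ram create-resource-share --name shared-subnets --resource-arns $SUBNET_ARN --principals $PARTICIPANT"

# Echoed instead of executed so the sketch is safe to run anywhere.
echo "$ENABLE_CMD"
echo "$SHARE_CMD"
```

Dropping the echoes (with credentials for account A configured) would create the share for real.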



Let's use this VPC to launch resources in the participant account. Navigate to the EC2 dashboard and, while launching an instance, check in the configure instance section that the shared VPC and subnets are available.



Voila! The magic is done!

Things to know:
  • At this moment VPC sharing is only available within the same AWS Organization.
  • We cannot share default VPCs.
  • Participant accounts can't launch resources using security groups that are owned by other participants or the owner.
  • Participants can’t launch resources using the default security group for the VPC because it belongs to the owner.
  • Participants pay for their resources and also pay for data transfer charges associated with inter-Availability Zone data transfer, internet gateways, VPC peering connections, and data transfer through AWS Direct Connect.
  • VPC owners pay hourly charges (where applicable), data processing and data transfer charges across NAT gateways, virtual private gateways, transit gateways, AWS PrivateLink, and VPC endpoints.

AWS ECS (Amazon Elastic Container Service)

In this blog, I will cover the following topics and explain more about AWS Elastic Container Service, a highly scalable, fast, high-performance container management service.

  • Why Docker Containers?
  • ECS Cluster Management
  • EC2 Container Registry
  • ECS Services
  • Auto-Scaling in ECS
  • Monitoring, Logging and Notification

Why Docker Containers?

  • Lightweight, Open Source and Secure
  • Portable and efficient in comparison to VMs
  • Empowers developer creativity
  • Eliminates Environmental Inconsistencies
  • Ability to scale quickly
  • Reduces time to market of your application

Services evolve to microservices


Why is a Container Cluster Management System needed?

  • Provides clustering layer for controlling the deployment of your containers onto the underlying hosts
  • Manages container lifecycle within the cluster
  • Schedules containers across the cluster
  • Scales containers

What is AWS ECS (EC2 Container Service)?

  • Amazon EC2 Container Service (ECS) is a highly scalable, fast and high performance container management service.
  • Easily run, stop and manage Docker containers on a cluster of Amazon EC2 instances.
  • Schedules the placement of Docker containers across your cluster based on resource needs, availability and requirements.

Components of ECS

  • Cluster – Logical group of container instances
  • Container Instance – EC2 instance on which the ECS agent runs and which is registered to a cluster.
  • Task Definition – Description of application to be deployed
  • Task – An instantiation of a task definition running on a container instance
  • Service – Runs and maintains predefined tasks simultaneously
  • Container – Docker Container created during task instantiation

ECS Architecture Overview

Key Components of ECS Architecture

Agent Communication Service – Gateway between ECS agents and ECS backend cluster management engine

API – Provides cluster state information

Cluster Management Engine – Provides cluster coordination and state management

Key/Value Store – It is used to store cluster state information

ECS Agent –

  • It runs on EC2 (container) instances
  • An ECS cluster is a collection of EC2 (container) instances
  • The ECS agent is installed on each EC2 (container) instance
  • The ECS agent registers the instance with the centralised ECS service
  • The ECS agent handles incoming requests for container deployment
  • The ECS agent handles the lifecycle of containers

EC2 Container Registry (Amazon ECR)

  • It is an AWS managed Docker container registry Service.
  • Stores and Manages Docker Images
  • Hosts images in a highly available and scalable architecture
  • It is integrated with ECS.
  • No upfront fee; pay only for the data stored.
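As a quick illustration of the ECS/ECR integration, a typical build-and-push flow looks like the sketch below. The account ID, region and repository name are placeholders, and the commands are echoed rather than executed.

```shell
REGION="us-east-1"
REGISTRY="111111111111.dkr.ecr.us-east-1.amazonaws.com"  # placeholder account/registry
REPO="my-app"

# Authenticate Docker to ECR, then tag and push the image.
LOGIN_CMD="aws ecr get-login-password --region $REGION | docker login --username AWS --password-stdin $REGISTRY"
PUSH_CMD="docker tag $REPO:latest $REGISTRY/$REPO:latest && docker push $REGISTRY/$REPO:latest"

echo "$LOGIN_CMD"
echo "$PUSH_CMD"
```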



Creating ECS Cluster

A cluster can be created using:

  • AWS Console (Manual method)
  • AWS ECS CLI (Manual method)
  • CloudFormation template (IaC and the recommended method)

CloudFormation Example

aws cloudformation create-stack --stack-name dev-ecs-stack --template-body file://master.yaml --parameters file://parameter_dev.json --capabilities CAPABILITY_IAM

ECS Task Definition

A task definition is similar to a docker-compose file.

A task definition can consist of one or more container definitions.

It defines

  • Docker images to use
  • Port and drive volume mappings
  • CPU and memory to use for each container
  • Whether containers are linked
  • Environment variables to pass to the container
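Putting those pieces together, a minimal task definition with a single container might look like this. The family, image, port and environment values are illustrative, and the register call is left commented since it needs AWS credentials.

```shell
# Write a one-container task definition covering the bullets above:
# image, port mapping, CPU/memory, and an environment variable.
cat > taskdef.json <<'EOF'
{
  "family": "web-app",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "nginx:latest",
      "cpu": 128,
      "memory": 256,
      "essential": true,
      "portMappings": [{ "containerPort": 80, "hostPort": 0 }],
      "environment": [{ "name": "APP_ENV", "value": "dev" }]
    }
  ]
}
EOF

# Registering it needs AWS credentials, so the call is commented out:
# aws ecs register-task-definition --cli-input-json file://taskdef.json
```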

ECS services 

  • Allows you to run and maintain a specified/desired number of tasks.
  • If any task fails or stops for any reason, the ECS service scheduler launches another instance of your task definition to maintain the desired task count.

Deploying ECS Cluster

  • Create security groups at the instance and load balancer level.
  • Create an Application Load Balancer
  • Create a launch configuration with an ECS-optimised AWS AMI
  • Create an Auto Scaling group, which specifies the desired number of instances
  • Create a task definition
  • Create a target group and an ECS service
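The final step above can be sketched with the AWS CLI as follows. The cluster name, task definition family and target group ARN are placeholders, and the command is echoed rather than executed.

```shell
CLUSTER="dev-cluster"
TG_ARN="arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/web/0123456789abcdef"  # placeholder

# Create a service that keeps 2 copies of the task running behind the ALB.
CREATE_SVC="aws ecs create-service --cluster $CLUSTER --service-name web --task-definition web-app --desired-count 2 --load-balancers targetGroupArn=$TG_ARN,containerName=web,containerPort=80"

echo "$CREATE_SVC"
```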

Sample ECS architecture

ECS Instance Level Auto Scaling

ECS provides cluster-level parameters which give cluster utilization statistics:

  • Memory Reservation – current % of memory reserved by the cluster
  • Memory Utilization – current % of memory utilized by the cluster
  • CPU Reservation – current % of CPU reserved by the cluster
  • CPU Utilization – current % of CPU utilized by the cluster

CloudWatch alarms on the above parameters enable scaling the ECS cluster up or down.

ECS Service Level Autoscaling

  • ECS also provides the facility to scale the number of tasks in a service up or down.
  • Tasks can be autoscaled on the following ECS service parameters:
    • CPU Utilization – current % CPU utilization by the ECS service
    • Memory Utilization – current % memory utilization by the ECS service

CloudWatch alarms on the above parameters enable scaling the service up or down.
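Service-level autoscaling is configured through Application Auto Scaling. Below is a hedged sketch of the two calls involved; the cluster/service names, capacities and config file are placeholders, and the commands are echoed rather than executed.

```shell
RESOURCE="service/dev-cluster/web"   # placeholder cluster/service

# 1. Register the service's DesiredCount as a scalable target.
REGISTER_CMD="aws application-autoscaling register-scalable-target --service-namespace ecs --scalable-dimension ecs:service:DesiredCount --resource-id $RESOURCE --min-capacity 2 --max-capacity 10"

# 2. Attach a target-tracking policy (config JSON would hold the CPU target).
POLICY_CMD="aws application-autoscaling put-scaling-policy --policy-name cpu-target --service-namespace ecs --scalable-dimension ecs:service:DesiredCount --resource-id $RESOURCE --policy-type TargetTrackingScaling --target-tracking-scaling-policy-configuration file://cpu-config.json"

echo "$REGISTER_CMD"
echo "$POLICY_CMD"
```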

ECS Auto Scaling Overview

Monitoring and Logging


  • Use CloudWatch Logs to centralize all container service logs
  • Follow the "ecs/stackname/servicename" log group format.
  • Get notifications in a Slack channel about CloudWatch ECS alarms and events via an AWS Lambda function.




Key Advantages of ECS Service

  • Easy Cluster Management – ECS sets up and manages clusters made up of Docker containers. It launches and terminates the containers and maintains complete information about the state of your cluster.
  • Auto Scaling – Instance as well as Service level.
  • Zero-downtime deployment – service updates follow blue-green deployment.
  • Resource Efficiency – A containerized application can make very efficient use of resources. You can choose to run multiple, unrelated containers on the same EC2 instance in order to make good use of all available resources.
  • AWS Integration – Your applications can make use of AWS features such as Elastic IP addresses, resource tags, and Virtual Private Cloud (VPC)
  • Service Discovery – used for internal service-to-service communication.
  • Fargate technology – automatically scales, load balances, and manages scheduling of your containers.
  • Secure – Your tasks run on EC2 instances within an Amazon VPC. The tasks can take advantage of IAM roles, security groups, and other AWS security features.

Key Challenges of ECS Service

  • Supported only by AWS.
  • Application-level custom monitoring is not available.


Using Custom Metrics for CloudWatch Monitoring

AWS can dig a crater in your pocket (if not yours, then your client's). And post-downtime meetings with clients can turn sour when the right metrics go unmonitored.

I have been working with AWS for a while now and have learned the hard way that just spinning up the infrastructure is not enough. Setting up monitoring is a cardinal rule. With the proliferation of cloud and microservice-based architecture, you cannot possibly gauge usage, optimize cost or ascertain when to scale up or down without monitoring.

This is not a post on why monitoring is required but rather on why and how to enhance your monitoring using custom metrics for AWS-specific infrastructure on CloudWatch. While CloudWatch provides ready-made metrics for CPU, network bandwidth (both in and out), disk reads, disk writes and a lot more, it does not provide memory and disk metrics. And, considering you are reading this post on custom metrics, you already know that monitoring just the CPU without memory and disk is simply not enough.

Why doesn't AWS provide memory and disk metrics by default like it provides the rest?

Well, CPU and network metrics for EC2 can be fetched externally, while monitoring memory and disk requires access to the servers themselves. AWS does not have access to your servers by default: you need to be inside the server and export the metrics at regular intervals. This is what I have done as well to capture the metrics.

The following are the custom metrics we should monitor:
• Memory Utilized (in %)
• Buffer Memory (in MB)
• Cached Memory (in MB)
• Used Memory (in MB)
• Free Memory (in MB)
• Available Memory (in MB)
• Disk Usage (in GB)
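The heart of these metrics is simple arithmetic over /proc/meminfo. Below is a minimal shell sketch of the "Memory Utilized (%)" computation; the CloudWatch namespace and metric name in the comment are illustrative, not the exact names the scripts use.

```shell
# Used memory % = (MemTotal - MemFree - Buffers - Cached) / MemTotal.
# Reads /proc/meminfo by default; a file path can be passed in for testing.
mem_used_pct() {
  LC_ALL=C awk '/^MemTotal:/ {t=$2} /^MemFree:/ {f=$2} /^Buffers:/ {b=$2} /^Cached:/ {c=$2}
       END { printf "%.1f", (t - f - b - c) * 100 / t }' "${1:-/proc/meminfo}"
}

# The value could then be pushed to CloudWatch, e.g.:
# aws cloudwatch put-metric-data --namespace Custom --metric-name MemoryUtilized \
#   --unit Percent --value "$(mem_used_pct)"
```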

Why did I create these playbooks? Why use custom metrics to monitor?

• Memory metrics are not provided by AWS CloudWatch by default and require an agent to be installed. I have automated the steps to install the agent and added a few features.
• The base script provided by Amazon didn't export some metrics to CloudWatch, like buffer and cached memory, which didn't give a clear picture of memory usage.
• There were times when free memory would indicate 1–2GB but the cached/buffer memory would be consuming that memory and not releasing it, thereby depriving your applications of memory.
• Installing the agent on each server and adding it to cron was challenging, especially if you frequently create and destroy VMs. Why not just use Ansible to install it on multiple servers in one go?

So how do we set up this monitoring?

It's fairly simple:
1. Install Ansible on your machine / locally
2. Clone the repo: $ git clone https://github.com/alokpatra/aws-cloudwatch.git
3. Populate the host file with the details of the servers you want to monitor
4. Allow CloudWatch access to EC2 by attaching an IAM role to the target hosts you want to monitor. To attach a role, go to the Instances section of the AWS Console. Select the instance > Click on Actions > Instance Settings > Attach/Replace IAM Role
5. Run the playbook
6. Create your own custom dashboards on the AWS CloudWatch console.
While I have detailed the ReadMe in the GitHub repo, I'll just discuss a few things in brief here:

What do the scripts do precisely?

Well, the scripts are an automated and improved version of the steps to deploy the CloudWatch agent. Since I had used Ansible extensively before, I wrote an Ansible role to simplify deploying the agent to multiple servers.
The Ansible role does the following:
• Identifies the distribution
• Installs the prerequisite packages as per the OS flavor
• Copies the Perl scripts to the target machine
• Sets the cron job to fetch and export the metrics at regular intervals (default of 5 mins)

Minor changes have also been made to the Perl script to export the cached and buffer memory, which I found quite useful.

Supported OS Versions

• Amazon Linux 2
• Ubuntu


1. Ansible installed on the host machine to deploy the scripts on the target machines/servers. I have used Ansible version 2.7.
•  To install Ansible on Ubuntu you can run the following commands or follow this link
$ sudo apt update
$ sudo apt install software-properties-common
$ sudo apt-add-repository ppa:ansible/ansible
$ sudo apt update
$ sudo apt install ansible
•  On Amazon Linux 2 you need to run the following commands; obviously, there is no DigitalOcean guide to follow
$ sudo yum-config-manager --enable epel
$ yum repolist (you should see epel)
$ yum install ansible
2. CloudWatch access for Amazon EC2. The EC2 instances need access to push metrics to CloudWatch, so create an IAM role 'EC2AccessToCloudwatch' and attach a policy allowing 'write' access from EC2 to CloudWatch. Now attach this IAM role to the target hosts you want to monitor. In case you already have a role attached to the instance, add the above policy to that role.
The other alternative is to export access keys to the servers (the playbooks are not updated for this option yet). I have used the IAM option, which avoids keeping keys on the server (often a security concern) and the difficulty of rotating the credentials later.
3. SSH access to the target hosts, i.e. the hosts where you want the agent installed, since Ansible uses SSH to connect to managed hosts.
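For step 2, the 'write' access amounts to a small IAM policy. Here is a sketch; the policy file name is illustrative, and the attach call is commented out since it needs credentials.

```shell
# Minimal policy: EC2 instances may push metric data to CloudWatch.
cat > ec2-to-cloudwatch-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["cloudwatch:PutMetricData"],
      "Resource": "*"
    }
  ]
}
EOF

# Attach it to the role from step 2 (requires credentials, hence commented):
# aws iam put-role-policy --role-name EC2AccessToCloudwatch \
#   --policy-name cloudwatch-write \
#   --policy-document file://ec2-to-cloudwatch-policy.json
```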

How to run the playbooks?

• Populate the inventory/host file with the hosts/IPs. A sample host file is present in the repo
• Run the main playbook with the following command, which in turn calls the role CloudWatch-agent: $ ansible-playbook -i hosts installer.yaml -vv

This will install the agent. Now you can go ahead and create the dashboards on CloudWatch.

Below is a sample Dashboard I have created. You might want to customize the widgets as per your requirement.

The below dashboard has 2 widgets:
i. It gives a high-level picture of the overall Memory Utilization Percent. This tells you which server is running high on memory, and the spikes if any.
ii. Free Memory (in MB). This is read after the one above to obtain the free memory on a particular server you see has high utilization.

Before the next Dashboard which is a level deeper, let’s just glance at the Total Memory Pie Chart.

The following dashboard digs deeper into the memory metrics, to understand the exact distribution of memory and where it is consumed.

Hope the post was useful.
Happy Monitoring!

Basics of Ansible and Installation

What is Ansible?


Ansible is open-source software that automates software provisioning, configuration management, and application deployment. Ansible connects via SSH, remote PowerShell or other remote APIs.


How does Ansible work?


Ansible works by connecting to your nodes and pushing out small programs, called "Ansible modules", to them. These programs are written to be resource models of the desired state of the system. Ansible then executes these modules (over SSH by default) and removes them when finished.


Key Features of Ansible: 

  • Models the IT infrastructure around the systems interrelating with each other, thus ensuring faster end results.
  • Module library can reside on any system, without the requirement of any server, daemons or databases.
  • No additional setup required, so once you have the instance ready you can work on it straight away.
  • Easier and faster to deploy as it doesn’t rely on agents or additional custom security infrastructure.
  • Uses a very simple language structure called playbooks. Playbooks read almost like plain English for describing automation jobs.
  • Ansible has the flexibility to allow user-made modules that can be written in any programming language, such as Ruby or Python. It also allows adding new server-side behaviours, extending Ansible's connection types through Python APIs.


Terms in Ansible:


1) Playbooks

Playbooks express configurations, deployment, and orchestration in Ansible. The Playbook format is YAML. Each Playbook maps a group of hosts to a set of roles. Each role is represented by calls to Ansible tasks.
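As a concrete illustration of that mapping, a minimal playbook might look like the sketch below; the host group "webservers" and the role "basic" are assumptions, standing in for whatever exists in your inventory and roles directory.

```yaml
# Maps the "webservers" host group to the "basic" role, plus one ad-hoc task.
- hosts: webservers
  become: yes
  roles:
    - basic
  tasks:
    - name: Ensure ntp is installed
      yum:
        name: ntp
        state: present
```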



2) Ansible Tower

Ansible Tower is a REST API, web service, and web-based console designed to make Ansible more usable for IT teams with members of different technical proficiencies and skill sets. It is a hub for automation tasks. Tower is a commercial product supported by Red Hat, Inc. Red Hat announced during AnsibleFest 2016 that it would release Tower as open source software.

Ansible Architecture:


(On an AWS EC2 Linux Free Tier instance, Python and SSH are already installed.)

  • Python version – 2.7.13
  • Three servers:
  • Ansible control server (install Ansible using the EPEL repository; on AWS you have to enable this repo)
  • WebServer
  • DBServer


How to connect between these servers?

To ping these servers (webserver and dbserver) from the Ansible control server, you have to add one inbound rule, "All ICMP traffic", in both instances.

  • Ansible Control Server
  • Install Ansible on Redhat

wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

rpm -ivh epel-release-latest-7.noarch.rpm

yum repolist

yum --enablerepo=epel install ansible

  • Install Ansible on AWSLinux
vim /etc/yum.repos.d/epel.repo


sudo yum-config-manager --enable epel

yum repolist ( you should see epel)

yum install ansible

Create an entry for all servers in the /etc/hosts file as shown below

vim /etc/hosts

Create one user “ansadm” on all the servers as shown below

After adding the user, try to SSH in as ansadm. You will get the error below because SSH is not set up yet

How to Setup SSH

  • Generate ssh key on ansible control server.
  • https://www.youtube.com/watch?v=5KmQMfEqYxc
  • Run ssh-keygen on the Ansible control server after logging in as ansadm (SSH keys are user specific)
  • This will create .ssh folder (/home/ansadm/.ssh)
  • Create an authorized_keys on both the servers and copy the key from ansible control server as shown below


[ansadm@ip-172-31-21-35 ~]$ ssh-copy-id -i ansadm@
 /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/ansadm/.ssh/id_rsa.pub"
 /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/usr/bin/ssh-copy-id: WARNING: All keys were skipped because they already exist on the remote system.
  (if you think this is a mistake, you may want to use -f option)

[ansadm@ip-172-31-21-35 ~]$ ssh ansadm@
 Last login: Thu Jan 11 13:34:31 2018

__| __|_ )
  _| ( / Amazon Linux AMI

 [ansadm@ip-172-31-19-214 ~]$ exit

Now all three servers are configured; the Ansible control server can SSH to both the other servers

Change the ownership of the /etc/ansible folder to ansadm

chown -R ansadm:ansadm /etc/ansible

vim /etc/ansible/hosts


ansible.cfg file (this is Ansible's configuration file; the hosts file above is the inventory)

Ansible commands (we run all commands only on the control server; all other servers are managed by it)

To install any package you have to be root. So we are giving the controller's ansadm user sudo rights on all machines (except the controller)

vi /etc/sudoers


Now run the same command with -s option

Ansible Roles

Roles are the next level of abstraction of an Ansible playbook. A role is the list of tasks that Ansible will execute on target machines in the given order

Playbook – decides which role is for which target machine

[ansadm@ip-172-31-21-35 ansible]$ mkdir roles/basic
[ansadm@ip-172-31-21-35 ansible]$ mkdir roles/basic/tasks
[ansadm@ip-172-31-21-35 ansible]$ cd roles/basic/tasks
[ansadm@ip-172-31-21-35 tasks]$ vi main.yml

[ansadm@ip-172-31-21-35 ansible]$ cat /etc/ansible/roles/basic/tasks/main.yml

- name: Install ntp
  yum: name=ntp state=present
  tags: ntp

[ansadm@ip-172-31-21-35 ansible]$ vi playbook.yml
[ansadm@ip-172-31-21-35 ansible]$ ansible-playbook -K playbook.yml

[ansadm@ip-172-31-21-35 ansible]$ cat playbook.yml
- hosts: all
  roles:
    - basic

ansible-playbook <playbook> --list-hosts

To check if HTTPd is installed, the easiest way is to ask rpm:

rpm -qa | grep httpd
  • Verify the playbook for syntax errors:

#ansible-playbook file_name.yml --syntax-check

  • To see what hosts would be affected by a playbook

#ansible-playbook file_name.yml --list-hosts

  • Run a playbook

# ansible-playbook file_name.yml



Ansible is easy to learn, and managing resources using Ansible can be extremely efficient. Here we learned about Ansible's basic concepts, installation steps and different features.

Spot Fleet Termination and Remove Stale Route53 Entries

When it comes to cluster management, we may need many slave machines to run our tasks/applications. In our project we have an Apache Mesos cluster which runs around 200 EC2 instances in production alone, and for real-time data processing we use an Apache Storm cluster with around 150 supervisor machines. Rather than running all these machines on-demand, we run them on Spot Fleet in AWS.

Now the question is: what is Spot Fleet? To understand Spot Fleet, first look at what Spot Instances are.

Spot Instances

AWS must maintain a huge infrastructure with a lot of unused capacity. This unused capacity is basically the available Spot Instance pool: AWS lets users bid for these unused resources, usually at a significantly lower price than the on-demand price. So we can get AWS EC2 boxes at a much lower price compared to their on-demand price.


A Spot Fleet is a collection, or fleet, of Spot Instances, and optionally On-Demand Instances. The Spot Fleet attempts to launch the number of Spot Instances and On-Demand Instances to meet the target capacity that you specified in the Spot Fleet request.

Above is a screenshot of an AWS Spot Fleet request, in which we are launching 20 instances.

Spot instance lifecycle:

  • The user submits a bid to run the desired number of EC2 instances of a particular type. The bid includes the price that the user is willing to pay to use the instance.
  • If the bid price exceeds the current spot price (which is determined by AWS based on current supply and demand), the instances are started.
  • If the current spot price rises above the bid price, or there is no available capacity, the spot instance is interrupted and reclaimed by AWS. Two minutes before the interruption, the internal metadata endpoint on the instance is updated with the termination info.

Spot instance termination notice

The termination notice is accessible to code running on the instance via the instance's metadata at http://169.254.169.254/latest/meta-data/spot/termination-time. This field becomes available when the instance has been marked for termination and will contain the time when a shutdown signal will be sent to the instance's operating system.

The most common way discussed to detect that the two-minute warning has been issued is by polling the instance metadata every few seconds, at the same http://169.254.169.254/latest/meta-data/spot/termination-time endpoint.

This field will normally return a 404 HTTP status code but once the two-minute warning has been issued, it will return the time that shutdown will actually occur.

This can only be accessed from the instance itself, so you have to put this code on every spot instance that you are running. A simple curl to that address will return the value. You might be thinking of setting up a cron job, but do not go down that path: the smallest interval you can run something with cron is once a minute, and if you miss the notice by a second or two you will not detect it until the next minute, losing half of the time available to you.

 Sample Snippet:
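A minimal sketch of such a polling snippet is below. The metadata URL is the standard EC2 endpoint; the drain logic in the comment is a placeholder for whatever your instance needs to do.

```shell
META_URL="http://169.254.169.254/latest/meta-data/spot/termination-time"

# Returns "terminating" once the endpoint answers 200 (warning issued),
# "running" while it still answers 404.
check_spot_termination() {
  code=$(curl -s -o /dev/null -w '%{http_code}' "$1")
  if [ "$code" = "200" ]; then
    echo "terminating"   # here: deregister from the cluster, drain in-flight work, alert
  else
    echo "running"
  fi
}

# Main loop, commented out so the sketch does not block:
# while [ "$(check_spot_termination "$META_URL")" = "running" ]; do sleep 5; done
```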

Below is the alert that we receive in Slack.

Delete Route53 Entries

We create DNS entries for all our Mesos and Storm boxes, but whenever those instances get deleted their DNS entries remain, which causes lots of useless entries under Route 53. So we came up with an idea: why not have a Lambda function that is triggered whenever a Spot Fleet instance gets terminated?

We created a CloudWatch rule:

So, whenever an instance gets terminated this rule fires and triggers a Lambda function, which deletes the Route 53 entry of the terminated instance.

Below is the code snippet:
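At its core, the deletion boils down to a change-resource-record-sets call with a DELETE change batch. Here is a hedged shell sketch of that call; the zone ID, record name and IP are placeholders, not our real values.

```shell
RECORD_NAME="mesos-slave-01.example.internal."  # placeholder, looked up from instance tags
RECORD_IP="10.0.1.23"                           # placeholder, the terminated instance's IP
ZONE_ID="Z0PLACEHOLDER"                         # placeholder hosted zone ID

# Build the DELETE change batch; the record must match the existing entry exactly.
cat > change-batch.json <<EOF
{
  "Changes": [
    {
      "Action": "DELETE",
      "ResourceRecordSet": {
        "Name": "$RECORD_NAME",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [{ "Value": "$RECORD_IP" }]
      }
    }
  ]
}
EOF

# Needs AWS credentials, so left commented:
# aws route53 change-resource-record-sets --hosted-zone-id "$ZONE_ID" \
#   --change-batch file://change-batch.json
```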

Each Mesos and Storm instance is tagged, and we use that tagging to find and destroy the entries.


From spot instance termination we get a two-minute warning that the instance is about to be reclaimed by AWS. These are precious moments in which we can have the instance deregister from accepting any new work and finish up any work in progress. Apart from this, using a Lambda function we can remove stale Route 53 entries.

Automation frenzy deployment using OpsWorks, Jenkins and AWS CLI

What is deployment: Deployment is what developers want the DevOps team to carry out.

They (developers) code out some nice stuff which the above-mentioned people (DevOps) are responsible for handling with care and passing to the production servers, so the code is hosted in some way. Trust me, it just sounds easy.

It's an era of cloud, so let's take examples of cloud services, especially AWS (we love AWS here in Talentica). So consider that your infrastructure or servers shall be on AWS.

Now, when setting up your infra (the platform to host your code), you shall chalk out a plan, a process to construct an architecture over it. But what if you don't know how infra works?

Well, there is this service, AWS OpsWorks, which is quite simple to understand.
OpsWorks is based on another tool, Chef. You don't need to know much about Chef to use AWS OpsWorks, though. Still: Chef is a configuration management tool which follows idempotency, and it is mostly scripted in the Ruby language.

So, whatever you want the Chef tool to run can be called a "recipe".
A recipe shall contain definitions (blocks of code) for what you want your server to work on.

Here is an example of a recipe to make the talking make some sense:

user "Add a user" do
   home "/home/joe"
   shell "/bin/bash"
   username "joe"
end

Ruby code generally looks like "do … end" blocks.

Here OpsWorks will fetch this block of code from wherever you want, say a local machine or by pulling it from GitHub/Bitbucket directly onto the server, then read the Ruby code and create a user with the username "joe".

Let us complicate things a bit-

Consider your infra /platform which hosts your website or whatever block of code on the INTERNET looks something like this:


Here at the bottom there is the database server (we use AWS RDS). The second bottommost level is a herd of servers; well, technically we call it a cluster, but that's okay. The catch is that all the servers are duplicated: they all do the same thing. Why? Because we are "load balancing" between the servers, that's why.

NOW, as I said, let's complicate things. Imagine you want the same architecture just for testing; you want to test your code first in one environment (consider the whole image above as one environment). So now we have two images the same as above, but in QA there are no users and it is not available directly on the internet (don't ask why!).
One fine day we need to add another environment for some other purpose, call it loadtest, to check how much load our architecture can withstand. And we also want another environment for demo, which shall always be available, as we need to show a demo website which is not like QA and also not like production.

This creates a problem statement: "How am I supposed to handle this alone?!"

This is the part of the blog where we will have to dive deep into what’s OpsWorks.

In Opsworks:

STACKS: A stack, in the infra sense, is all the components used in your environment. With multiple environments, as in our pictorial example, we shall have 4 stacks, named prod-stack, QA-stack, demo-stack and loadtest-stack.

How to create them? It's easy: go to the AWS Console -> search for OpsWorks -> Get started -> Create stacks. I don't want this blog to be too full of images; if you face any problem doing so, you can mail me.

LAYERS: Wondering when Jenkins will come up in this blog? Wait!
So, talking about layers, I need to give a good example here.

Layers are the actual components within the stacks, they can also be categorized as applications.

Let’s go back to our stack scenario diagram where we will have a slight difference this time.

Picking up the "QA" environment: here we have two applications hosted under the QA environment, where QA is the "stack" and qa.abc.test.com and qa.test.com are the "layers".

So here qa.abc.test.com is a branch of the QA environment which hosts somewhat different code compared to the main domain (qa.test.com).

Hence, for different code there shall be different hosting space, different deployment space, perhaps different servers.

Here is what the AWS documentation says about OpsWorks layers:
Every stack contains one or more layers, each of which represents a stack component, such as a load balancer or a set of application servers.

So, when we desire to make this kind of architecture, OpsWorks helps us a lot. It is very easy to simply add layers.

Here is an example of how easily we can add layers: just a click here, a dropdown selection there, and you are done. You can integrate many other AWS services, such as ECS, with OpsWorks. We can add a DB (RDS panel) as a layer too.
OpsWorks nicely segregates the layers but also keeps them together in a single stack.

Deploying on the layers
1. BOOT-TIME Deployment
2. Code-Deployment

BOOT-TIME Deployment

Boot-time deployment is basically the stuff you want installed when a new server (instance) boots up. Whatever you specify is installed on the new server at every boot, so you don’t need to install it again and again after launching new instances.
The installations are nothing but recipes running on the servers.

Say you wish to install the Apache server on all servers in a particular layer, or in the whole stack: write a Ruby recipe (like we did for adding a user) that installs Apache, then boot up an instance to check that Apache is already installed.

There can be many such important recipes you will need to run on a newly launched instance.
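To make that concrete, here is a rough sketch of what such a boot-time recipe might look like, assuming a Chef-based OpsWorks stack on Ubuntu (the recipe file name and cookbook layout are hypothetical):

```ruby
# cookbooks/webserver/recipes/apache.rb -- hypothetical boot-time recipe
# Installs Apache (the Ubuntu package name) and enables it at boot
package 'apache2' do
  action :install
end

service 'apache2' do
  action [:enable, :start]
end
```

Attach a recipe like this to the layer’s Setup lifecycle event, and every instance that boots into the layer gets Apache installed automatically.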


JENKINS: Here is its entry.
Jenkins is a Continuous Integration Continuous Delivery tool.
It carries out deployments smoothly. Here we are using Jenkins and OpsWorks in such a way that Jenkins knows the following points:
1. How many stacks are present
2. How many layers are present in a stack
3. How many instances (servers) are present in the layers
4. Which layer to deploy
5. Which layer not to deploy
6. Public IP addresses of the instances on which we need to run the code deploy

To know all these points, we have (or require, if you don’t already have it) the AWS CLI (command line interface). Using particular aws query commands, we get JSON outputs.

Jenkins can run these aws commands and get the JSON outputs. Let’s take an example.

“aws opsworks describe-stacks” — this command gives a JSON output with the names of our stacks, such as the QA stack. Suppose you want to deploy to the QA stack: from that JSON output we simply have to fetch “QA-stack”. We can use “jq” to parse the JSON output; here is an example:

OPS_STACK_ID=$(aws opsworks describe-stacks | jq '.Stacks [] | select(.Name=="'"$ENV-stack"'")'| jq '.StackId' | cut -d '"' -f2)

(covers up point number 1)

Here we get the stack ID of our $ENV-stack, where $ENV is a variable set to “QA” (parameterized in Jenkins). So basically, Jenkins has got the QA-stack’s ID.
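To make the jq selection concrete, here is the same extraction sketched in Python against a trimmed, invented describe-stacks payload (the stack names and IDs are made up for illustration):

```python
import json

# Trimmed, invented sample of `aws opsworks describe-stacks` output
payload = json.loads("""
{
  "Stacks": [
    {"Name": "prod-stack", "StackId": "11111111-aaaa-2222-bbbb-333333333333"},
    {"Name": "QA-stack",   "StackId": "44444444-cccc-5555-dddd-666666666666"}
  ]
}
""")

env = "QA"  # what Jenkins passes in as $ENV
# Same selection as: jq '.Stacks[] | select(.Name=="'"$ENV-stack"'")' | jq '.StackId'
stack_id = next(s["StackId"] for s in payload["Stacks"] if s["Name"] == f"{env}-stack")
print(stack_id)
```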

Jenkins will further describe this QA-stack ahead to get the json output for how many layers are present in this stack.

OPS_LAYER_ID=$(aws opsworks --region us-east-1 describe-layers --stack-id $OPS_STACK_ID |jq '.Layers[] | select (.Name=="'"qa.abc.test-layer"'")'| jq '.LayerId')

(covers up point number 2, 4, 5)

Using jq we can fetch the layer ID we desire by just passing its name, as in the command above.
After getting the layer ID we can describe that layer to find out how many servers are running under it. According to our QA infra diagram there are two layers; the layer named “qa.abc.test.com” has one instance, which hosts the domain.

aws opsworks --region us-east-1 describe-instances --layer-id $OPS_LAYER_ID | jq '.Instances[].PublicIp' | wc -l

(covers number 3)

This command simply returns the number of public IPs of the servers inside the layer.

aws opsworks --region us-east-1 describe-instances --layer-id $OPS_LAYER_ID | jq '.Instances[].PublicIp'

(covers number 6)

This is the same command, but it prints the values of the public IPs themselves.
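The same idea as before applies to the instance queries; in Python, with an invented describe-instances payload (instance IDs and IPs are made up):

```python
import json

# Invented sample of `aws opsworks describe-instances --layer-id ...` output
payload = json.loads("""
{
  "Instances": [
    {"InstanceId": "i-1", "PublicIp": "203.0.113.10"},
    {"InstanceId": "i-2", "PublicIp": "203.0.113.11"}
  ]
}
""")

# jq '.Instances[].PublicIp'  -> the IPs themselves
public_ips = [i["PublicIp"] for i in payload["Instances"]]
# ... | wc -l                 -> how many servers to deploy to
instance_count = len(public_ips)
print(public_ips, instance_count)
```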

We can SSH into each server’s public IP and then run the deploy.sh shell script, which contains all the steps required for the code deploy (like git pull or git clone).

Here is what the Jenkins job code would look like:

#!/bin/bash -x

subject="Deployment started for $ENV"
echo "$subject" | mail -s "$subject" "email ids"

if [ -z "$BRANCH_NAME" ]; then
    echo "Exiting..Git branch name not set..!!"
    exit 1
fi

echo "ENV=$ENV" > sshenv.txt
echo "BRANCH_NAME=$BRANCH_NAME" >> sshenv.txt

# This command is important in any case, as we will always need the stack id;
# it stays the same for any deployment
OPS_STACK_ID=$(aws opsworks describe-stacks | jq '.Stacks[] | select(.Name=="'"$ENV-stack"'")' | jq '.StackId' | cut -d '"' -f2)

OPS_LAYER_ID=$(aws opsworks --region us-east-1 describe-layers --stack-id $OPS_STACK_ID | jq '.Layers[] | select(.Name=="'"qa.abc.test-layer"'")' | jq '.LayerId' | cut -d '"' -f2)

# Now that we have the layer id, take a count of the instances and their public
# IPs (the instances we are going to deploy to)
OPS_INSTANCE_COUNT=$(aws opsworks --region us-east-1 describe-instances --layer-id $OPS_LAYER_ID | jq '.Instances[].PublicIp' | cut -d '"' -f2 | wc -l)

if [ "$OPS_INSTANCE_COUNT" -eq "0" ]; then
    echo "Exiting as no servers present for deployment"
    exit 1
fi

### Run deploy via ssh on each server
for i in $(aws opsworks --region us-east-1 describe-instances --layer-id $OPS_LAYER_ID | jq '.Instances[].PublicIp' | cut -d '"' -f2); do
    echo $i  ## TESTING
    sudo scp -i $keyfile -o StrictHostKeyChecking=no sshenv.txt $user@$i:/tmp/envfile
    sudo ssh -i $keyfile -o StrictHostKeyChecking=no $user@$i 'bash -x /var/scripts/deploy.sh'
done

Here I conclude that with OpsWorks, Jenkins and the AWS CLI we have created a highly available architecture with one-click/auto deployment configured.

ELK Setup And Email Alerting

What’s ELK Stack?

“ELK” is the acronym for three open source projects: Elasticsearch, Logstash, and Kibana.

Introduction: ELK services:

Elasticsearch: It’s a search and analytics engine.

Logstash: It’s a server-side data processing pipeline used for centralized logging, log enrichment and parsing.

Kibana: It lets users visualize Elasticsearch data with charts and graphs.

Note: I will be using Amazon EC2 instances with Ubuntu 16.04 LTS operating system for whole ELK and alerting setup.

We need Java8 for this setup first!!!

  • Setup Java8 on Ubuntu EC2 server by following commands:

$ sudo add-apt-repository -y ppa:webupd8team/java

$ sudo apt-get update -y

$ sudo apt-get -y install oracle-java8-installer

Lets Start with ELK services setup !!!

1) Setup Elasticsearch in our Ubuntu EC2 servers:

  • Download and install the public signing key:

$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

  • Installing from the APT repository

$ sudo apt-get install apt-transport-https

  • Save the repository definition to /etc/apt/sources.list.d/elastic-5.x.list:

$ echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-5.x.list

  • Finally install Elasticsearch

$ sudo apt-get update && sudo apt-get install elasticsearch

  • After installing Elasticsearch, make the following change in the configuration file “elasticsearch.yml”:

$ sudo vim /etc/elasticsearch/elasticsearch.yml

Set “network.host” to localhost.

  • After saving and exiting file:

$ sudo systemctl restart elasticsearch

$ sudo systemctl daemon-reload

$ sudo systemctl enable elasticsearch


2) Setup Kibana in our Ubuntu EC2 servers:

  • Download and install the public signing key:

$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

  • Installing from the APT repository

$ sudo apt-get install apt-transport-https

  • Save the repository definition to /etc/apt/sources.list.d/elastic-5.x.list:

$ echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-5.x.list

  • Finally install Kibana

$ sudo apt-get update && sudo apt-get install kibana

  • After installing Kibana, make the following change in the configuration file “kibana.yml”:

$ sudo vim /etc/kibana/kibana.yml

Set “server.host” to localhost.

  • After saving and exiting file:-

$ sudo systemctl restart kibana

$ sudo systemctl daemon-reload

$ sudo systemctl enable kibana


3) Setup Nginx for Kibana UI:

  • Following commands to install nginx:

$ sudo apt-get -y install nginx

$ echo "kibanaadmin:`openssl passwd -apr1`" | sudo tee -a /etc/nginx/htpasswd.users

  • Change nginx configuration file to listen at port 80 for Kibana:

$ sudo nano /etc/nginx/sites-available/default

  • Add the following in the file and save and exit file:
server {
    listen 80;
    server_name example.com;

    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/htpasswd.users;

    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}


  • For checking Nginx configuration:-

$ sudo nginx -t

$ sudo systemctl restart nginx

$ sudo ufw allow 'Nginx Full'


4) Setup Filebeat on a different EC2 server with Amazon Linux image, from where logs will come to ELK:

  • Following commands to install filebeat:

$ sudo yum install filebeat

$ sudo chkconfig --add filebeat

  • Changes in the Filebeat config file (/etc/filebeat/filebeat.yml); here we can add different types of logs [tomcat logs, application logs, etc.] with their paths:

filebeat:
  prospectors:
    -
      paths:
        - /var/log/tomcat/*.log
      input_type: log
      document_type: tomlogs
  registry_file: /var/lib/filebeat/registry

output:
  logstash:
    hosts: ["elk_server_ip:5044"]
    bulk_max_size: 1024

  • After changes done, save and exit the file and restart filebeat service with:-

$ service filebeat restart

$ service filebeat status [It should be running].

5) Setup Logstash in our ELK Ubuntu EC2 servers:

  • Following commands via command line terminal:

$ sudo apt-get update && sudo apt-get install logstash

$ sudo systemctl start logstash.service

  • Logstash Parsing of variables to show in Kibana:-

Make file in /etc/logstash/conf.d as “tomlog.conf” and add the following:

input {
  beats {
    port => 5044
    ssl => false
  }
}

filter {
  if [type] == "tomlogs" {
    grok {
      patterns_dir => ["./patterns"]
      match => { "message" => "%{IPORHOST:IP} - - %{SYSLOG5424SD:timestamp}%{SPACE}%{NOTSPACE:deviceId}%{SPACE}%{NOTSPACE:sessionId}%{SPACE}%{NOTSPACE:latitude}%{SPACE}%{NOTSPACE:longitude}%{SPACE}%{QUOTEDSTRING:query_string}%{SPACE}%{NOTSPACE:response}%{SPACE}%{NOTSPACE:byte_sent}%{SPACE}%{NOTSPACE:response_in_milli}" }
    }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    sniffing => true
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

  • For testing Logstash config:

$ sudo /usr/share/logstash/bin/logstash --config.test_and_exit -f /etc/logstash/conf.d/
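A grok pattern like the one in tomlog.conf is essentially a regular expression with named captures. The sketch below mimics a small subset of it in plain Python; the sample log line and the chosen fields are illustrative assumptions, not the real application log format:

```python
import re

# Rough named-capture analogue of a few grok fields (IP, timestamp, response,
# bytes sent, response time in ms)
pattern = re.compile(
    r'(?P<IP>\S+) - - \[(?P<timestamp>[^\]]+)\]\s+'
    r'(?P<response>\d{3})\s+(?P<byte_sent>\d+)\s+(?P<response_in_milli>\d+)'
)

line = '10.0.0.1 - - [21/Jun/2018:07:19:02 +0000] 200 5120 37'  # made-up sample
m = pattern.match(line)
print(m.group('IP'), m.group('response'), m.group('response_in_milli'))
```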

  • After that restart all ELK services:

$ service logstash restart

$ service elasticsearch restart

$ service kibana restart


6) Browser: Kibana UI setup:

  • To load Kibana UI:

Open: http://elk_server_public_ip

  • Then enter the username and password for Kibana (the kibanaadmin user and password created in the Nginx step).
  1. Go to Management: Add index: filebeat-* .
  2. Discover page should now show your system logs parsed under filebeat-* index.

7) Setup Elastalert for Email Alerting system:

  • SSH again in ELK EC2 Ubuntu server and do following:
  • Add cron in /etc/crontab:-
00 7    * * *   ubuntu    cd /home/ubuntu/elastalert && ./run_elastalert_with_cron.sh
  • Install elastalert on Ubuntu server:

$ git clone https://github.com/Yelp/elastalert.git

$ cd elastalert

$ pip install "setuptools>=11.3"

$ python setup.py install

$ elastalert-create-index

  • Create rule:

$ cd  ~/elastalert/rules && touch rule.yml

$ cd ~/elastalert

$ vim config.yml

  • Add the following to rules/rule.yml (these fields define a frequency rule):

es_host: "localhost"

es_port: 9200

name: "ELK_rule_twiin_production"

type: "frequency"

index: "filebeat-*"

num_events: 1

timeframe:
  hours: 24

filter:
- query:
    query_string:
      query: 'type:syslogs AND SYSLOGHOST:prod_elk_000'

alert:
- "email"

realert:
  minutes: 0

email: "example@email.com"
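Conceptually, a frequency rule fires when at least num_events matching events arrive within one timeframe. A toy sketch of that logic (this is an illustration, not ElastAlert’s actual implementation):

```python
from datetime import datetime, timedelta

def should_alert(event_times, num_events, timeframe):
    """Fire when at least num_events events fall within one timeframe-sized window."""
    if not event_times:
        return False
    latest = max(event_times)
    recent = [t for t in event_times if latest - t <= timeframe]
    return len(recent) >= num_events

now = datetime(2018, 6, 21, 7, 0)
events = [now - timedelta(hours=h) for h in (0, 1, 30)]  # two events in the last 24h
print(should_alert(events, num_events=1, timeframe=timedelta(hours=24)))
```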

  • And the following goes into the main config.yml:

rules_folder: rules

run_every:
  minutes: 30

buffer_time:
  hours: 24

es_host: 'localhost'

es_port: 9200

writeback_index: elastalert_status

smtp_host: 'localhost'

smtp_port: 25

alert_time_limit:
  days: 0

from_addr: "your_addr@domain.com"


  • Cron script (run_elastalert_with_cron.sh) for running Elastalert to send email once a day:

#!/bin/bash

# Reset the elastalert status index so the daily run re-evaluates alerts
curl -X GET "http://localhost:9200/elastalert_status"

curl -X DELETE "http://localhost:9200/elastalert_status"

# Kill any previous instance and start elastalert fresh
pkill -f elastalert.elastalert

cd /home/ubuntu/elastalert && python -m elastalert.elastalert --config config.yml --rule rules/rule.yml --verbose

Save and exit Shell script.

Now, it will run daily at 7:00 UTC according to my cron setup in the first step!!!

Let’s have a demo !!!

1) SSH to the EC2 server [in my case an Amazon Linux machine] where logs are getting generated:

  • Go to the directory where we installed Filebeat service to create indices of our logs:

$ cd /etc/filebeat

$ vi filebeat.yml

  • Put your ELK server’s IP address for getting output in that server:-

  • Logs for Filebeat service will be stored at location: /var/log/filebeat:

$ cd /var/log/filebeat

$ ls -ltrh

  • The logs we need for indexing via Filebeat will be our tomcat logs, we can have them anywhere:-
  • In my case , they are at location: /var/log/containers/:

  • Now, let’s jump to ELK server to check how indices are coming there:-
  • Check the status of all ELK services:-

  • Go to Logstash conf directory to see the parsing of logs:

$ cd /etc/logstash/conf.d

$ vi tomlog.conf

  • Logs for Logstash service , will be at location: /var/log/logstash:

  • Now let’s check our Elasticsearch conf, at location /etc/elasticsearch:
  • You can define indices storage location and host in elasticsearch.yml:-

  • We can check elasticsearch services’s logs at /var/log/elasticsearch:

  • Lets checkout Kibana conf at location /etc/kibana:
  • We just need to define the host for Kibana UI:

  • Now, let’s move to Kibana UI

In browser: http://public_ip_elk_server

  • Enter username and password for Kibana, then it all starts:

  • Once you are logged in, you will go to Management:

  • Create the index for logs, for me, it’s filebeat-*:

  • As soon as we create the index pattern, all parsed fields will appear there with their details.

  • Then we can search our query normally and add the fields whose values we want to check:

  • Let’s check, how to create a visualization in Kibana:

  • We can choose our desired graph pattern:

  • I am choosing Vertical bar:

  • Now for graphs, we need parameters to be framed in X-Y axis:
  • Let’s make a new search and save it first for Visualizations:

My search is:- “type:tomlogs AND response:200”:

  • Now save the search in Kibana for further making visualizations:

  • Now go to the Visualization board again:

  • Now we can select our saved search to make visualizations:
  • We selected our Y-axis to be a field: Response with Aggregation: Average:

  • Now save your visualization:

  • Now I made more responses to make a dashboard further:

  • Now let’s jump to Kibana Dashboards:

  • Create a dashboard:
  • As we start, Kibana will ask to add visualizations:

  • I chose both of my visualizations:

  • Save your dashboard:

  • Now you can browse more out of your dashboard according to Time Range at right top corner:

  • Just for little insight of Dev Tools:
  • We can query using APIs for our mappings or templates or Indices:
  • Let’s have all indices for the query as:

Hence, we have covered almost all features of Kibana !!!


The ELK Stack is a fantastic piece of software. It is most commonly used for log analysis in IT environments (though there are many more use cases, including business intelligence, security and compliance, and web analytics).

Logstash collects and parses logs, and then Elasticsearch indexes and stores the information.

Kibana then presents the data in visualizations that provide actionable insights into one’s environment.

Custom Debian package creation method

This blog explains how to create a custom package on Debian. We will see how to configure a custom Debian package and create a custom command.

Debian package creation in Ubuntu/Debian:

We will create a helloworld package, first without any command and then with a command, e.g. ipaddr (to get the external IP).

Create a debian package:

Let’s start by creating an empty directory, helloworld. This folder will contain both the source code and the Debian build instructions.

1.1 Create the debian/ Directory

Here we will use dh_make command to create the debian/ Directory:

$ mkdir -p Debian/helloworld-1.0
$ cd Debian/helloworld-1.0

Create the debian/ folder with example files (.ex):

$ dh_make \
--native \
--single \
--packagename helloworld_1.0.0 \
--email mukesh.waghadhare@talentica.com

This created the debian/ folder. Explore them! Especially the example files (*.ex) as well as most importantly the files:

  • debian/control
  • debian/changelog
  • debian/rules

1.2 Build the first (empty) Package

The dpkg-buildpackage command can be used to build the first package:

$ dpkg-buildpackage -uc -us

(-us, --unsigned-source: Do not sign the source package (long option since dpkg 1.18.8).

-uc, --unsigned-changes: Do not sign the .changes file (long option since dpkg 1.18.8).)

That’s it! This command builds the following files:

root@ip-172-31-19-221:/final_dpkg# ls -lrth

-rw-r--r-- 1 root root 7.2K Jun 20 10:47 helloworld_1.0.0.tar.xz

-rw-r--r-- 1 root root 573 Jun 20 10:47 helloworld_1.0.0.dsc

-rw-r--r-- 1 root root 2.0K Jun 20 10:47 helloworld_1.0.0_amd64.deb

-rw-r--r-- 1 root root 4.7K Jun 20 10:47 helloworld_1.0.0_amd64.buildinfo

-rw-r--r-- 1 root root 1.5K Jun 20 10:47 helloworld_1.0.0_amd64.changes

  • .tar.xz: Source package, contains the contents of the helloworld/ folder
  • .deb: Debian package, contains the installable package
  • .buildinfo: Information about the build environment
  • .dsc/.changes: Signature files, cryptographic signatures of all files

Let’s examine the contents with the dpkg -c command:

root@ip-172-31-19-221:/final_dpkg# dpkg -c helloworld_1.0.0_amd64.deb
drwxr-xr-x root/root 0 2018-06-20 09:03 ./
drwxr-xr-x root/root 0 2018-06-20 09:03 ./usr/
drwxr-xr-x root/root 0 2018-06-20 09:03 ./usr/share/
drwxr-xr-x root/root 0 2018-06-20 09:03 ./usr/share/doc/
drwxr-xr-x root/root 0 2018-06-20 09:03 ./usr/share/doc/helloworld/
-rw-r--r-- root/root 187 2018-06-20 09:03 ./usr/share/doc/helloworld/README.Debian
-rw-r--r-- root/root 144 2018-06-20 09:03 ./usr/share/doc/helloworld/changelog.gz
-rw-r--r-- root/root 1413 2018-06-20 09:03 ./usr/share/doc/helloworld/copyright

Nothing really in the package except the default changelog, copyright and README file.

1.3 Install the (empty) package

Let’s install it with dpkg -i command:

root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# dpkg -i ../helloworld_1.0.0_amd64.deb
(Reading database … 49055 files and directories currently installed.)
Preparing to unpack ../helloworld_1.0.0_amd64.deb …
Unpacking helloworld (1.0.0) over (1.0.0) …
Setting up helloworld (1.0.0) …

Done. That installed the package. Check that it was actually installed by listing the installed packages with dpkg -l:

# Proof that it’s actually installed

$ dpkg -l | grep helloworld

ii  helloworld       1.0.0        amd64  <insert up to 60 chars description>

The columns are: desired package state and actual package state (together the “ii”), package name, package version, package architecture, and a short package description. The first character is the desired state (i = install, r = remove, p = purge, …); the second is the actual/current state (n = not installed, i = installed, …). If all is well, the first column will contain ii, which means that the package is properly installed.
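Since each data row is whitespace-separated with the description last, pulling it apart is straightforward; a small sketch (the sample row mirrors the example above):

```python
# One `dpkg -l` data row: state, name, version, architecture, description
row = "ii  helloworld  1.0.0  amd64  Network management tools"
state, name, version, arch, description = row.split(maxsplit=4)

desired, actual = state[0], state[1]  # 'i' = install desired, 'i' = installed
print(desired, actual, name, version)
```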

Now that the package is installed, you can list its contents with dpkg -L:

root@ip-172-31-19-221:/final_dpkg# dpkg -L helloworld

2.0 Adding files and updating the changelog

Let’s now add actual files to the empty package. To do that, let’s create a folder files/usr/bin and add a script that checks the IP address in an ipaddr file.

$ mkdir -p files/usr/bin

$ touch files/usr/bin/ipaddr

$ chmod +x files/usr/bin/ipaddr

$ vi files/usr/bin/ipaddr

# Script contents see below

For this demo, we’ll use a short script to grab the public IP address from a service called ipify. It provides an API to return the IP address in various formats. We’ll grab it with curl in JSON and then use jq to parse out the ‘ip’ field:


#!/bin/bash
curl --silent 'https://api.ipify.org?format=json' | jq .ip --raw-output

If we were to rebuild the package now with the dpkg-buildpackage command, it wouldn’t know which files to include in the package. So we’ll create the debian/install file to list the directories to include (e.g. vi debian/install):

files/usr/* usr

This basically means that everything in the files/usr/ folder will be installed at /usr/ on the target file system when the package is installed.

Once this is done, we can just rebuild and reinstall the package, but let’s go one step further and bump the package version. Versions are handled by the debian/changelog file. You can update it manually, or use the dch script (short for “Debian changelog”) to do so:

# We changed the package, so let’s update the version and changelog for it to 1.0.1

$ dch -im

# Opens the editor …

root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# cat debian/changelog

helloworld (1.0.1) unstable; urgency=medium

* Added ‘ipaddr’ script

— root <mukesh.waghadhare@talentica.com> Thu, 21 Jun 2018 07:19:02 +0000

helloworld (1.0.0) unstable; urgency=medium

* Initial Release.

— root <mukesh.waghadhare@talentica.com> Wed, 20 Jun 2018 09:03:21 +0000

Let’s rebuild and reinstall the package:

root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# dpkg-buildpackage -uc -us


Let’s check whether the ipaddr script is where it’s supposed to be, and then try to run it:

root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# dpkg -l | grep helloworld
ii helloworld 1.0.1 amd64 <insert up to 60 chars description>
root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# which ipaddr
root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# ipaddr
/usr/bin/ipaddr: line 2: jq: command not found
(23) Failed writing body

2.1 Updating description and adding dependencies

Each Debian package can depend on other packages. In the case of our ipaddr script, we use the curl and jq commands, so the ‘helloworld’ package depends on them.

To add the dependencies to the ‘helloworld’ package, edit the Depends: section in the debian/control file (e.g. via vi debian/control):

root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# vim debian/control
Source: helloworld
Section: utils
Priority: optional
Maintainer: mukeshw <mukesh.waghadhare@talentica.com>
Build-Depends: debhelper (>= 9)
Standards-Version: 3.9.8
Homepage: <insert the upstream URL, if relevant>Package: helloworld
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}, curl, jq
Description: Network management tools
Includes various tools for network management.

Again update the version and rebuild the package

Change the version:

root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# dch -im

root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# cat debian/changelog
helloworld (1.0.3) stable; urgency=medium

* Updated description and dependencies

— root <mukesh.waghadhare@talentica.com> Thu, 21 Jun 2018 07:19:48 +0000

helloworld (1.0.1) unstable; urgency=medium

* Added ‘ipaddr’ script

— root <mukesh.waghadhare@talentica.com> Thu, 21 Jun 2018 07:19:02 +0000

helloworld (1.0.0) unstable; urgency=medium

* Initial Release.

— root <mukesh.waghadhare@talentica.com> Wed, 20 Jun 2018 09:03:21 +0000


root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# dpkg-buildpackage -uc -us


root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# dpkg -i ../helloworld_1.0.3_amd64.deb
(Reading database … 49056 files and directories currently installed.)
Preparing to unpack ../helloworld_1.0.3_amd64.deb …
Unpacking helloworld (1.0.3) over (1.0.1) …
dpkg: dependency problems prevent configuration of helloworld:
helloworld depends on jq; however:
Package jq is not installed.

dpkg: error processing package helloworld (–install):
dependency problems – leaving unconfigured
Errors were encountered while processing:
 helloworld
Unlike apt-get install, the dpkg -i command does not automatically resolve and install missing dependencies; it just highlights them. That is perfectly normal and expected. In fact, it gives us the perfect opportunity to check the package state (like we did above):

root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# dpkg -l | grep helloworld
iU  helloworld 1.0.3 amd64 Network management tools

Note the new description at the end of the row, and the state column: desired state ‘i’ (install) but actual state ‘U’ (Unpacked).

As you can see from the output, the desired state for the package is ‘i’ (= installed), but the actual state is ‘U’ (= Unpacked). That’s not good. Luckily, dependencies can be automatically resolved by apt-get install -f:

root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# apt-get install -f
Reading package lists… Done
Building dependency tree
Reading state information… Done
Correcting dependencies… Done
The following additional packages will be installed:
jq libjq1 libonig4
The following NEW packages will be installed:
jq libjq1 libonig4
0 upgraded, 3 newly installed, 0 to remove and 28 not upgraded.
1 not fully installed or removed.
Need to get 327 kB of archives.
After this operation, 1,157 kB of additional disk space will be used.
Do you want to continue? [Y/n] y

Processing triggers for man-db ( …

Setting up jq (1.5+dfsg-1.3) …

Setting up helloworld (1.0.3) …

Finally, it’s installed correctly. Now, let’s test it…

root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# dpkg -l | grep helloworld
ii helloworld 1.0.3 amd64 Network management tools
root@ip-172-31-19-221:/final_dpkg/helloworld-1.0# ipaddr


This method makes it easy to install multiple packages and their dependencies, which we might require at boot. With just one custom package, a whole cluster of dependencies and customized packages can be installed.

AWS Batch Jobs

What is batch computing?

Batch computing means running jobs asynchronously and automatically, across one or more computers.

What is AWS Batch Job?

AWS Batch enables developers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as Amazon EC2 and Spot Instances.

Why use AWS Batch Job ?

  • Fully managed infrastructure – No software to install or servers to manage. AWS Batch provisions, manages, and scales your infrastructure.
  • Integrated with AWS – Natively integrated with the AWS Platform, AWS Batch jobs can easily and securely interact with services such as Amazon S3, DynamoDB, and Rekognition.
  • Cost-optimized Resource Provisioning – AWS Batch automatically provisions compute resources tailored to the needs of your jobs using Amazon EC2 and EC2 Spot.

AWS Batch Concepts

  • Jobs
  • Job Definitions
  • Job Queue
  • Compute Environments


Jobs are the unit of work executed by AWS Batch as containerized applications running on Amazon EC2. Containerized jobs can reference a container image, command, and parameters or users can simply provide a .zip containing their application and AWS will run it on a default Amazon Linux container.

$ aws batch submit-job --job-name poller --job-definition poller-def --job-queue poller-queue

Job Dependencies

Jobs can express a dependency on the successful completion of other jobs or specific elements of an array job.

Use your preferred workflow engine and language to submit jobs. Flow-based systems simply submit jobs serially, while DAG-based systems submit many jobs at once, identifying inter-job dependencies.

Jobs run in approximately the same order in which they are submitted as long as all dependencies on other jobs have been met.

$ aws batch submit-job --depends-on 606b3ad1-aa31-48d8-92ec-f154bfc8215f …
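To see what “identifying inter-job dependencies” means in practice, here is a toy Python sketch that orders job submissions so each job is submitted only after its dependencies (the job names and dependency map are invented; a real workflow engine would call submit-job with the appropriate --depends-on at each step):

```python
def submission_order(deps):
    """Topologically sort jobs so each job appears after all of its dependencies."""
    order, seen = [], set()

    def visit(job):
        if job in seen:
            return
        seen.add(job)
        for d in deps.get(job, []):
            visit(d)  # submit dependencies first
        order.append(job)

    for job in deps:
        visit(job)
    return order

# Invented pipeline: 'report' needs 'process', which needs 'poll'
deps = {"poll": [], "process": ["poll"], "report": ["process"]}
print(submission_order(deps))
```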

Job Definitions

Similar to ECS Task Definitions, AWS Batch Job Definitions specify how jobs are to be run. While each job must reference a job definition, many parameters can be overridden.

Some of the attributes specified in a job definition are:

  • IAM role associated with the job
  • vCPU and memory requirements
  • Mount points
  • Container properties
  • Environment variables
$ aws batch register-job-definition --job-definition-name gatk --container-properties …

Job Queues

Jobs are submitted to a Job Queue, where they reside until they are able to be scheduled to a compute resource. Information related to completed jobs persists in the queue for 24 hours.

$ aws batch create-job-queue --job-queue-name genomics --priority 500 --compute-environment-order …


Compute Environments

Job queues are mapped to one or more Compute Environments containing the EC2 instances that are used to run containerized batch jobs.

Managed (recommended) compute environments let you describe your business requirements (instance types, min/max/desired vCPUs, and EC2 Spot bid as x% of On-Demand) and AWS launches and scales resources on your behalf.

We can choose specific instance types (e.g. c4.8xlarge), instance families (e.g. C4, M4, R3), or simply choose “optimal” and AWS Batch will launch appropriately sized instances from more modern instance families.

Alternatively, we can launch and manage our own resources within an Unmanaged compute environment. Your instances need to include the ECS agent and run supported versions of Linux and Docker.

$ aws batch create-compute-environment --compute-environment-name unmanagedce --type UNMANAGED …

AWS Batch will then create an Amazon ECS cluster which can accept the instances we launch. Jobs can be scheduled to your Compute Environment as soon as the instances are healthy and register with the ECS Agent.

Job States

Jobs submitted to a queue can have the following states:

  • SUBMITTED: Accepted into the queue, but not yet evaluated for execution
  • PENDING: The job has dependencies on other jobs which have not yet completed
  • RUNNABLE: The job has been evaluated by the scheduler and is ready to run
  • STARTING: The job is in the process of being scheduled to a compute resource
  • RUNNING: The job is currently running
  • SUCCEEDED: The job has finished with exit code 0
  • FAILED: The job finished with a non-zero exit code or was cancelled or terminated.
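The happy-path lifecycle above can be written down as a small transition table. The sketch below is illustrative only — it ignores cancellation and termination, which can move a job to FAILED from several states:

```python
# Illustrative happy-path transitions between AWS Batch job states
TRANSITIONS = {
    "SUBMITTED": {"PENDING", "RUNNABLE"},  # PENDING only if the job has dependencies
    "PENDING":   {"RUNNABLE"},
    "RUNNABLE":  {"STARTING"},
    "STARTING":  {"RUNNING"},
    "RUNNING":   {"SUCCEEDED", "FAILED"},
}

def is_valid_path(states):
    """Check that each consecutive pair is an allowed happy-path transition."""
    return all(b in TRANSITIONS.get(a, set()) for a, b in zip(states, states[1:]))

print(is_valid_path(["SUBMITTED", "RUNNABLE", "STARTING", "RUNNING", "SUCCEEDED"]))
```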

AWS Batch Actions

  • Jobs: SubmitJob, ListJobs, DescribeJobs, CancelJob, TerminateJob
  • Job Definitions: RegisterJobDefinition, DescribeJobDefinitions, DeregisterJobDefinition
  • Job Queues: CreateJobQueue, DescribeJobQueues, UpdateJobQueue, DeleteJobQueue
  • Compute Environments: CreateComputeEnvironment, DescribeComputeEnvironments, UpdateComputeEnvironment, DeleteComputeEnvironment

AWS Batch Pricing

There is no charge for AWS Batch. We only pay for the underlying resources we have consumed.

Use Case

Poller and Processor Service


The poller service needs to run every hour, like a cron job. It submits one or more requests to a processor service, which has to launch the required number of EC2 resources, process files in parallel, and terminate them when done.


We plan to go with a serverless architecture approach instead of a traditional Beanstalk/EC2 deployment, as we don’t want to maintain an EC2 server instance running 24/7.

This approach reduces our AWS bill because an EC2 instance is launched only when a job is submitted to AWS Batch and is terminated when the job execution completes.

Poller Service Architecture Diagram

Processor Service Architecture Diagram

First time release

For Poller and Processor Service:

  • Create Compute environment
  • Create Job queue
  • Create Job definition

To automate the resource creation steps above, we use batchbeagle (for installation and configuration, please refer to the batch-deployment repository).

Command to create/update the Batch job resources of a stack (creates all job definitions, job queues, and compute environments):

beagle -f stack/stackname/servicename.yml assemble

To start Poller service:

  • Enable a scheduler using an AWS CloudWatch Events rule to trigger the poller service batch job.
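CloudWatch Events supports Batch job queues directly as rule targets via `BatchParameters`. The sketch below combines the PutRule and PutTargets inputs into one JSON document to show the shape; every name, ARN, and role here is a placeholder:

```json
{
  "Name": "poller-hourly",
  "ScheduleExpression": "rate(1 hour)",
  "State": "ENABLED",
  "Targets": [
    {
      "Id": "poller",
      "Arn": "arn:aws:batch:us-east-1:123456789012:job-queue/poller-queue",
      "RoleArn": "arn:aws:iam::123456789012:role/events-batch-role",
      "BatchParameters": {
        "JobDefinition": "poller-jobdef",
        "JobName": "poller-run"
      }
    }
  ]
}
```

When the rule fires, CloudWatch Events submits a job with the named job definition to the job queue, which is exactly the hourly cron-like trigger the poller service needs.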

Incremental release

We must create a new revision of the existing job definition that points to the ECR image tagged with the new release version to be deployed.

Command to deploy a new release version of the Docker image to the Batch job (creates a new revision of an existing job definition):


beagle -f stack/stackname/servicename.yml job update job-definition-name


Cloudwatch Events

We will use the AWS Batch event stream for CloudWatch Events to receive near real-time notifications about the current state of jobs submitted to our job queues.

AWS Batch tracks the state of our jobs and sends job status change events to CloudWatch Events: whenever a previously submitted job’s status changes, an event is triggered, for example when a job moves from the RUNNING status to the FAILED status.

We will configure an Amazon SNS topic as the event target. The topic notifies a Lambda function, which filters the relevant content out of the SNS message (JSON), formats it, and sends it to the Slack channel of the respective environment.

CloudWatch Event Rule → SNS Topic → Lambda Function → Slack Channel

Batch Job Status Notification in Slack

Slack notification provides the following details:

  • Job name
  • Job Status
  • Job ID
  • Job Queue Name
  • Log Stream Name
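A minimal sketch of the Lambda formatter is below. The event shape follows AWS Batch’s “Batch Job State Change” events as delivered through SNS; the actual Slack webhook call is omitted and represented by a hypothetical helper:

```python
import json

def format_batch_event(sns_message: str) -> str:
    """Turn an SNS-delivered Batch job state change event into Slack text."""
    event = json.loads(sns_message)
    detail = event["detail"]
    lines = [
        f"Job name: {detail['jobName']}",
        f"Job Status: {detail['status']}",
        f"Job ID: {detail['jobId']}",
        # jobQueue is an ARN; keep only the queue name after the last '/'.
        f"Job Queue Name: {detail['jobQueue'].split('/')[-1]}",
        f"Log Stream Name: {detail['container'].get('logStreamName', 'n/a')}",
    ]
    return "\n".join(lines)

def handler(event, context):
    """Lambda entry point for SNS-triggered invocations."""
    text = format_batch_event(event["Records"][0]["Sns"]["Message"])
    # post_to_slack(text)  # hypothetical helper calling the Slack webhook
    return text
```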