Defence against Rising Bot Attacks

Bots are software programs that perform automated tasks over the Internet. They can be used for productive work, but they are also frequently used for malicious activities. Broadly, they are categorised as good bots and bad bots.

Good bots are used for positive purposes, such as chatbots that resolve customer queries and web crawlers that index websites for search engines. A plain text file named robots.txt can be placed in the root of a site, and rules can be configured in this file to allow or deny access to different site URLs. This way good bots can be controlled and allowed to access only certain site resources.

Then come the bad bots: malicious programs that perform certain activities in the background on a victim's machine without the user's knowledge. Such activities include accessing certain websites without the user's knowledge or stealing the user's confidential information. Bad bots are also spread across the Internet to perform DDoS (Distributed Denial of Service) attacks on target websites. The following techniques can be used to deter malicious bots from accessing resource-intensive APIs of web applications.

  1. Canvas Fingerprint:

Canvas fingerprinting works on the HTML5 canvas element. A small image of 1 x 1 pixel is drawn on the canvas element. Each device generates a slightly different hash of this image depending on the browser, operating system and installed graphics card. This technique is not sufficient on its own to uniquely identify users, because there will be groups of users sharing the same configuration and device. However, it has been observed that when a bot scans through web pages it tries to access every link present on the landing page, so the time it takes to click a link on that page will always be similar. These two signals, the canvas fingerprint and the time to click, can be combined to decide whether to grant access to the application or to ask the user to prove they are genuine by showing a captcha. The user is given access to the site after successfully validating the captcha. So if requests keep coming from the same device (same fingerprint) with the same time-to-click interval, the request falls into the bot category and has to be validated by showing a captcha to the user. This improves the user experience, as the captcha is not shown to all users but only for suspected requests.
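To make the idea concrete, here is a minimal browser-side sketch (not production code): it draws a tiny canvas, derives a hash-like token from the rendered image, and assumes the token and the link-click timestamp are sent to the server with each request.

function canvasFingerprint() {
  var canvas = document.createElement('canvas');
  canvas.width = 1;
  canvas.height = 1;
  var ctx = canvas.getContext('2d');
  ctx.fillStyle = 'rgb(128, 64, 32)';
  ctx.fillRect(0, 0, 1, 1);
  var data = canvas.toDataURL(); // rendering differs slightly per browser/OS/graphics card

  // toy 32-bit hash of the data URL; a real implementation would use a proper hash
  var hash = 0;
  for (var i = 0; i < data.length; i++) {
    hash = ((hash << 5) - hash + data.charCodeAt(i)) | 0;
  }
  return hash.toString(16);
}

// send the fingerprint plus the click timestamp with each request, so the server
// can pair "same fingerprint" with "same time-to-click" and decide on a captcha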

  2. Honey Trap:

The honey trap works by placing a few hidden links on the landing pages along with the actual links. Since bots try every link on the page when sniffing the landing page, they get trapped in the hidden links, which point to 404 pages. This data can be collected and used to identify the source of the requests, and later those sources can be blocked from accessing the website. A genuine user will only see the actual links and will be able to access the site.

  3. Blacklist IP address:

This is not the best solution on its own, because bots are smart enough to change their IP address with each request. However, it helps reduce some of the bot traffic by providing one layer of protection.

  4. Blacklist X-Requested-With:

There are malicious apps that click certain links on the user's device in the background without the user knowing about it. Those requests carry an X-Requested-With header that contains the app package name. The web application can be configured to block all requests that contain a fraudulent X-Requested-With value.

  5. Blacklist User-Agent:

Many third-party service providers maintain lists of User-Agents that bots use. The website can be configured to block all requests coming from these blacklisted user agents. As with IP addresses, this field can also be changed by bot owners with each request, hence it does not provide foolproof protection.

Conclusion

Each defence suits different cases. There are advertising partners that try to convert the same user again and again just to increase their share. For such cases, the canvas fingerprint is the most suitable solution: it identifies requests coming from the same device several times within a time interval, marks them as bot traffic and asks the user to validate before proceeding further.

A honey trap is an ideal defence when many advertising partners are working to bring in traffic and we need to report which partners supply the most bot traffic. From this report the bot sources can be identified, and requests coming from those sources can later be blocked. The other defences work on blacklisting, e.g. IP address, User-Agent and X-Requested-With. These are part of the request headers and can easily be changed by a bot from time to time. When using these blacklisting defences, the application owner needs to make sure they are using the most up-to-date list of fraud-causing agents. Given the pace at which new frauds appear daily, keeping that list current will be a challenge.

So blacklisting can be used as the first line of defence to filter the most notorious bots, and the canvas fingerprint can then filter out the more advanced ones.

Building a basic REST API using Django Rest Framework

An API (Application Programming Interface) is a piece of software that allows two applications to talk to each other.

In this tutorial, we will explore different ways to create a Django Rest Framework (DRF) API. We will build a Django REST application with Django 2.x that allows users to create, edit, and delete employee records through the API.

Why DRF:

Django REST framework is a powerful and flexible toolkit for building Web APIs.

Some reasons you might want to use REST framework:

  • The Web browsable API is a huge usability win for your developers.
  • Authentication policies including packages for OAuth1a and OAuth2.
  • Serialization that supports both ORM and non-ORM data sources.
  • Customizable all the way down – just use regular function-based views if you don’t need the more powerful features.
  • Extensive documentation, and great community support.
  • Used and trusted by internationally recognized companies including Mozilla, Red Hat, Heroku, and Eventbrite.

Traditionally, Django is known to many developers as an MVC Web Framework, but it can also be used to build a backend, which in this case is an API. We shall see how you can build a backend with it.

Let’s get started

In this blog, you will be building a simple API for a simple employee management service.

Setup your Dev environment:

Please install python 3. I am using python 3.7.3 here

You can check your python version using the command

$ python -V

Python 3.7.3

After installing python, you can go ahead and create a working directory for your API and then set up a virtual environment.

You can set up virtual env. by below command

$ pip install virtualenv

Create directory employee-management and use that directory

$ mkdir employee-management  && cd employee-management
# creates virtual environment named drf_api
 employee-management $ virtualenv --python=python3 drf_api
# activate the virtual environment named drf_api
 employee-management e$ source drf_api/bin/activate

 

This will activate the virtual env that you have just created.

Let’s install Django and djangorestframework in your virtual env.

I will be installing Django 2.2.3 and djangorestframework 3.9.4

(drf_api) employee-management $ pip install Django==2.2.3
(drf_api) employee-management $ pip install djangorestframework==3.9.4

Start Project:

After setting up your dev environment, let's start a Django project. I am creating a project named api.

 

(drf_api) employee-management $ django-admin.py startproject api

(drf_api) employee-management $ cd api

 

Now create a Django app. I am creating employees app

 

(drf_api) employee-management $ django-admin.py startapp employees

 

Now you will have a directory structure like this:

 

api/
    manage.py
    api/
        __init__.py
        settings.py
        urls.py
        wsgi.py
    employees/
        migrations/
            __init__.py
        __init__.py
        admin.py
        apps.py
        models.py
        tests.py
        views.py
    drf_api/

 

The app and project are now created. We will now sync the database.  By default, Django uses sqlite3 as a database.

If you open api/settings.py you will notice this:

 

DATABASES = {
     'default': {
         'ENGINE': 'django.db.backends.sqlite3',
         'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
     }
 }

 

You can change the DB engine as per your need. E.g PostgreSQL etc.

We will create an initial admin user and set a password for the user.

 

(drf_api) employee-management $ python manage.py migrate

(drf_api) employee-management $ python manage.py createsuperuser --email superhuman@blabla.com --username admin

 

Let’s register our apps with the project: open the api/settings.py file and add the rest_framework and employees apps to INSTALLED_APPS.

 

INSTALLED_APPS = [
    ...
    'rest_framework',
    'employees',
]


Open the api/urls.py file and add URLs for the 'employees' app:

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('employees.urls')),
]

 

This makes your basic setup ready and now you can start adding code to your employees’ service API.

TDD – Test Driven Development

Before we write the business logic of our API, we will need to write a test. So this is what we are doing: write a unit test for a view, then update the code so that the test passes.

Let’s Write a test for the GET employees/ endpoint

Let’s create a test for the endpoint that returns all employees: GET employees/.

Open the employees/tests.py file and add the following lines of code;
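The original gist is not available here, so below is a minimal sketch of what employees/tests.py could look like; the Employee fields and the employee-list route name are assumptions that match the model and router sketched later in this post.

from django.urls import reverse
from rest_framework import status
from rest_framework.test import APITestCase

from .models import Employee


class GetAllEmployeesTest(APITestCase):
    def setUp(self):
        # create a couple of employees to list
        Employee.objects.create(name='Asha', designation='Engineer')
        Employee.objects.create(name='Ravi', designation='Manager')

    def test_get_all_employees(self):
        # hit GET /api/v1/employees/ and compare with what is in the database
        response = self.client.get(reverse('employee-list'))
        self.assertEqual(response.status_code, status.HTTP_200_OK)
        self.assertEqual(len(response.data), Employee.objects.count())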


Do not try to run this code yet. We have not added the view or model code yet. Let’s add the view now.

Add the View for GET employees/ endpoint

Now we will add the code for the view that will respond to the request GET employees/.

Model: First, add a model that will store the data about the employees that will be returned in the response. Open the employees/models.py file and add the following lines of code.
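A minimal sketch of the model (the field names are assumptions; adjust them to whatever employee data you need):

from django.db import models


class Employee(models.Model):
    name = models.CharField(max_length=255)
    designation = models.CharField(max_length=255)

    def __str__(self):
        return self.name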

 

 

We will also register our model with the admin. This lets us manage employees from the admin UI, e.g. add or remove an employee. Let's add the following lines of code to the employees/admin.py file.
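Registering the model is a one-liner; a sketch of employees/admin.py:

from django.contrib import admin

from .models import Employee

admin.site.register(Employee)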

 

 

Now run makemigrations from the command line.

(drf_api) employee-management $ python manage.py makemigrations

Now run the migrate command. This will create the employees table in your DB.

(drf_api) employee-management $ python manage.py migrate

 

Serializer: Add a serializer. Serializers allow complex data such as query sets and model instances to be converted to native Python datatypes that can then be easily rendered into JSON, XML or other content types.

Add a new file employees/serializers.py and add the following lines of code;
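A minimal sketch of employees/serializers.py, assuming the Employee model above:

from rest_framework import serializers

from .models import Employee


class EmployeeSerializer(serializers.ModelSerializer):
    class Meta:
        model = Employee
        fields = ('id', 'name', 'designation')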

 

 

Serializers also provide deserialization, allowing parsed data to be converted back into complex types, after first validating the incoming data. The serializers in REST framework work very similarly to Django’s Form and ModelForm classes.

 

View: Finally, add a view that returns all employees. Open the employees/views.py file and add the following lines of code.
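A minimal sketch of employees/views.py using a DRF generic viewset (ModelViewSet here):

from rest_framework import viewsets

from .models import Employee
from .serializers import EmployeeSerializer


class EmployeeViewSet(viewsets.ModelViewSet):
    queryset = Employee.objects.all()
    serializer_class = EmployeeSerializer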

 

 

Here we have specified how to get the objects from the database by setting the queryset attribute of the class, and we have specified the serializer that will be used for serializing and deserializing the data.

The view in this code inherits from a DRF generic viewset (ModelViewSet in the sketch above).

Connect the views

Before you can run the tests, you will have to link the views by configuring the URLs.

Open the api/urls.py file and add the following lines of code;
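This is the same wiring we set up earlier; for reference, api/urls.py should look like:

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', include('employees.urls')),
]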

 

 

Now go to employees/urls.py and add below code;
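A sketch of employees/urls.py using DRF's DefaultRouter; the api/v1 prefix is an assumption chosen so that the endpoint matches the URL used later in this post.

from django.urls import path, include
from rest_framework import routers

from .views import EmployeeViewSet

router = routers.DefaultRouter()
router.register(r'api/v1/employees', EmployeeViewSet)

urlpatterns = [
    path('', include(router.urls)),
]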

 

Let’s run the test!

First, let’s run automated tests. Run the command;

 (drf_api) employee-management $ python manage.py test

 

The output in your shell should report the tests passing.

 

How to test this endpoint manually?

From your command line run below command

 

(drf_api) employee-management $  nohup python manage.py runserver & disown

 

Now open http://127.0.0.1:8000/admin/ in your browser. You will be prompted for a username and password; enter the admin username and password we created in the createsuperuser step.

Once you log in, you will land on the Django admin dashboard.

 

 

Let’s add a few employees using the Add button in the admin.

 

Once you have added the employees, let’s test our employees API by hitting the URL below:

 

http://127.0.0.1:8000/api/v1/employees/

 

If you can see the list of employees in the response, your API works.

Congrats! Your first API using DRF is live.

Scala code analysis and coverage report on Sonarqube using SBT

Introduction

This blog is all about configuring the scoverage plugin with SonarQube for tracking statement coverage as well as static code analysis for a Scala project. SonarQube supports many languages, but it doesn’t have built-in support for Scala, so this blog will guide you through configuring the sonar-scala and scoverage plugins to generate code analysis and code coverage reports.

The scoverage plugin for SonarQube reads the coverage reports generated by sbt coverage test and displays those in sonar dashboard.

Here are the steps to configure Scala projects with SonarQube for code coverage as well as static code analysis.

  1. Install SonarQube and start the server.
  2. Go to the SonarQube marketplace and install the `SonarScala` plugin.

This plugin provides a static code analyzer for the Scala language. It supports all the standard metrics implemented by SonarQube, including cognitive complexity.

  3. Add the `Scoverage` plugin to SonarQube from the marketplace

This plugin provides the ability to import statement coverage generated by Scoverage for Scala projects. It reads the XML report generated by Scoverage and populates several metrics in Sonar.

Requirements:

i.  SonarQube 5.1

ii. Scoverage 1.1.0

4. Now add the `sbt-sonar` plugin dependency to your Scala project by adding the following line to project/plugins.sbt:

addSbtPlugin("com.github.mwz" % "sbt-sonar" % "1.6.0")

This sbt plugin can be used to run sonar-scanner launcher to analyze a Scala project with SonarQube.

Requirements:

i.  sbt 0.13.5+

ii. Scala 2.11/2.12

iii. SonarQube server.

iv. sonar-scanner (See point#5 for installation)

5. Configure `sonar-scanner` executable
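One possible setup (the download URL, version and install path below are illustrative): download the sonar-scanner CLI and make it available on the PATH so the sbt plugin can find it.

# download and unpack the sonar-scanner CLI, then expose it on the PATH
wget https://binaries.sonarsource.com/Distribution/sonar-scanner-cli/sonar-scanner-cli-3.3.0.1492-linux.zip
unzip sonar-scanner-cli-3.3.0.1492-linux.zip -d /opt
export SONAR_SCANNER_HOME=/opt/sonar-scanner-3.3.0.1492-linux
export PATH=$PATH:$SONAR_SCANNER_HOME/bin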

 

6. Now, configure the sonar properties in your project. This can be done in two ways:

  • Use sonar-project.properties file:

This file has to be placed in your root directory. To use an external config file you can set the sonarUseExternalConfig to true.

import sbtsonar.SonarPlugin.autoImport.sonarUseExternalConfig

sonarUseExternalConfig := true

  • Configure sonar properties in the build file:

By default, the plugin expects the properties to be defined in the sonarProperties setting key in sbt.

import sbtsonar.SonarPlugin.autoImport.sonarProperties

sonarProperties ++= Map(
  "sonar.sources" -> "src/main/scala",
  "sonar.tests" -> "src/test/scala",
  "sonar.modules" -> "module1,module2"
)
  7. Now run the below commands to publish code analysis and code coverage reports to your SonarQube server.
  • sbt coverage test
  • sbt coverageReport
  • sbt sonarScan

 

SonarQube integration is really useful to perform an automatic review of code to detect bugs, code smells and security vulnerabilities.  SonarQube can also track history and provide the visual representation of it.

Introduction to Akka Streams

Why Streams?

In software development, there are cases where we need to handle potentially large amounts of data. While handling these kinds of scenarios we can run into issues such as `out of memory` errors, so we should divide the data into chunks and handle each chunk independently.

Akka Streams comes to the rescue here, letting us do this in a more predictable and less chaotic manner.

Introduction

Akka Streams consists of three major components – Source, Flow and Sink – and any non-cyclical stream consists of at least two components (a Source and a Sink) plus any number of Flow elements. We can think of Source and Sink as special cases of Flow.

  • Source – this is the Source of data. It has exactly one output. We can think of Source as Publisher.
  • Sink – this is the Receiver of data. It has exactly one input. We can think of Sink as Receiver.
  • Flow – this is the Transformation that acts on the Source. It has exactly one input and one output.

Here the Flow sits in between the Source and Sink, as it is the transformation applied to the Source data.

 

 

A very good thing is that we can combine these elements to obtain another one e.g combine Source and Flow to obtain another Source.

Akka streams are called reactive streams because of its backpressure handling capabilities.

What are Reactive Streams?

Applications developed using streams can run into problems if the Source generates data faster than the Sink can handle. This causes the Sink to buffer the data – but if the data is too large, the Sink's buffer will also grow and can lead to memory issues.

To handle this, the Sink needs to communicate with the Source to slow down the generation of data until it has finished handling the current data. This communication between Publisher and Receiver is called backpressure handling, and streams that implement this mechanism are called Reactive Streams.

Example using Akka Stream:

In this example, let’s try to find out prime numbers between 1 to 10000 using Akka stream. Akka stream version used is 2.5.11.

 

package example.akka

import akka.{Done, NotUsed}
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl._

import scala.concurrent.Future

object AkkaStreamExample {

  def isPrime(i: Int): Boolean = {
    if (i <= 1) false
    else if (i == 2) true
    else !(2 until i).exists(x => i % x == 0)
  }

  def main(args: Array[String]): Unit = {
    implicit val system = ActorSystem("actor-system")
    implicit val materializer = ActorMaterializer()

    val numbers = 1 to 10000

    // Source that will iterate over the number sequence
    val numberSource: Source[Int, NotUsed] = Source.fromIterator(() => numbers.iterator)

    // Flow for prime number detection
    val isPrimeFlow: Flow[Int, Int, NotUsed] = Flow[Int].filter(num => isPrime(num))

    // Source from the original Source with the Flow applied
    val primeNumbersSource: Source[Int, NotUsed] = numberSource.via(isPrimeFlow)

    // Sink to print the numbers
    val consoleSink: Sink[Int, Future[Done]] = Sink.foreach[Int](println)

    // Connect the Source with the Sink and run it using the materializer
    primeNumbersSource.runWith(consoleSink)
  }
}

 

Above example illustrated as a diagram:

 

 

  1. `Source` – based on the number iterator

`Source`, as explained already, represents a stream. Source takes two type parameters: the first one is the type of data it emits, and the second one is the type of the auxiliary value it can produce when run/materialized. If we don't produce any, we use the NotUsed type provided by Akka.

The static methods to create Source are

  • fromIterator – it will accept elements until the iterator is empty
  • fromPublisher – uses object that provides publisher functionality
  • fromFuture – new Source from a given future
  • fromGraph – Graph is also a Source.
  2. `Flow` – filters out only prime numbers

Basically, `Flow` is an ordered set of transformations to the provided input. It takes 3 type parameters – input datatype, output datatype & auxiliary datatype.

We can create a Source by combining an existing one with a Flow, as used in the code:

val primeNumbersSource: Source[Int, NotUsed] = numberSource.via(isPrimeFlow)

  3. `Sink` – prints numbers to the console

It is basically the subscriber of the data and the last element of the stream.

The sink is basically a Flow which uses foreach or fold function to run a procedure over its input elements and propagate the auxiliary value.

As with Source and Flow, the companion object provides methods for creating an instance of it. The main methods of doing so are:

  • foreach – run the given function for each received element
  • foreachParallel – same as foreach, except it runs in parallel
  • fold – run the given function for each received element, propagating the resulting value to the next iteration.

The runWith method produces a Future that will be completed when the Source is exhausted and the Sink has finished processing the elements. If processing fails, the Future completes with a Failure.

We can also create a RunnableGraph instance and run it manually using toMat (or viaMat).
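A minimal sketch, reusing the source and sink from the example above: toMat with Keep.right keeps the Sink's materialized Future[Done], and run() starts the stream explicitly.

// keep the Sink's materialized value and run the graph manually
val graph: RunnableGraph[Future[Done]] =
  primeNumbersSource.toMat(consoleSink)(Keep.right)

val done: Future[Done] = graph.run()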

  4. `ActorSystem` and `ActorMaterializer` are needed as Akka Streams uses the Akka actor model.

The `ActorMaterializer` class instance is needed to materialize a Flow into a Processor, which represents a processing stage – a construct from the Reactive Streams standard that Akka Streams implements.

In fact, Akka Streams employs back-pressure as described in the Reactive Streams standard mentioned above. Source, Flow, Sink get eventually transformed into low-level Reactive Streams constructs via the process of materialization.

App Store Connect API To Automate TestFlight Workflow

TestFlight

Most mobile application developers try to automate the build sharing process, as it is one of the most tedious tasks in an app development cycle. However, it has always remained difficult, especially for iOS developers, because of Apple’s code signing requirements. So when iOS developers start thinking about automating build sharing, the first option that comes to mind is TestFlight.

Before the TestFlight acquisition by Apple, it was easy to automate the build sharing process. TestFlight had its own public APIs (http://testflightapp.com/api) to upload and share builds from the command line, and developers used these APIs to write automation scripts. After Apple’s acquisition, TestFlight was made part of App Store Connect and the old APIs were invalidated. Therefore, to upload or share a build, developers had to rely on third-party tools like Fastlane.

App Store Connect API

At WWDC 2018, Apple announced the new App Store Connect API and made it publicly available in November 2018. By using the App Store Connect API, developers can now automate the TestFlight workflow without relying on any third-party tool.

In this short post, we will see a use case example of App Store Connect API for TestFlight.

Authentication

App Store Connect API is a REST API to access data from the Apple server. Use of this API requires authorization via a JSON Web Token (JWT). An API request without this token results in the error “NOT_AUTHORIZED”. Generating the JWT is a somewhat tedious task. We need to follow the steps below to use the App Store Connect API:

  1. Create an API key in the App Store Connect portal
  2. Generate a JWT token using the above API key
  3. Send the JWT token with the API call

Let’s now deep dive into each step.

Creating the API Key

The API key is a public/private key pair. You can download the private key from App Store Connect, and the public key will be stored on the Apple server. To create the private key, follow the steps below:

  1. Login to app store connect portal
  2. Go to ‘Users and Access’ section
  3. Then select ‘Keys’ section

The account holder (Legal role) needs to request access to generate the API key.

Once you get access, you can generate an API key.

There are different access levels for keys like Admin, App Manager, developer etc. Key with the ‘Admin’ access can be used for all App Store Connect API.

Once you generate the API key, you can download it. This key is available to download for a single time only, so make sure to keep it secure once downloaded.

The API key never expires; you can use it as long as it’s valid. In case you lose it or it is compromised, remember to revoke it immediately, because anyone who has this key can access your App Store record.

Generate JWT Token

Now we have the private key required to generate the JWT token. To generate the token, we also need the below-mentioned parameters:

  1. Private key Id: You can find it on the Keys tab (KEY ID).
  2. Issuer Id: Once you generate the private key, you will get an Issuer_ID. It is also available on the top of the Keys tab.
  3. Token Expiry: The generated token can be used within a maximum of 20 minutes. It expires after lapse of the specified time.
  4. Audience: As of now it is “appstoreconnect-v1”
  5. Algorithm: The ES256 JWT algorithm is used to generate a token.

Once all the parameters are in place, we can generate the JWT token. To generate it, there is a Ruby script which is used in the WWDC demo.

require "base64"
require "jwt"
ISSUER_ID = "ISSUER_ID"
KEY_ID = "PRIVATE_KEY_ID"
private_key = OpenSSL::PKey.read(File.read("path_to_private_key/AuthKey_#{KEY_ID}.p8"))
token = JWT.encode(
 {
    iss: ISSUER_ID,
    exp: Time.now.to_i + 20 * 60,
    aud: "appstoreconnect-v1"
 },
 private_key,
 "ES256",
 header_fields={
 kid: KEY_ID }
)
puts token

 

Let’s take a look at the steps to generate a token:

  1. Create a new file with the name jwt.rb and copy the above script in this file.
  2. Replace the Issuer_Id, Key_Id and private key file path values in the script with your actual values.
  3. To run this script, you need to install the jwt ruby gem on your machine. Use the following command to install it: $ sudo gem install jwt
  4. After installing the ruby gem, run the above script by using the command: $ ruby jwt.rb

You will get a token as an output of the above script. You can use this token along with the API call! Please note that the generated token remains valid for 20 minutes. If you want to continue using it after 20 minutes, then don’t forget to create another.

Send JWT token with API call

Now that we have a token, let’s see a few examples of App Store Connect API for TestFlight. There are many APIs available to automate TestFlight workflow. We will see an example of getting information about builds available on App Store Connect. We will also look at an example of submitting a build to review process. This will give you an idea of how to use the App Store Connect API.

Example 1: Get build information:

Below is the API for getting the build information. If you hit this API without the jwt token, it will respond with an error

$ curl https://api.appstoreconnect.apple.com/v1/builds
{
 "errors": [{
 "status": "401",
 "code": "NOT_AUTHORIZED",
 "title": "Authentication credentials are missing or invalid.",
 "detail": "Provide a properly configured and signed bearer token, and make sure that it has not expired. Learn more about Generating Tokens for API Requests https://developer.apple.com/go/?id=api-generating-tokens"
 }]
}

So you need to pass the above-generated JWT token in the request:

$ curl https://api.appstoreconnect.apple.com/v1/builds --header "Authorization: Bearer your_jwt_token"
{
"data": [], // Array of builds available in your app store connect account
"links": {
"self": "https://api.appstoreconnect.apple.com/v1/builds"
},
"meta": {
"paging": {
"total": 2,
"limit": 50
}
}
}

 

Example 2: Submit build for review process:

By using the above build API, you can get an ID for the build. Use this ID to submit a build for the review process. You can send the build information in a request body like:

{
 "data": {
 "type": "betaAppReviewSubmissions",
 "relationships": {
 "build": {
 "data": {
 "type": "builds",
 "id": “your_build_Id"
 }
 }
 }
 }
}

In the above request body, you just need to replace your build ID. So the final request will look like:

$ curl -X POST -H "Content-Type: application/json" --data '{"data":{"type":"betaAppReviewSubmissions","relationships":{"build":{"data":{"type":"builds","id":"your_build_Id"}}}}}' https://api.appstoreconnect.apple.com/v1/betaAppReviewSubmissions --header "Authorization: Bearer your_jwt_token"

That’s it. The above API call will submit the build for the review process. This way you can use any other App Store Connect API like getting a list of beta testers or to manage beta groups.

Conclusion

We have seen the end-to-end flow for the App Store Connect API. By using these APIs you can automate the TestFlight workflow. You can also develop tools to automate the release process without relying on any third-party tool. You can find the documentation for the App Store Connect API here. I hope you’ll find this post useful. Good luck and have fun.

 

 

 

 

 

WebRTC – Basics of web real-time communication

WebRTC is a free open source standard for real-time, plugin-free video, audio and data communication between peers. Many solutions like Skype, Facebook, Google Hangout offer RTC but they need downloads, native apps or plugins. The guiding principles of the WebRTC project are that its APIs should be open source, free, standardized, built into web browsers and more efficient than existing technologies.

How does it work

  • Obtain a Video, Audio or Data stream from the current client.
  • Gather network information and exchange it with peer WebRTC enabled client.
  • Exchange metadata about the data to be transferred.
  • Stream audio, video or data.

That’s it! Well, almost: it’s a dumbed-down version of what actually happens. Now that you have the overall picture, let’s dig into the details.

How it really works

WebRTC provides the implementation of 3 basic APIs to achieve everything.

  • MediaStream: Allowing the client to access a stream from a WebCam or microphone.
  • RTCPeerConnection: Enabling audio or video data transfer, with support for encryption and bandwidth management.
  • RTCDataChannel: Allowing peer-to-peer communication for any generic data.

Along with these capabilities, we will need a server (yes, we still need a server!) to identify the remote peer and to do the initial handshake. Once the peer has been identified we can directly transfer data between the two peers if possible, or relay the information through a server.

Let’s look at each of these steps in detail.

MediaStream

MediaStream has a getUserMedia() method to get access to an audio, video or data stream, with success and failure handlers.

 

navigator.getUserMedia(constraints, successCallback, errorCallback);

 

The constraints parameter is a JSON object which specifies whether audio or video access is required. In addition, we can specify some metadata about the constraints, like video width and height, for example:

 

navigator.getUserMedia({ audio: true, video: true}, successCallback, errorCallback);

 

RTCPeerConnection

This interface represents the connection between the local WebRTC client and a remote peer. It is used to do the efficient transfer of data between the peers. Both peers need to set up an RTCPeerConnection at their end. In general, we use the RTCPeerConnection onaddstream event callback to take care of the audio/video stream. The offer/answer exchange is sketched after the list below.

  • The initiator of the call (the caller) needs to create an offer and send it to the callee, with the help of a signalling server.
  • Callee which receives the offer needs to create an answer and send it back to the caller using the signalling server.
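A simplified sketch of that exchange follows; signalingChannel and remoteOffer are placeholders for whatever app-specific transport (e.g. a WebSocket) and received description you use.

var pc = new RTCPeerConnection({ iceServers: [{ urls: 'stun:stun.l.google.com:19302' }] });

// caller side: create an offer and send it through the signalling server
pc.createOffer()
  .then(function (offer) { return pc.setLocalDescription(offer); })
  .then(function () { signalingChannel.send({ sdp: pc.localDescription }); });

// callee side: on receiving the offer, answer it and send the answer back
pc.setRemoteDescription(remoteOffer)
  .then(function () { return pc.createAnswer(); })
  .then(function (answer) { return pc.setLocalDescription(answer); })
  .then(function () { signalingChannel.send({ sdp: pc.localDescription }); });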
ICE

It is a framework that allows web browsers to connect with peers. There are many reasons why a straight up connection from Peer A to Peer B simply won’t work. Most of the clients won’t have a public IP address as they are usually sitting behind a firewall and a NAT. Given the involvement of NAT, our client has to figure out the IP address of the peer machine. This is where Session Traversal Utilities for NAT (STUN) and Traversal Using Relays around NAT (TURN) servers come into the picture

STUN

A STUN server allows clients to discover their public IP address and the type of NAT they are behind. This information is used to establish a media connection. In most cases, a STUN server is only used during the connection setup and once that session has been established, media will flow directly between clients.

TURN

If a STUN server cannot establish the connection, ICE can switch to TURN. Traversal Using Relays around NAT (TURN) is an extension to STUN that allows media traversal over a NAT that does not allow the peer-to-peer connection required by STUN traffic. TURN servers are often used in the case of a symmetric NAT.

Unlike STUN, a TURN server remains in the media path after the connection has been established. That is why the term “relay” is used to define TURN. A TURN server literally relays the media between the WebRTC peers.

RTCDataChannel

The RTCDataChannel interface represents a bi-directional data channel between two peers of a connection. Objects of this type can be created using

 

RTCPeerConnection.createDataChannel()

 

Data channel capabilities make use of events based communication:

var peerConn= new RTCPeerConnection(),
     dc = peerConn.createDataChannel("my channel");
 
 dc.onmessage = function (event) {
   console.log("received: " + event.data);
 };


Android life cycle aware components

What is a life cycle aware component?

A life cycle aware component is a component which is aware of the life cycle of other components, like an activity or fragment, and performs some action in response to changes in the life cycle status of that component.

Why have life cycle aware components?

Let’s say we are developing a simple video player application, where we have an activity named VideoActivity which contains the UI to play the video, and a class named VideoPlayer which contains all the logic and mechanics to play a video. Our VideoActivity creates an instance of this VideoPlayer class in its onCreate() method:
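The original snippet is not shown here, so below is a minimal Kotlin sketch of that setup; the class names, layout and VideoPlayer methods are illustrative.

import android.os.Bundle
import android.support.v7.app.AppCompatActivity

class VideoActivity : AppCompatActivity() {

    private lateinit var videoPlayer: VideoPlayer

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_video)
        // the activity owns the player instance
        videoPlayer = VideoPlayer()
    }
}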

 

 

Now, as for any video player, we would like it to play the video when VideoActivity is in the foreground (i.e. in the resumed state) and pause the video when it goes into the background (i.e. into the paused state). So we will have the following code in our VideoActivity's onResume() and onPause() methods:
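Something along these lines (play() and pause() are assumed methods of VideoPlayer):

override fun onResume() {
    super.onResume()
    videoPlayer.play()
}

override fun onPause() {
    videoPlayer.pause()
    super.onPause()
}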

 

 

Also, we would like it to stop playing completely and release the resources when the activity gets destroyed. Thus we will have the following code in VideoActivity's onDestroy() method:
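For example (stop() is again an assumed VideoPlayer method that also releases resources):

override fun onDestroy() {
    videoPlayer.stop()
    super.onDestroy()
}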

When we analyse this code we can see that even for this simple application our activity has to take a lot of care about calling the play, pause and stop methods of the VideoPlayer class. Now imagine if we add separate components for audio, buffering etc.; then our VideoActivity has to take care of all these components inside its life cycle callback methods, which leads to poorly organised code that is prone to errors.

 

Using arch.lifecycle 

With the introduction of life cycle aware components in the android.arch.lifecycle library, we can move all this code to the individual components. Our activities or fragments no longer need to deal with this component logic and can focus on their own primary job, i.e. maintaining the UI. Thus, the code becomes clean, maintainable and testable.

The android.arch.lifecycle package provides classes and interfaces that prove helpful to solve such problems in an isolated way.

So let’s dive and see how we can implement the above example using life cycle aware components.

Life cycle aware components way

To keep things simple we can add the lines below to our app gradle file to add the life cycle components from the android.arch library:
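Something like the following (the version is illustrative; newer projects would use the equivalent androidx artifacts):

dependencies {
    implementation "android.arch.lifecycle:extensions:1.1.1"
    // optional: annotation processor for @OnLifecycleEvent (use kapt in Kotlin projects)
    annotationProcessor "android.arch.lifecycle:compiler:1.1.1"
}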

 

 

Once we have integrated the arch components we can make our VideoPlayer class implement LifecycleObserver, which is a marker interface used together with annotations. Using the specific annotations on the VideoPlayer class methods, it will be notified about the life cycle state changes in VideoActivity. So our VideoPlayer class will look like:
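A sketch of the lifecycle-aware VideoPlayer (method bodies are illustrative):

import android.arch.lifecycle.Lifecycle
import android.arch.lifecycle.LifecycleObserver
import android.arch.lifecycle.OnLifecycleEvent

class VideoPlayer : LifecycleObserver {

    @OnLifecycleEvent(Lifecycle.Event.ON_RESUME)
    fun play() {
        // start or resume video playback
    }

    @OnLifecycleEvent(Lifecycle.Event.ON_PAUSE)
    fun pause() {
        // pause playback
    }

    @OnLifecycleEvent(Lifecycle.Event.ON_DESTROY)
    fun stop() {
        // stop playback and release resources
    }
}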

We are not done yet. We need some binding between this VideoPlayer class and the VideoActivity so that our VideoPlayer object gets notified about the life cycle state changes in VideoActivity.

Well, this binding is quite easy. VideoActivity is an instance of android.support.v7.app.AppCompatActivity, which implements the LifecycleOwner interface. LifecycleOwner is a single-method interface containing getLifecycle(), which returns the Lifecycle object corresponding to its implementing class. The Lifecycle object keeps track of the life cycle state changes of the activity/fragment (or any other component having a life cycle); it is observable and notifies its observers about changes in state.

So we have our VideoPlayer, an instance of LifecycleObserver, and we need to add it as an observer to the Lifecycle object of VideoActivity. So we will modify VideoActivity as:
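A sketch of the updated VideoActivity; registering the observer once in onCreate() is all that is needed:

class VideoActivity : AppCompatActivity() {

    private lateinit var videoPlayer: VideoPlayer

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_video)

        videoPlayer = VideoPlayer()
        // VideoActivity is a LifecycleOwner, so its Lifecycle can be observed directly
        lifecycle.addObserver(videoPlayer)
    }
}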

This makes things quite resilient and isolated. Our VideoPlayer logic is separated from VideoActivity, and VideoActivity no longer needs to bother about calling its dependent components' methods to pause or play in its life cycle callbacks, which makes the code clean, manageable and testable.

Conclusion

The beauty of such separation of concern can be also felt when we are developing some library and we intend it to be used as a third party library. It should not be a concern for end users of our library i.e. developers who would be using our library, to call life cycle dependent methods of our library. They might miss it or may not be aware at all which methods to call (because developers don’t usually read the documentation completely) leading to memory leaks or worse app crashes.

Another use case can be when an activity depends on some network call handled by a  network manager class. We can make the network manager class life cycle aware so that it tries to supply the data to activity only when it is alive or better to not keep a reference to activity when it is destroyed. Thus, avoiding memory leaks.

We can develop a well managed app using the life cycle aware components provided by android.arch.lifecycle package. The resulting code will be loosely coupled and thus easy for modifications, testing and debugging which makes our life easy as developers.

Kotlin Kronicles for Android developers — part 1

Another blog on “why Kotlin”? Cliché? Not really. This is more like a “why not Kotlin?” kind of blog post. This blog is my attempt to convince Android app developers to migrate to Kotlin. It doesn’t matter if you have little or no knowledge of Kotlin, or you are an iOS developer who worships Swift: read along, I am sure Kotlin will impress you (if not my writing).

I am going to show some of the amazing features of the Kotlin programming language that make development so much easier and more fun, and make the code so readable it is as if you are reading plain English. I read somewhere that “a programming language isn’t for computers, computers understand only 1s and 0s, it is for humans”, and I couldn’t agree more. There is a learning curve, sure – where isn’t there? It pays off nicely. Kotlin makes us do more with fewer lines of code; Kotlin makes us productive.

Lets quickly walk over some of the obvious reasons for migrating to Kotlin:

  • Kotlin is one of the officially supported languages for android app development as announced in Google IO 2017.
  • Kotlin is 100% interoperable with Java. Which basically means Kotlin can use Java classes and methods and vice versa.
  • Kotlin has several modern programming language features like lambdas, higher order functions, null safety, extensions etc.
  • Kotlin is developed and maintained by JetBrains, which is the company behind several integrated development environments that developers use every day (or IDEs like IntelliJ IDEA, PyCharm, PhpStorm, GoLand etc).

This is available all over the internet. This is the content of “Why Kotlin” category of blogs.

Let’s talk about something a little more interesting.

Higher Order Functions:

Kotlin functions are first class citizens. Meaning functions can be stored in variables, passed as arguments or returned from other functions. A higher-order function is a function that takes a function as a parameter or returns a function.

This may sound strange at first. Why in the world would I pass a function to another function (or return a function from another function)? It is very common in various programming languages including JavaScript, Swift, Python (and Kotlin, apparently). An excellent example of a higher-order function is map: a function that takes a function as a parameter and returns a list containing the results of applying the given function to each item of the original list or array.
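The original gist is unavailable, so here is a reconstructed sketch; stringStirrer() is an illustrative transform.

// a higher-order use of map: the transform function is passed as an argument
fun stringStirrer(s: String): String = s.reversed().toUpperCase()

fun main() {
    val x = listOf("apple", "banana", "cherry")
    val stirred = x.map { stringStirrer(it) }  // map applies the function to every item
    println(stirred)                           // [ELPPA, ANANAB, YRREHC]
}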

Check out the map call in the sketch above. It applies the stringStirrer() function to each item of x; the result of the map operation is printed on the last line.

Data classes:

Java POJOs, or Plain Old Java Objects – simply classes that store some data – require a lot of boilerplate code most of the time, like getters, setters, equals, hashCode, toString etc. A Kotlin data class derives these properties and functions automatically from the properties defined in its primary constructor.
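For example (field names are illustrative):

// equals, hashCode, toString, copy and componentN come for free
data class User(val name: String, val email: String, val age: Int)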

Just one line of code replaces several lines of Java POJO. For custom behaviour we can override functions in data classes. Kotlin data classes also come bundled with copy() and componentN() functions, which allow copying objects and de-structuring them respectively.

Dealing With Strings:

Kotlin standard library makes dealing with strings so much easier. Here is a sample:
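A few illustrative examples of the String helpers in the standard library:

fun main() {
    val blog = "kotlin kronicles"
    println(blog.capitalize())          // Kotlin kronicles
    println(blog.startsWith("kotlin"))  // true
    println("42".toIntOrNull())         // 42
    println("  padded  ".trim())        // padded
    println(blog.reversed())            // selcinork niltok
}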

No helper classes, public static methods or StringUtils are required. We can invoke these functions as if they belong to the String class itself.

Dealing with Collections:

Same as with String, the helper methods in the “java.util.Collections” class are no longer required. We can directly call “sort, max, min, reverse, swap etc.” on collections, as shown below.
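A quick illustrative sample:

fun main() {
    val numbers = mutableListOf(5, 2, 9, 1)
    numbers.sort()
    println(numbers)        // [1, 2, 5, 9]
    println(numbers.max())  // 9
    println(numbers.min())  // 1
    numbers.reverse()
    println(numbers)        // [9, 5, 2, 1]
}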

Consider a bank use case. A bank has many customers, a customer does several transactions every month. Think in terms of objects:

As it is clear from the picture above, a bank has many customers, a customer has some properties (name, list of transactions etc) and several transactions, a transaction has properties like amount, type etc. It will look something like this in Java:

Find the customer with minimum balance:
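The original gist is unavailable, so here is a reconstructed sketch; the Customer and Transaction shapes are assumptions based on the description above, and the Java version is shown only as a comment for comparison.

data class Transaction(val amount: Double, val type: String)  // type: "DEPOSIT" or "WITHDRAW"
data class Customer(val name: String, val balance: Double, val transactions: List<Transaction>)

// Java way (for comparison):
// Customer poorest = Collections.min(customers, Comparator.comparingDouble(Customer::getBalance));

// Kotlin way: one readable line, no helper class needed
fun poorestCustomer(customers: List<Customer>): Customer? = customers.minBy { it.balance }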

I don’t know about you, but I think the Kotlin way is much cleaner, simpler and more readable, and we didn’t import any helper class for it (the Java way needs the Collections class). I can read it as plain English, which is more than I can say for the Java counterpart. The motive here is not to compare Java with Kotlin, but to appreciate the Kotlin Kronicles.

There are several functions like map, filter, reduce, flatmap, fold, partition etc. Here is how we can simplify our tasks by combining these standard functions (for each problem statement below, imagine doing it in Java):
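The original gist is unavailable; the sketch below reuses the Customer and Transaction classes from the previous snippet and shows one possible version of each operation explained afterwards.

fun report(customers: List<Customer>) {
    // 1. flatMap: one flat list of every transaction made by every customer
    val allTransactions: List<Transaction> = customers.flatMap { it.transactions }

    // 2. filter + sumByDouble: total amount deposited across the bank
    val totalDeposited = allTransactions.filter { it.type == "DEPOSIT" }.sumByDouble { it.amount }

    // 3. fold: net amount (deposits minus withdrawals) in a single pass
    val netAmount = allTransactions.fold(0.0) { acc, txn ->
        when (txn.type) {
            "DEPOSIT" -> acc + txn.amount
            else -> acc - txn.amount
        }
    }

    // 4. partition: split into deposits and withdrawals in one go
    val (deposits, withdrawals) = allTransactions.partition { it.type == "DEPOSIT" }

    println("$totalDeposited, $netAmount, ${deposits.size}, ${withdrawals.size}")
}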

As is clear from the snippet above, we can solve mundane problems with far fewer lines of code, and readability-wise I just love it. Explanation of the code above:

  1. FlatMap: Returns a single list of all elements yielded from results of transform function being invoked on each element of the original array (in above cases the transform function returned list of transactions by each individual user)
  2. Filter and SumBy: Here we combined filter and sum by operations to write a one-liner code to find the total amount deposited and withdrawn from the bank considering all customers.
  3. Fold: Accumulates value starting with the initial value (0.0 in our case) and applying operation (when statement above) from left to right to current accumulator value and each element. Here we used fold and when to find the net amount deposited in the bank considering all deposits and withdrawals.
  4. Partition: Splits the original array into a pair of lists, where the first list contains elements for which predicate (the separation function in this case) yielded true, while the 2nd where it yielded false. Of course, we can filter twice, but this is so much easier.

So many complex operations simplified by the Kotlin standard library.

Extensions:

One of my favourite features. Kotlin extensions let us add functionality to a class without modifying the original class. Just like in Swift and C#. This can be very handy if used properly. Check this out:
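A sketch of such an extension (the conversion rate is illustrative):

fun Double.toINR() = this * 70  // extension on Kotlin's Double type

fun main() {
    println(100.0.toINR())  // 7000.0
}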

In the above code, we just added a new function called “toINR()” to Kotlin’s Double type. So we basically added a new function to Kotlin’s primitive type, how about that 😎. And it is a one-liner function: no curly braces, return type or return statement whatsoever. Noticed that conciseness, did you?

Since Kotlin supports higher order functions we can combine this with extension functions to solidify our code. One very common problem with android development involving SQLite is, developers often forget to end the transaction. Then we waste hours debugging it. Here is how we can avoid it:
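A sketch of such an extension; I have also marked the transaction successful before ending it, so the wrapped work is actually committed.

import android.database.sqlite.SQLiteDatabase

fun SQLiteDatabase.performDBTransaction(operation: () -> Unit) {
    beginTransaction()
    try {
        operation()                 // whatever the caller wants done inside the transaction
        setTransactionSuccessful()
    } finally {
        endTransaction()            // always called, so it can never be forgotten
    }
}

// usage: db.performDBTransaction { db.insert("employees", null, values) }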

We added an extension function called “performDBTransaction” to SQLiteDatabase. This function takes a parameter that is itself a function with no input and no output; this parameter function is whatever we want executed between the begin and end of the transaction. performDBTransaction calls beginTransaction(), then the passed operation, and then endTransaction(). We can use this function wherever required without having to double-check whether we called endTransaction() or not.

I always forget to call commit() or apply() when storing data in Shared Preferences. Similar approach:
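A sketch of an extension on SharedPreferences.Editor: the lambda does the edits and apply() is always called afterwards.

import android.content.SharedPreferences

fun SharedPreferences.Editor.persist(block: SharedPreferences.Editor.() -> Unit) {
    block()
    apply()  // no more forgetting commit()/apply()
}

// usage: prefs.edit().persist { putString("token", jwt) }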

The extension function persist() (sketched above) takes care of it. We are calling persist() as if it were part of SharedPreferences.Editor.

Smart Casts:

Going back to our bank example, let’s say a transaction can be of three types, as explained in the figure below:

An NEFT transaction has fixed charges, while IMPS has some bank-related charges. We deal with a transaction object typed as the superclass “Transaction”, and we need to identify the actual type of transaction so that it can be processed accordingly. Here is how this can be handled in Kotlin:
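A standalone sketch of the hierarchy and the when expression (the third transaction type and the charge values are illustrative):

open class Transaction(val amount: Double)
class NEFT(amount: Double) : Transaction(amount) {
    fun fixedCharges() = 5.0
}
class IMPS(amount: Double) : Transaction(amount) {
    fun bankCharges() = amount * 0.01
}
class Cash(amount: Double) : Transaction(amount)

fun totalCost(txn: Transaction): Double = when (txn) {
    is NEFT -> txn.amount + txn.fixedCharges()  // smart cast to NEFT, no explicit cast
    is IMPS -> txn.amount + txn.bankCharges()   // smart cast to IMPS
    else -> txn.amount
}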

In the NEFT and IMPS branches of the when expression above, we didn’t cast the Transaction object into NEFT or IMPS, yet we are able to invoke the functions of these classes. This is a smart cast in Kotlin: Kotlin has automatically cast the transaction object into its respective type.

Epilogue:

As developers, we need to focus on the stuff that matters, the core part, and boilerplate code isn’t one of them. Kotlin helps in reducing boilerplate code and makes development fun. Kotlin has many amazing features which ease development and testing. Do not let the fear of the unknown dictate your choice of programming language; migrate your apps to Kotlin now. The initial resistance is the only resistance.

I sincerely hope you enjoyed the first article of this Kotlin Kronicles series. We have just tapped the surface. Stay tuned for part 2. Let me know if you want me to cover anything specific.

Got any suggestions? shoot below in comments.

What is your favourite feature of Kotlin?

Keep Developing…

AWS Batch Jobs

What is batch computing?

Batch computing means running jobs asynchronously and automatically, across one or more computers.

What is AWS Batch Job?

AWS Batch enables developers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as Amazon EC2 and Spot Instances.

Why use AWS Batch Job ?

  • Fully managed infrastructure – No software to install or servers to manage. AWS Batch provisions, manages, and scales your infrastructure.
  • Integrated with AWS – Natively integrated with the AWS platform, AWS Batch jobs can easily and securely interact with services such as Amazon S3, DynamoDB, and Rekognition.
  • Cost-optimized Resource Provisioning – AWS Batch automatically provisions compute resources tailored to the needs of your jobs using Amazon EC2 and EC2 Spot.

AWS Batch Concepts

  • Jobs
  • Job Definitions
  • Job Queue
  • Compute Environments

Jobs

Jobs are the unit of work executed by AWS Batch as containerized applications running on Amazon EC2. Containerized jobs can reference a container image, command, and parameters or users can simply provide a .zip containing their application and AWS will run it on a default Amazon Linux container.

$ aws batch submit-job --job-name poller --job-definition poller-def --job-queue poller-queue

Job Dependencies

Jobs can express a dependency on the successful completion of other jobs or specific elements of an array job.

Use your preferred workflow engine and language to submit jobs. Flow-based systems simply submit jobs serially, while DAG-based systems submit many jobs at once, identifying inter-job dependencies.

Jobs run in approximately the same order in which they are submitted as long as all dependencies on other jobs have been met.

$ aws batch submit-job --depends-on 606b3ad1-aa31-48d8-92ec-f154bfc8215f …

Job Definitions

Similar to ECS Task Definitions, AWS Batch Job Definitions specify how jobs are to be run. While each job must reference a job definition, many parameters can be overridden.

Some of the attributes specified in a job definition are:

  • IAM role associated with the job
  • vCPU and memory requirements
  • Mount points
  • Container properties
  • Environment variables
$ aws batch register-job-definition --job-definition-name gatk --container-properties …

Job Queues

Jobs are submitted to a Job Queue, where they reside until they are able to be scheduled to a compute resource. Information related to completed jobs persists in the queue for 24 hours.

$ aws batch create-job-queue --job-queue-name genomics --priority 500 --compute-environment-order …

 

Compute Environments

Job queues are mapped to one or more Compute Environments containing the EC2 instances that are used to run containerized batch jobs.

Managed (recommended) compute environments enable you to describe your business requirements (instance types, min/max/desired vCPUs, and EC2 Spot bid as a percentage of On-Demand) and AWS launches and scales resources on your behalf.

We can choose specific instance types (e.g. c4.8xlarge), instance families (e.g. C4, M4, R3), or simply choose “optimal” and AWS Batch will launch appropriately sized instances from the more modern AWS instance families.

Alternatively, we can launch and manage our own resources within an Unmanaged compute environment. Your instances need to include the ECS agent and run supported versions of Linux and Docker.

$ aws batch create-compute-environment --compute-environment-name unmanagedce --type UNMANAGED …

AWS Batch will then create an Amazon ECS cluster which can accept the instances we launch. Jobs can be scheduled to your Compute Environment as soon as the instances are healthy and register with the ECS Agent.

Job States

Jobs submitted to a queue can have the following states:

  • SUBMITTED: Accepted into the queue, but not yet evaluated for execution
  • PENDING: The job has dependencies on other jobs which have not yet completed
  • RUNNABLE: The job has been evaluated by the scheduler and is ready to run
  • STARTING: The job is in the process of being scheduled to a compute resource
  • RUNNING: The job is currently running
  • SUCCEEDED: The job has finished with exit code 0
  • FAILED: The job finished with a non-zero exit code or was cancelled or terminated.

AWS Batch Actions

  • Jobs: SubmitJob, ListJobs, DescribeJobs, CancelJob, TerminateJob
  • Job Definitions: RegisterJobDefinition, DescribeJobDefinitions, DeregisterJobDefinition
  • Job Queues: CreateJobQueue, DescribeJobQueues, UpdateJobQueue, DeleteJobQueue
  • Compute Environments: CreateComputeEnvironment, DescribeComputeEnvironments, UpdateComputeEnvironment, DeleteComputeEnvironment

AWS Batch Pricing

There is no charge for AWS Batch. We only pay for the underlying resources we have consumed.

Use Case

Poller and Processor Service

Purpose

The poller service needs to run every hour like a cron job and submits one or more requests to a processor service, which has to launch the required number of EC2 resources, process files in parallel and terminate them when done.

Solution

We plan to go with a serverless architecture approach instead of using a traditional Beanstalk/EC2 instance, as we don’t want to maintain and keep an EC2 server instance running 24/7.

This approach will reduce our AWS billing cost as the EC2 instance launches when the job is submitted to Batch Job and terminates when the job execution is completed.

Poller Service Architecture Diagram

Processor Service Architecture Diagram

First time release

For Poller and Processor Service:

  • Create Compute environment
  • Create Job queue
  • Create Job definition

To automate the above resource creation process, we use batchbeagle (for installation and configuration, please refer to the batch-deployment repository).

Command to Create/Update Batch Job Resources of a Stack (Creates all Job Descriptions, Job Queues and Compute Environments)

beagle -f stack/stackname/servicename.yml assemble

To start Poller service:

  • Enable a Scheduler using AWS CloudWatch rule to trigger poller service batch job.

Incremental release

We must create a new revision of the existing Job Definition, pointing to the newly tagged release version of the ECR image to be deployed.

Command to deploy new release version of Docker image to Batch Job (Creates a new revision of an existing Job Definition)

 

beagle -f stack/stackname/servicename.yml job update job-definition-name

Monitoring

Cloudwatch Events

We will use AWS Batch event stream for CloudWatch Events to receive near real-time notifications regarding the current state of jobs that have been submitted to your job queues.

AWS Batch sends job status change events to CloudWatch Events. AWS Batch tracks the state of your jobs. If a previously submitted job’s status changes, an event is triggered. For example, if a job in the RUNNING status moves to the FAILED status.

We will configure an Amazon SNS topic to serve as an event target, which sends the notification to a Lambda function; the Lambda then filters the relevant content from the SNS message (JSON), beautifies it and sends it to the respective environment's Slack channel.

CloudWatch Event Rule → SNS Topic → Lambda Function → Slack Channel

Batch Job Status Notification in Slack

Slack notification provides the following details:

  • Job name
  • Job Status
  • Job ID
  • Job Queue Name
  • Log Stream Name