How to Build a SaaS Product with Both Data and Run-time Isolation?

Once a startup decides on a SaaS implementation, choosing the right SaaS architecture type is imperative, not only to ensure the right pricing model but also to accommodate special design requirements such as scalability and customizability. If you’re considering SaaS type 2 architecture, wherein isolation of both data and runtime environments is required, this article is a must-read. As an application architect working on enterprise software, let me walk you through how we helped a project management startup succeed by applying SaaS type 2.

The project management platform that we were working on was enterprise-level software built around a well-established algorithm that computes an optimal schedule for different types of project environments. To provide scheduling solutions at a much more granular level, the product was going through a major overhaul: new functionality was being added alongside the existing solutions, together with a UI revamp to make it more user-friendly.

Challenges that Came Along

The main challenge was to get early feedback on the new functionality from existing customers so it could be quickly incorporated into the product. At the same time, it was also necessary to put the product in front of a wide variety of potential customers for initial trials, get them on board for long-term engagement, and provide scheduling solutions based on their needs.

We started by focusing on reducing the cycle time for features, but that wasn’t possible with the traditional deployment model, wherein the product was hosted in the customer’s environment. Therefore, we decided to provide the platform as a SaaS offering. The immediate next step was to pick the right SaaS architecture, and this was crucial considering its role in fostering the platform’s future growth.

Arrival at the ‘Make-or-Break’ Decision

Since every organization’s business model is different, task management and execution differ too, so such platforms must be designed in a way that makes customization easy for end-users and adaptable across customer environments. At any given time, multiple customers use the platform to create portfolios for their organizations, and these portfolios hold very sensitive, business-specific data.

In this model, the customers were very clear and strict about the need for complete isolation at both the application level and the data level. We agreed that Type 2 architecture was the right fit for this case and decided to implement it, drawing on our experience of SaaS-ifying products for growth-stage startups across various domains.

Dealing with the Architectural Roadblocks

Given below are some of the architectural challenges that we encountered, and our approach to tackling them effectively to drive a successful implementation-

Scaling

Each customer runs at a different scale; some customers have thousands of users using the platform for planning and execution, while others have only a handful of top-level executives on it. Since we had the freedom to deploy the application per customer, each deployment was sized keeping the customer’s user base in mind.

Fast Customer Onboarding

New customers need to be onboarded quickly, with minimal assistance from the Engineering or Implementation teams. For this, as soon as a new user signs up on the platform, we need to provision the application and database instance within minutes. This was done using automated scripts that quickly provision an application instance from a pre-configured base image. A unique URL for the application was also generated using AWS Route 53. Once provisioning completes, the user is notified that the platform is ready at their unique (user-specific or organization-specific) URL.
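To make this concrete, here is a minimal sketch of such a provisioning script, assuming AWS with boto3; the AMI ID, hosted zone ID, instance type, and domain below are hypothetical placeholders, not the actual values from this project-

```python
import boto3

AMI_ID = "ami-0123456789abcdef0"   # pre-configured base image (hypothetical)
HOSTED_ZONE_ID = "Z0000000000000"  # Route 53 hosted zone (hypothetical)

def provision_tenant(tenant: str) -> str:
    ec2 = boto3.client("ec2")
    route53 = boto3.client("route53")

    # 1. Launch an application instance from the pre-configured base image.
    res = ec2.run_instances(ImageId=AMI_ID, InstanceType="t3.medium",
                            MinCount=1, MaxCount=1)
    instance_id = res["Instances"][0]["InstanceId"]
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    ip = ec2.describe_instances(InstanceIds=[instance_id])[
        "Reservations"][0]["Instances"][0]["PublicIpAddress"]

    # 2. Point a unique tenant URL at the new instance via Route 53.
    url = f"{tenant}.example-saas.com"
    route53.change_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {"Name": url, "Type": "A", "TTL": 300,
                                  "ResourceRecords": [{"Value": ip}]},
        }]},
    )
    return url  # notify the customer that the platform is ready here
```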

Customization

The architecture should support the customization of different business entities without any customer-specific deployment from the engineering team. These customizations were provided in the application via a configuration dashboard, wherein an admin user of an organization sets configuration parameters based on the organization’s needs.
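As an illustration, per-organization settings can be modeled as plain data that the application reads at runtime; the field names below are hypothetical, chosen only to show the idea-

```python
from dataclasses import dataclass, field

@dataclass
class OrgConfig:
    org_id: str
    working_days_per_week: int = 5
    task_states: list[str] = field(
        default_factory=lambda: ["todo", "in_progress", "done"])
    custom_fields: dict[str, str] = field(default_factory=dict)

# An org admin changes parameters from the dashboard; no engineering
# deployment is involved:
cfg = OrgConfig("acme", working_days_per_week=6,
                custom_fields={"cost_center": "string"})
```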

Hardware Utilization

The new architecture should be optimized for hardware utilization. While some existing customers have huge data sets and heavy customizations, others have little data and almost zero customization. We addressed this by analyzing the costs of cloud infrastructure, such as compute instances and database servers, and preparing the pricing plans for end-users accordingly.

Security

Many security concerns were already addressed by isolating both data and application runtime for each customer. Data in transit was over HTTPS only, and the application itself provides secure access to all customer data.

Cloud

In the example that we are discussing, the customer wanted to offer the existing platform as “Portfolio as a Service”, without managing infrastructure or hiring an admin for maintenance. The implicit requirement was complete automation of provisioning, which we achieved with a one-click deployment that provisions application and database instances in minutes. The architecture was built around multiple clusters so that every customer has their own runtime (application) and database server, and no sharing of data or applications is permitted-

Design

As demonstrated in the diagram given below, on every new customer onboarding, our automated services created keys and provisioned applications and databases as per the pricing plan chosen by the customer. Once this step is completed, they could immediately start using the platform.

For every customer request, the load balancer identifies the right application instance to process it. Thereafter, the application fetches fully-encrypted data from the customer’s isolated database, decrypts it using the customer’s keys, and sends the response back to the user.
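A minimal sketch of that routing step, with a hypothetical tenant registry and host names-

```python
# Maps each tenant's unique URL to its own application instance;
# nothing is shared across tenants (all values are hypothetical).
TENANT_BACKENDS = {
    "acme.example-saas.com": "10.0.1.15",
    "globex.example-saas.com": "10.0.2.27",
}

def route(host_header: str) -> str:
    try:
        return TENANT_BACKENDS[host_header]
    except KeyError:
        raise LookupError(f"unknown tenant host: {host_header}")

print(route("acme.example-saas.com"))  # -> 10.0.1.15
```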

Advantages of SaaS Type 2 Architecture

They say- sometimes it’s the smallest decisions that can change things forever. In our case, it was our decision to probe the customer’s case up close and choose the right SaaS architecture type that would serve their purpose well. Some of the advantages that the customer enjoyed-

  • Security is handled at the infrastructure level, so the application doesn’t have to guard against cross-tenant data sharing.
  • No need to manage connection pools for tenant-specific databases.
  • Low chances of system underutilization, as scaling can be done differently for different clients.
  • Faster customer onboarding, since no tenant-specific items have to be built.
  • The system can be customized to a user’s needs without worrying about the impact on other users.

Conclusion

We have customized customer onboarding in place, wherein customers pick pricing plans as per portfolio size and number of users, and our fully automated deployment solution provisions the correct instances in the cloud and ensures the system stays optimized. While SaaS type 2 architecture comes with several benefits, startups considering implementing it must be aware of the heavy investments in automation and monitoring that come with it.

Top Considerations while Implementing Blockchain

If you have been watching technology make a difference in the startup ecosystem, you have probably seen a lot of hype around Blockchain. Innovative characteristics of Blockchain- decentralization, immutability, transparency, and automation- can be applied to various industry verticals, thereby creating a multitude of use cases.

Blockchain technology is still in its nascent phase and, while cryptocurrency platforms like Bitcoin and Ethereum have been in use for a long time, its adoption into the mainstream software industry has been limited. Having worked on Blockchain implementations for startups from various domains, I have listed the top seven considerations while implementing Blockchain in a product.

On-Chain or Off-Chain

One of the key architectural decisions while working on Blockchain-based products is which part of the functionality should be implemented on the Blockchain and which should stay off-chain (i.e. on centralized servers), both in terms of transaction data and business validation logic.

The primary constraint is network latency due to data replication across the Blockchain network; latency keeps increasing with the level of data replication. For the same reason, Ethereum charges a significant fee to store data on the chain.

Some general guidelines-

  1. Data that is either directly required for transaction validation or needs auditability should be stored on-chain. All other, referential data should be stored off-chain (see the sketch after this list).
  2. In cases wherein eventual consistency is good enough, transactions can be carried out off-chain, with only the first and last states being updated on-chain. This increases overall throughput without consuming additional network resources.
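Here is a minimal sketch of guideline 1, keeping the bulky payload off-chain and anchoring only its hash on-chain for auditability; the in-memory store and ledger are stand-ins for a real document store and chain SDK-

```python
import hashlib
import json

OFF_CHAIN_STORE: dict[str, bytes] = {}  # stand-in for S3/IPFS
ON_CHAIN_LEDGER: list[str] = []         # stand-in for a real chain

def anchor_document(doc: dict) -> tuple[str, int]:
    blob = json.dumps(doc, sort_keys=True).encode()
    digest = hashlib.sha256(blob).hexdigest()
    OFF_CHAIN_STORE[digest] = blob   # full payload stays off-chain
    ON_CHAIN_LEDGER.append(digest)   # only the 32-byte hash goes on-chain
    return digest, len(ON_CHAIN_LEDGER) - 1

def verify(digest: str, position: int) -> bool:
    # Anyone can re-hash the off-chain blob and compare it with the
    # hash anchored on-chain.
    blob = OFF_CHAIN_STORE[digest]
    return (hashlib.sha256(blob).hexdigest() == digest
            and ON_CHAIN_LEDGER[position] == digest)
```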

Public or Private Permissioned

Another important decision is the scope/access of the Blockchain itself, ranging from an open and permissionless system to a private and controlled one. Public Blockchains are useful wherein users are to be kept anonymous and treated equally. Public chains need to be community-driven, so that no single user has the authority to change the rules of the entire network. However, a large number of nodes may limit transaction throughput, and some incentivization is required to carry out effective processing.

Permissioned Blockchain platforms, on the other hand, control who can write/read on the Blockchain and are scalable when compared to public chains. They could be suitable where controlled governance is required, and compliance/regulations need to be followed.

An example of the former direction is Facebook’s Libra, a global payment system intended to be usable by anyone for value exchange (though it launched with a permissioned validator set). On the other hand, an insurance claim processing platform is a good example of a private permissioned Blockchain. This categorization must be thought through in the initial stages itself, as the two categories require different kinds of consensus and identity management solutions.

Levels of Security

Tamper resistance, resistance to double-spending attacks, and data consistency are some of the desired attributes of a secure distributed system. While the first two can be achieved using the cryptographic principles of Blockchain technology, an appropriate consensus mechanism is required to achieve consistency across the system.

In public-facing systems where anyone can join the network, all the nodes are trustless, with no node having more privilege than others. In these scenarios, security is required against any malicious node, and a Blockchain with a PoW-like consensus is better suited, despite the over-consumption of network resources and limitations in transaction throughput.

In consortium-like systems, multiple parties interact and share information. Although node identities are well known in these systems, only some nodes are fully trusted to process transactions, and security is required against semi-trusted nodes or external users not directly participating in the network. A Blockchain with an appropriate governance model and a consensus mechanism like PBFT or PoS will not only provide the desired security attributes but also increase operational efficiency because of the high trust levels.

In a document workflow-based application, for example, where documents are exchanged between multiple parties for approval, a system of the latter type can provide the required security and efficiency.

Data Privacy Needs

Sometimes, data stored or transactions executed on the Blockchain need protection on account of confidentiality or compliance rules, and this is where privacy comes into the picture. For instance, in financial trading and medical-records applications, transactions may need to be hidden, with data visible only to selected stakeholders. Even in the case of Bitcoin, where transactions are carried out by pseudonymous users, transaction trend graphs may provide insights that can reveal a user’s true identity. These users may want to hide the beneficiary or the amounts involved in their transactions.

Techniques like transaction mixing and zero-knowledge proofs have been proposed to support this. Sometimes there are real-life situations where these techniques don’t fit directly, and a new protocol has to be designed using the existing techniques as building blocks.

Physical to Digital World Transition

Physical assets (like land registries, physical objects, paper contracts, or fiat currency) can be represented as digital assets on the Blockchain and can benefit from decentralization. However, this requires an inherent trust in the system: we would either need a trusted third party providing this guarantee or a physical legal agreement between the parties that cannot be repudiated in a court of law.

In the case of fiat currency based applications, this trusted third party is a bank. In that case, choosing a bank with good technical infrastructure is essential so that the Blockchain platform can be integrated with the bank easily.

Data Protection (GDPR)

GDPR compliance requires that a user can selectively reveal personal data to others and can exercise the right to erasure of this data. As it is not possible to delete data from the Blockchain, we should either keep such personal data off-chain (on centralized servers) or store it encrypted per user, so that it can be viewed only by that user and effectively “erased” by destroying the user’s key.
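One common way to reconcile immutability with erasure is crypto-shredding: encrypt each user’s records with a per-user key and destroy the key on an erasure request. A minimal sketch, assuming the ‘cryptography’ package and an in-memory key store purely for illustration-

```python
from cryptography.fernet import Fernet

USER_KEYS: dict[str, bytes] = {}      # per-user data-encryption keys
LEDGER: list[tuple[str, bytes]] = []  # append-only store (stand-in for a chain)

def store_personal_data(user_id: str, payload: bytes) -> None:
    key = USER_KEYS.setdefault(user_id, Fernet.generate_key())
    LEDGER.append((user_id, Fernet(key).encrypt(payload)))

def read_personal_data(user_id: str) -> list[bytes]:
    f = Fernet(USER_KEYS[user_id])
    return [f.decrypt(token) for uid, token in LEDGER if uid == user_id]

def erase_user(user_id: str) -> None:
    # The ciphertext stays on the immutable ledger, but without the key
    # it is unreadable, which is effectively erasure.
    USER_KEYS.pop(user_id, None)
```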

Ease of Development & Deployment

Last but not least, we should have tools that ease the processes of development and deployment. A better smart contract framework means fewer bugs and more trust. A good container orchestration tool like Kubernetes is a must-have for upgrading the product across all the validator nodes.

Conclusion

Before building a real Blockchain-based product, you ought to take a close look at the considerations mentioned above; they can make or break your efforts. Stripping away the hype and setting aside the teething problems, I believe that Blockchain technology is set to revolutionize industries much as Big Data and other emerging technologies have. Happy Blockchaining!

Does your Startup Really need Blockchain?

‘To Blockchain or not to Blockchain’ – this is one big question that has been on the minds of startup founders in recent times. From supply chain monitoring to equity management and cross-border payments, Blockchain has been making its way into multiple areas. Startups, to meet their growth goals, are jumping onto the Blockchain bandwagon to generate buzz, convince investors, and raise new rounds of funding.

In the recent past, many startup founders approached us with a common question- is Blockchain the right fit for my startup? That prompted me to help them with a decision tree that enables pragmatic decision-making in this direction. As the number of founders reaching out with this dilemma kept increasing, I was inspired to write a detailed article on it.

Whether to adopt Blockchain for your startup is not merely a technological decision but also a business decision. Being the frontliners of decision-making, it is crucial for founders not to fall for the hype, but to diligently analyze whether adopting Blockchain is right from the business perspective, even in cases where a well-defined problem exists. While Blockchain’s unique properties have pushed startup founders to think of it as an essential and transformative technology, the ‘business benefit’ stands firm as a vital consideration in this decision. This article covers both the technology and business perspectives that founders need to consider while evaluating Blockchain.

Decision Tree: Evaluating the Technology Fit

Though many research papers feature decision trees to evaluate Blockchain use case feasibility with respect to technology [1], here is a simplified version of the framework-

Real-Life Use Cases

For a better understanding of the decision tree, let me take you through some real-life use cases across different verticals-

| Use Case | Do we need to store states (user-specific data and/or metadata)? | Are multiple users involved in updating the stored states? | Is any trusted third party involved? | Can the third party be eliminated? | Decision | Remarks |
| --- | --- | --- | --- | --- | --- | --- |
| Social media application that involves user engagement and interaction | Yes | Yes | Yes | No | No | This is similar to a traditional centrally-managed application |
| Social media application (as above), with control released to the community | Yes | Yes | Yes | Yes | Yes | The same use case can be implemented using Blockchain if and only if control has to be released to the community |
| Food retailers receiving supplies from producers, wherein ensuring food quality is a key challenge | Yes | Yes | No | NA | Yes | |
| Organizations maintaining records of employee attendance | Yes | Yes | Yes | No | No | As long as there is mutual trust between the organization and its employees, there is no necessity for Blockchain; if a trusted third party is involved and Blockchain is brought into the picture, it would be mere over-engineering |

Cost-Benefit Analysis: Evaluating the Business Fit

Every startup founder who is planning to invest in Blockchain should assess the ROI that will come from its implementation. You might be adopting Blockchain as a necessity or as a differentiator for your product, but the evaluation should always be done from a revenue generation perspective.

You might have to come up with a cost-benefit analysis specific to your business, but here is an example to illustrate the approach. Let’s consider the case of the food retailers mentioned above and compare the high-level costs across different cost components.

Development Cost

If the development effort for building an MVP with a traditional centralized approach is around X man-months, the effort would be 30-40% higher for a Blockchain-based approach, primarily for building the Blockchain-based ecosystem components. Moreover, a Blockchain developer would typically cost you at least 1.5 times more than developers working on widely used technologies. Compounding the two (roughly 1.4 × 1.5 ≈ 2), the development cost of the Blockchain-based product comes out to about 2X the traditional application development cost.

Infrastructure Cost

To evaluate the infrastructure cost, let’s assume a transaction volume of a few hundred transactions per second (TPS). If the infrastructure cost for a traditional solution is about X per year, it would be roughly the same for a Blockchain-based approach, assuming nearly 8-10 nodes are part of the consortium. It boils down to one inference- instead of a single party managing all the infrastructure nodes, every member of the consortium should own its node.

With increasing transaction volume, the traditional approach can scale horizontally; Blockchain-based solutions, however, face the ‘Scalability Trilemma’. This famous term, coined by Vitalik Buterin, is, in layman’s terms, akin to the phrase ‘you can’t have everything’. Businesses should clearly understand which aspects among the three- decentralization, security, and scalability- they intend to optimize, and whether that is in line with their value proposition.

Other Costs

A few other business efforts required for Blockchain-based solutions include setting up the consortium, convincing prospective members of the benefits of joining it, and expanding it to a level where it can credibly be claimed safe. It might also include devising legal rules and regulations to resolve conflicts.

When talking about benefits, a Blockchain-based approach can certainly enable business process automation using smart contracts. This not only improves overall process efficiency but also reduces operational costs for the businesses. One report [2] says that using Blockchain can minimize wastage of goods, resulting in savings of nearly 450K Euros annually, a value that far exceeds the initial investment and operational cost of a Blockchain-based solution. As the consortium grows further, Blockchain-based automation protocols would enable business communities to define industry-wide standards.

Summary

Though it might not have garnered the importance it deserves, evaluating the feasibility of Blockchain is highly recommended for startup founders. This article aims at busting the Blockchain hype and encouraging in-depth evaluation at the intersection of the business and technology perspectives.

References

[1]   K. Wüst and A. Gervais, “Do you need a Blockchain?,” 2018 Crypto Valley Conference on Blockchain Technology (CVCBT), Zug, 2018, pp. 45-54, doi: 10.1109/CVCBT.2018.00011.

[2]  G. Perboli, S. Musso and M. Rosano, “Blockchain in Logistics and Supply Chain: A Lean Approach for Designing Real-World Use Cases,” in IEEE Access, vol. 6, pp. 62018-62028, 2018, doi: 10.1109/ACCESS.2018.2875782.

 

How to Build a SaaS Application with Data Isolation but No Run-time Isolation?

If you have already decided on a SaaS implementation, we recommend choosing the right SaaS architecture type so that all the hardware and automation costs you bear are well optimized. If you are considering SaaS type 3 architecture for your startup, you are in the right place to get started.

Type 3 SaaS architecture is the right fit for cases that require data isolation but no runtime isolation. In this type, different data stores are provisioned for different customers; however, the application is shared by all. Type 3 SaaS architecture is common in businesses like e-mail marketing, content management systems (CMS), healthcare applications, and so on.

For your understanding of the type 3 SaaS architecture, I will take you through the example of an innovation management platform that I worked on for a fast-growing startup. The platform enabled industry leaders to tap into the collective intelligence of employees, partners, and customers, find the best ideas, and make the right decisions. This platform drove innovation through the following-

  1. Employee engagement: Making ideation a part of daily lives and creating a culture of innovation
  2. Continuous improvement: Supercharging project discovery by tapping into the employee bases
  3. Product development: Creating the next big thing with people who understand the business well
  4. Customer experience: Engaging a wider workforce and reducing customer churn

It also enabled enterprises to manage the entire idea lifecycle, right from coming up with an idea to delivering impact at scale. Now, you must be wondering why we chose SaaS for this platform. The platform had to be made available as a service to enterprises, with an option of subscribing for a limited period. Hosting/licensing wasn’t a viable option, considering the cost of deployment, data privacy concerns, and the IT assistance involved. We picked the SaaS Type 3 deployment model for this platform, wherein we could keep each enterprise’s data isolated from others while retaining the flexibility of a shared application runtime.

Fig 1- SaaS Type 3 Architecture

How Our Decision Paid Off?

Having the right foresight and visualization is key to good decision-making. That worked well in this case too: we could rightly foresee the results of deploying SaaS type 3 on this platform. This decision helped us address the areas mentioned below-

  • Data isolation
  • Server utilization, wherein we kept application runtime shared to use the server capacity optimally
  • Moving the application runtime to high-end servers for some high-paying customers

What are the Challenges We Overcame and How?

Isolating data for each customer with separate databases, all the while sharing a common application runtime, was a critical challenge that we tackled. In other words, we needed one application runtime capable of supporting multiple databases for customer-specific data management. Along with this, we also had to accelerate customer onboarding, which implied that the deployment process had to be automated enough to handle database provisioning, disaster recovery, and the rollout of new versions.

Supporting Multiple Database Connections

As explained earlier, we had one application runtime that supported multiple databases for the respective customers. In our case, we had N Tomcat web applications deployed on one server, sharing the common application runtime. This way, every customer had access to an independent application, with every application having its own connection pool to manage connections. A plan to merge these deployments into one application is underway, so that we don’t have to run duplicate processes.
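A minimal sketch of one shared runtime serving many tenant databases, assuming SQLAlchemy; the connection URLs and the tenant-to-database mapping are hypothetical-

```python
from sqlalchemy import create_engine, text
from sqlalchemy.engine import Engine

TENANT_DB_URLS = {
    "acme": "postgresql://app@db-acme/acme",
    "globex": "postgresql://app@db-globex/globex",
}
_engines: dict[str, Engine] = {}

def engine_for(tenant: str) -> Engine:
    # Each tenant gets its own engine, and hence its own connection pool.
    if tenant not in _engines:
        _engines[tenant] = create_engine(TENANT_DB_URLS[tenant], pool_size=5)
    return _engines[tenant]

def fetch_ideas(tenant: str):
    # The shared runtime picks the right database per request.
    with engine_for(tenant).connect() as conn:
        return conn.execute(text("SELECT id, title FROM ideas")).fetchall()
```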

Faster Customer Onboarding

We brought down customer onboarding time by automating database creation with templatized data using Chef scripts. Apart from database creation, it was also essential to set up a backup-recovery process and failover & load balancing for the application, which we achieved using cloud solutions and Chef scripts.

Effective Disaster Recovery

As the solution helps in innovation management, the data was highly critical to our customers. This implied that our system should be able to weather unexpected disasters and unforeseen accidents. To handle this, we deployed the application and database across multiple availability zones, which ensured timely failover to up-to-date copies of the application and database whenever the primary database went down.

Automated Deployments

For a new version rollout, along with the application deployment, we had to deploy a new version of the database or upgrade the existing one for each customer. With the one-click deployment automation we had in place, we could safely upgrade all customer applications to the new version while ensuring a recent backup existed in case of a rollback.

Utilizing Hardware

As we had an isolated database for each tenant, we had to spin up a separate DB server for each of them; this was a requirement rather than a choice. But since the application runtime could be shared, we had the option of hosting it on a single server depending on usage. By grouping customers based on utilization, we could reduce the number of servers and, in turn, improve utilization.

How did we Ensure Security?

As stated earlier, we isolated each customer’s data in a separate database while sharing a common application runtime. This came with the additional baggage of securing the application runtime so that end-users could not access other tenants’ data. How did we implement this? Here’s how-

  • Maintaining separate configuration keys for each customer and rotating them on every release
  • Preserving encryption keys of database fields for each customer and rotating them on every release

Apart from that, there were many other security compliance requirements we had to follow-

  • Our product was independently audited on an annual basis against a rigid set of SOC 2 controls
  • We have an open policy that allows our customers to perform penetration tests of our service
  • Our production environment is protected by a robust network infrastructure that provides a highly secured environment for all customer data
  • Data in transit is over HTTPS only and is encrypted with the TLS v1.2 protocol. User data, including login information, is always sent through encrypted channels
  • The hosting environment keeps each customer’s database and application components isolated, ensuring segregation, privacy, and security in a multi-tenant physical hosting model. Instead of storing user data on backup media, we rely on full backups that are shipped to a physically different co-location site
  • Customer instances, including data, are hosted in geographically disparate data centers. Customers may choose the location to host their data based on the corporate location or user base location to minimize latency
  • We support Single Sign On (“SSO”), using the Security Assertion Markup Language (“SAML 2.0”). This allows network users to access our application without having to log in separately, with authentication federated from Active Directory
  • An automated process deletes customer data 30 days after the end of the customer’s term. Data can also be deleted immediately, depending on the terms of the contract.

Conclusion

By overcoming the above challenges, this model helped us live up to the promise made to the customers, i.e. ideas across enterprises remain isolated and high security compliance remains ensured for every customer.

Distributed Transactions are Not Microservices

(A Quick Note for the Readers- This is purely an opinion-based article distilled out of my experiences)

I’ve been a part of many architecture discussions, reviews, and implementations, and have shipped many microservices-based systems to production. I pretty much agree with Martin Fowler’s ‘Monolith First’ approach. However, I’ve seen many people go in the opposite direction and justify premature optimization, which can lead to an unstable and chaotic system.

It’s important to understand that if you are building microservices just for the purpose of distributed transactions, you’re heading for serious trouble.

What is a Distributed Transaction?

Let’s go by an example- in an e-commerce app, this is the order flow in the monolithic version:

In the microservice version, the same flow looks like this:

In this version, the transaction is divided into two separate transactions owned by two services, and atomicity now needs to be managed by the API controller.
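To make the contrast concrete, here is a toy sketch; SQLite stands in for the real database, and the “services” are in-memory stubs with hypothetical names-

```python
import sqlite3

# -- Monolithic version: one local ACID transaction -------------------
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE orders   (id INTEGER PRIMARY KEY, item TEXT);
CREATE TABLE payments (id INTEGER PRIMARY KEY, order_id INTEGER, amount REAL);
""")

def place_order_monolith(item: str, amount: float) -> None:
    with db:  # both inserts commit together, or neither does
        cur = db.execute("INSERT INTO orders(item) VALUES (?)", (item,))
        db.execute("INSERT INTO payments(order_id, amount) VALUES (?, ?)",
                   (cur.lastrowid, amount))

# -- Microservice version: two transactions, atomicity is on you ------
ORDERS: dict[int, str] = {}      # stand-in for the order service's DB
PAYMENTS: dict[int, float] = {}  # stand-in for the payment service's DB

def order_service_create(item: str) -> int:
    oid = len(ORDERS) + 1
    ORDERS[oid] = item
    return oid

def payment_service_charge(order_id: int, amount: float) -> None:
    PAYMENTS[order_id] = amount  # imagine a remote call that can fail

def place_order_microservices(item: str, amount: float) -> None:
    order_id = order_service_create(item)         # transaction #1
    try:
        payment_service_charge(order_id, amount)  # transaction #2
    except Exception:
        ORDERS.pop(order_id, None)  # manual compensation, not a rollback
        raise
```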

You need to avoid distributed transactions while building microservices. If you’re spreading transactions across multiple microservices, or calling multiple REST APIs or PUB/SUB flows for something that could easily be done within a single in-process service and a single database, then there’s a high chance you’re doing it the wrong way.

Challenges in Using Microservices to Implement Distributed Transactions

  1. Chaotic testing, as compared to in-process transactions. It’s really hard to stabilize features written in a distributed fashion, as you must test not only happy paths but also cases like services being down, timeouts, and error handling of REST APIs.
  2. Unstable and intermittent bugs, which you will start seeing in production.
  3. Sequencing: in the real world everyone needs some kind of sequencing when it comes to transactions, but it’s not easy to stabilize a system that is asynchronous (like node.js) and distributed.
  4. Performance, which is a big one and a by-product of premature optimization. Initially your transactions might not handle big JSONs, but they might appear later. In-process, the same memory is accessible to subsequent code and transactions; in the microservice world, where a transaction is distributed, it can be painful (now every microservice loads data, serializes and deserializes it, or makes the same large DB calls multiple times).
  5. Refactoring: every time you make a design-level change, you will run into new instances of problems 1-3, which pushes the engineering team into a mode of ‘resistance to change’.
  6. Slow features: the whole point of microservices is to “build and deploy features independently and fast”, but now you may need to build, test, stabilize, and deploy a bunch of services, and everything slows down.
  7. Unoptimized hardware utilization: there is a high chance that most of the hardware will be under-utilized, so you might start shipping many services in the same container or the same VMs, resulting in high I/O. If some big request suddenly comes into the system, it could become over-utilized, which will make you separate that component out, leaving the system under-utilized again once such requests stop coming. Now there is a team to handle this infinite oscillation that could have been avoided.

Do’s & Dont’s for Building Microservices from Scratch

  1. Don’t think of microservices as an exercise similar to refactoring code into different directories. If some code files seem logically separate, it’s a good idea to separate them into one package; creating a microservice out of them, however, is nothing but premature optimization.
  2. If you need to call REST APIs to complete a request, think twice about it (I would rather recommend avoiding it completely). The same goes for messaging-based systems: before creating new producers and consumers, try not to have them at all.
  3. Always focus on the different user experiences and their diverse scaling requirements. For example, in e-commerce, vendor APIs are bulky and transactional compared to consumer APIs; this is a good way of identifying components.
  4. Avoid integration tests (yes, you heard that right). If you create 10 services and write hundreds of integration tests, you’re creating a chaotic situation altogether. Instead, start with 2-4 services, write hundreds of unit tests, and write 5 integration tests; I’m sure you won’t regret it later.
  5. Consider batch processing, as this design turns out to be performant and less chaotic. For instance, in e-commerce you may have products in both vendor and consumer databases. Instead of writing distributed transactions to create new products in both DBs, you can write only to the vendor DB and run a batch process that picks up, say, 100 new products and inserts them into the consumer DB.
  6. Consider setting up an auditor (or building your own), so that when an atomic operation fails you can easily debug and fix it instead of digging through different databases. If you wish to reduce late-night intermittent bug fixes, set this up early and use it everywhere.
  7. I would recommend being wary of adding synchronization. I have seen many people try to use it to stabilize the ecosystem, but it introduces more problems (like timeouts) than it fixes. In the end, services should remain scalable.
  8. Don’t partition your database early. If possible, every persistent microservice should have its own database, but not all microservices need databases. Create persistent microservices first, and then use them inside other microservices. If most or all of your microservices connect to databases, it’s a design smell; scale the persistent microservices horizontally with more instances instead.
  9. Don’t create a new git repository for every new microservice. First create well unit-tested core components and reuse (don’t copy) them in higher-level components; from a single repository you may be able to spawn many microservices. Every time you need the same code in another repository, don’t copy it; move it to a core component, write quick unit tests, and reuse it in all microservices.
  10. Async programming can be a real problem if transactions are not written with proper sequence handling. A fire-and-forget scenario might have no visible impact under normal conditions, but under regression or heavy load those fire-and-forget calls might never execute, leading to inconsistent states.

In the example above, the developer assumed the sendOTP service didn’t need to be awaited and did a classic “fire and forget”. In normal testing under low load, the OTP is always sent, but under heavy load sendOTP sometimes never gets a chance to execute.
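A minimal sketch of the pitfall in Python’s asyncio (the OTP service here is simulated with a sleep)-

```python
import asyncio

async def send_otp(user_id: str) -> None:
    await asyncio.sleep(0.1)  # stands in for a slow network call
    print(f"OTP sent to {user_id}")

async def signup_fire_and_forget(user_id: str) -> None:
    # Nothing awaits this task; when the loop shuts down or the process
    # is under heavy load, the OTP may never actually be sent.
    asyncio.create_task(send_otp(user_id))

async def signup_awaited(user_id: str) -> None:
    await send_otp(user_id)  # completion is part of handling the request

asyncio.run(signup_fire_and_forget("user-42"))  # typically prints nothing
asyncio.run(signup_awaited("user-42"))          # always prints
```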

Microservices Out of a Monolith: A Cheatsheet

  1. Points 1-5 of the above-mentioned do’s & don’ts still apply.
  2. Forget big-bang rewrites; you have a stable production system (maybe not a scalable one) and still have to reuse 50-70% of the existing system in the new one.
  3. Start collecting data and figuring out pain points in the system, like hot tables, non-scalable APIs, performance bottlenecks, intermittent performance issues, and load testing results.
  4. Make a call between scaling by adding hardware vs optimization; there is cost involved in both cases, and you’ll have to decide which is lower. Many a time it’s easier to add more nodes and solve the problem (optimizing the system involves development and testing cost, which might be way higher than just adding nodes).
  5. Consider the incremental approach. For example, say I have a monolithic e-commerce app (vendor and consumer both) and learn that we will be scaling to many new vendors in the coming six months. The first intuition would be to re-architect. With the incremental approach, you instead determine that the biggest request volume will come from the consumer side and product search, and that only the product catalogue needs refactoring. You change nothing in the existing app (it keeps serving all vendor APIs and consumer transactions), create one new microservice with its own DB for the new hot spots, replicate the data from the primary DB using batch processing, and redirect all search and product catalogue APIs to the new microservice.
  6. Optimization: you’ll have to shift your focus to optimizing the problematic components (scaling by adding more hardware might not work here).
  7. Partition your DB to fix problems (don’t ignore this). Many people might not agree, but you need to fix the core design problems instead of adding a counter-mechanism like caching.
  8. Don’t rush into new techs and tools; adopt them only when you have enough expertise and readiness in your team. Always pick stable, small open-source projects over the new, trendy library or framework promising too many things.

Still Distributed Transactions in Microservices? Here’s the Way Forward

  1. Composition: if you think you should merge a couple of microservices, or consolidate a transaction into one service, it’s never too late to do that exercise.
  2. Build a consistent and useful audit trail for transactions, and make sure you always capture audits even when a service times out. A simple example is an ELK stack with structured logs carrying transaction IDs and entity IDs, plus policies that let your data operations team trace failed transactions and fix them (this is supercritical). You need to enable them to fix these; if fixes keep landing on the engineering team, your audit setup has failed.
  3. Redesign your process for chaos testing. Don’t test only hypothetical scenarios (like killing a service to see how other components behave); instead, try to produce the situations, data, or sequences that can kill or time out a service, and then see how resiliency/retry works in the other services.
  4. For new requirements, always do estimates, impact analysis, and an action plan based on your testing time, not your development time (since you will now spend most of the time testing).
  5. Integrate a circuit breaker into your ecosystem, so that you can check whether all the services that will participate in a transaction are live and healthy. This way you can avoid half-cooked transactions even before starting them.
  6. Adopt batch processing, wherein you convert some critical transactions to batch and offline modes to make the system more stable and consistent. For the e-commerce example mentioned above, you can use the following-

Here you will still get scaling, isolation, and independent deployment, but the batch process will make the system far more consistent.

  7. Don’t try to build two-phase commit; instead, go for an arbitrator pattern, which essentially supports resiliency, retry, error handling, timeout handling, and rollback (a minimal orchestrator sketch follows this list). This is applicable for PUB-SUB as well: with an arbitrator you don’t need to make every service robust, you just have to ensure the arbitrator can handle most of the scenarios.
  8. For performance, you can use IPC, memory sharing across processes, and TCP. If there are chatty microservices, check out gRPC or websockets as an alternative to REST APIs.
  9. Configuration can become a real nightmare if not handled properly. If your apps fail in production due to missing configuration and you are busy rolling back, fixing, and redeploying, you need something better. It’s very hard to make every microservice configuration-savvy, and you can never figure out all the missing configurations before shipping to production. So, follow this order of precedence-

Hard code → config files → databases → API → discovery

  10. Enable service discovery, in case you haven’t already.
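Here is a minimal sketch of such an arbitrator, with retry, back-off, and rollback; the Step abstraction and its names are hypothetical-

```python
import time
from typing import Callable

class Step:
    """One unit of a distributed transaction with a compensating action."""
    def __init__(self, do: Callable[[], None], undo: Callable[[], None]):
        self.do, self.undo = do, undo

def run_transaction(steps: list[Step], retries: int = 3) -> bool:
    done: list[Step] = []
    for step in steps:
        for attempt in range(retries):
            try:
                step.do()
                done.append(step)
                break
            except Exception:
                time.sleep(2 ** attempt)  # back off, then retry
        else:
            # The step kept failing: roll back everything done so far.
            for finished in reversed(done):
                finished.undo()
            return False
    return True
```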

Conclusion

You can use microservices, but you must keep the pitfalls in the back of your mind. Avoid premature optimization; your target should be building stable and scalable products, not building microservices. A monolith is never bad, and SOA is versatile and capable of serving most needs. You don’t need a system where everything is a microservice; rather, a well-built combination of monoliths, microservices, and SOA can fly really high.

How to Choose the Right SaaS Architecture for Your Startup?

Having worked as a solution architect and designed multiple SaaS applications over the years, I have seen most startups struggle to choose the right SaaS architecture for their product offering.

In this article, I’ve compiled all my learnings into a cheat sheet to help startup founders who’re looking to build SaaS applications make a pragmatic decision backed by proven facts and data.

How does SaaS Architecture Impact Pricing and Profitability? 

Customers are increasingly choosing the ‘pay as you go’ pricing model, as it offers flexibility compared to a one-time pricing model. To enable ‘pay as you go’ for your customers, you need the right architecture to support it: one that allows your startup to track usage of services and offers customers the flexibility of managing infrastructure as per their requirements.

A poorly designed architecture creates limitations in setting the right pricing strategy for the offerings, thereby impacting the acquisition of new customers. On the other hand, a good architecture not only helps in setting the right pricing model but also accommodates special architecture-design requirements, such as scalability and customizability. To have a clear idea of the pricing model before setting up the SaaS architecture, a startup needs to answer these questions (a toy metering sketch follows the list)-

  • How would your customers pay? 
  • For what services (computation and values) would the customers pay? 
  • How will the usage be measured and invoices be created for the customers? 
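As a toy illustration of metering usage and turning it into an invoice (all event names and rates are hypothetical)-

```python
from collections import defaultdict

RATES = {"api_call": 0.0005, "gb_stored": 0.10, "report": 0.25}  # USD per unit

class UsageMeter:
    def __init__(self) -> None:
        self.usage = defaultdict(lambda: defaultdict(float))

    def record(self, customer: str, event: str, qty: float = 1.0) -> None:
        self.usage[customer][event] += qty

    def invoice(self, customer: str) -> float:
        # Sum metered quantities times their unit rates for the period.
        return sum(qty * RATES[event]
                   for event, qty in self.usage[customer].items())

meter = UsageMeter()
meter.record("acme", "api_call", 12_000)
meter.record("acme", "gb_stored", 50)
print(f"acme owes ${meter.invoice('acme'):.2f}")  # 6.00 + 5.00 = 11.00
```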

In a SaaS setup, costs incurred in managing operations impact the profitability to a large extent. The optimization of operational expenses involved in managing the SaaS model depends on three crucial factors – infrastructure cost, IT administration cost, and licensing cost. 

However, the bigger question is- how do you ensure that these costs are well-optimized and priced correctly? I’ve listed a few examples below to demonstrate this.

Salesforce Online- Salesforce provides a lead management system for enterprise sales and marketing teams. The online version uses the cloud (so none of their customers needs to worry about hardware and IT procurement) and charges customers based on the size of their sales and marketing teams (so they don’t have to pay a high one-time license cost).

SQL Azure- SQL Server is an industry leader in RDBMS, and its self-hosted option requires the customer to pay a high license cost and hire a DBA for backups, geographical replication, and disaster recovery (important for databases). Azure SQL, by contrast, is a cloud-based system accessible online, where you pay only for storage and IOPS, with the rest taken care of by Azure (the cloud provider).

WordPress – Every enterprise needs a content management system, and WordPress has been at the forefront of this. WordPress provides an online platform with white-labelled solutions, customization, and multiple integrations for their customers. WordPress collects customer usage data and charges on the basis of it. 

Why is it Important to Pick the Right SaaS Type?

This might be a common question popping up in your mind. Let me explain with two different examples-

Example 1- Consider an instance where a startup provisions isolated application VMs (Virtual Machines) for all its customers. In the majority of cases, these boxes will remain under-utilized. With customers paying only for utilization, the startup could end up with huge losses.

Example 2- Consider a second instance where all customers of the startup share the database and application servers, and pay only for utilization. This is a fair game for the startup, as all of the hardware and automation is being properly utilized. However, in case of a sudden increase in the utilization of these servers by one of the customers, other customers might face performance issues and unexpected breakdowns.

When a startup starts building a SaaS application, it essentially bears the hardware and automation costs. Hence, it’s crucial to ensure that all of the above-mentioned costs are well optimized by picking the right SaaS architecture based on your offerings. Damage control is still possible in the above-mentioned examples, but you will definitely lose a big share of time and opportunities.

What are Different Architecture Types of SaaS Applications? 

Type of SaaS Architecture

 

Type 4 (Doesn’t require data & runtime isolation) 

This is the most basic type of SaaS application. In this type, you assume all of your customers grow uniformly, and accordingly a customer ID is created. These customer IDs are added to all of the tables/collections, and all customers share the database and application hardware.

Type 3 (Requires data isolation but no runtime isolation) 

This is one step ahead of type 4. In this type, different data stores are put in place for different customers; however, the application is shared by all the customers.

Type 2 (Requires data & runtime isolation on the cloud) 

This type involves separate applications and separate data stores for all customers. In this case, the cost of isolation is typically passed on to the customers. 

Type 1 (Requires data & runtime isolation, but not on the cloud) 

This type is a version of type 2 wherein the customer wants data to be stored on their own network and not on the cloud. Herein, the customer still opts for pay as you go, or a pricing model based on users/features, as per onboarding.

How to Pick the Right SaaS Type for your Startup? 

Depending on the type of industry and nature of data, a customer’s requirement for security and shareability varies. A startup can try to understand the needs of multiple customers, and refer to the flowchart given below to select the right SaaS architecture type- 

On a Final Note 

As the startup grows, it isn’t easy to mold the existing architecture to accommodate demands from the growing user base. Hence, it’s always good to choose the right SaaS architecture at the start, so that you don’t lose out on business because of a rigid architecture.

5 Notions That Could Spoil Your Great Product Idea

You know you have an incredible idea with which you could disrupt the market. But to do that, you first need to transform it into a software product. And you’re possibly wondering what the starting point is, and at what stage you need to take it to the market. You’re worried because you know of people who did not make an impact despite their great ideas. These thoughts are only natural, and every entrepreneur goes through them.

I’ve had the fortune of working with multiple founders at various stages of their product, from bootstrap to scale and even through rounds of Venture Capital Funding. And through the years of building products for entrepreneurs and conducting Ideation Workshops, I’ve come across some typical misconceptions about Product Bootstrapping I’d like to share. These notions could potentially spoil your fantastic product idea.

“I’ll marry two products to create a new one.” 

People tend to use multiple applications to address their different needs. And you feel that your product would give them the best of two or more worlds. It seems quite apparent that people would gravitate towards your unified solution instead of multiple products. This is not a wrong notion; it has worked remarkably well in the space of bill payments and bookings. However, you must not underestimate the products you are trying to marry.

For example, a platform that combines Facebook and TikTok into a third application might sound like a great idea in theory. But users are already accustomed to features that have evolved towards perfection over years, so you would need to set aside that kind of time and money for your application. If you take a step back, you might realize that this is not how your idea first took shape. It all possibly began by assessing a specific user need and providing a solution for it. Simply put, it’s essential to do one thing and do it well, rather than offering a suite of features that individually pack no punch.

“I’ll launch only when my product is highly feature-rich.”

At every stage of product development, it might feel like what has been built is not enough, “We need more power, Scotty!”. Also, the realization that all funds have been exhausted in the process of developing the product and its several features is the worst nightmare for any founder. This is where the power of a Minimum Viable Product (MVP) kicks in. It is crucial to scope your MVP and take it to the market on time.

Here’s a quick cheat-sheet to channel your MVP scoping:

  • Identify the type of users that will use your system – the user personas (keep this to the bare minimum).
  • Assess what each persona’s most critical needs are.
  • Drive the features and user-flows based on the personas and their needs.
  • Decide if it needs to be a Mobile Application or a Responsive Web Application.
  • Get this roughly estimated as a function of UX and Engineering.

Now, here’s the hard part- you’ll realize that this is too much in terms of time and effort. You will have to go back and trim the scope. “Trimming” doesn’t necessarily mean cutting out features; it can be just a matter of simplifying the process. For example, as an admin, you could do a few backend tasks manually at the beginning instead of having a fully automated system in place. Once you have a well-scoped MVP with your core value proposition at the heart of it, you can plan to take it to the market predictably.

“The market is large enough, and I don’t need a differentiator.”

Many a time, I’ve heard very compelling thoughts around the key numbers associated with a product- the user base. The math would start with a pretty vast target market, followed by something like- “even if we just had 1% of that as our market size, we’d be sorted”. These numbers might seem accurate, but users seldom start using a product just because it exists.

As I mentioned earlier, everything begins with a need. A product that has been built to address that need uniquely is the differentiator that will drive customer adoption. This differentiator will also go a long way in adding value to your sales and marketing efforts. Though your target market might be large, your product could end up having the effect of a pebble thrown into the sea. The differentiator is everything.

“I must have AI and ML handle most operations.”

Everyone is excited about Artificial Intelligence and Machine Learning. They are here to stay, no doubt. You know that the roadmap of your product includes automated capabilities that are beyond a simple rule-based engine. Isn’t it best to build that capability from day 1, so that when you hit the market, you arrive in style? Well, there are two scenarios, and you will have to think hard about which bucket you fall into.

The first bucket is a product that can provide the solution to the user’s need through a rule-based engine. However, over time, it would be better for the operations to scale if AI and ML programming is in place. An example of this would be a sophisticated decisioning engine. The second bucket is where your product’s core value proposition is based on AI and ML. For example, a system that uses advanced facial recognition to cross-reference a vast database for identifying crime suspects.

Did you spot the difference? In one case, without AI and ML there is no product; in the other, you can build the MVP without AI and ML and introduce that capability over time. Remember, Machine Learning is experimental and needs a lot of data and time to get to production. You must prepare yourself from a cost and time perspective if you want to include it from the get-go.

“I must pick a VC-Friendly tech-stack.”

It’s essential to ensure that you don’t eventually face a VC with an un-cool technology stack, right? Well, every programming paradigm is meant to be used in a particular way. Each one has its own advantages and disadvantages and, more importantly, a purpose. There is nothing called un-cool tech, only an un-cool notion of looking down upon technologies because they may be a little old. Remember, with age comes maturity. It would be best to choose your tech stack based on the nature of the platform.

For example, you’d need to look at the kind of client-server chatter you’re expecting, the nature of the data (structured or unstructured), the intensity of server-side processing & IO operations, and the sort of activity you expect on the UI, to name a few. A great example of a “Tech-pick for the VC” is Blockchain. Blockchain is an excellent distributed, immutable, and secure network and is advantageous in a host of use-cases. But it can’t be crammed into a product where it does not belong, with claims of replacing the database. I know you’re smiling, but I’ve actually heard that one.

On a Final Note… 

I’d like to conclude with an altered dialogue from the film V for Vendetta. “Beneath the hood of your software product, there is more than just technology and frameworks. Beneath the hood, there is an idea. And ideas are bullet-proof.” I hope this article was helpful in re-addressing and fortifying your core idea. I wish you all the very best in transforming your idea into a successful product.

Does Team Ramp Up Guarantee Increased Feature Velocity?

When startups move from early to the growth stage, their priorities change by a significant degree. Among all others, feature velocity stands out as a priority and plays a crucial role in their pursuit of growth.

The founder of a seed-funded startup that I was working with, which had recently raised its Series A round, asked me to double the engineering team to release new functionality in six months. However, is that the only relevant factor in reducing cycle time? To answer this, I decided to address the often-ignored aspects behind this prevalent ‘misconception’.

Though resource capacity is one of the critical factors driving frequent releases to production, it alone can’t guarantee reduced cycle time. While building products for more than 16 startups, I’ve witnessed the transition from early to growth stage multiple times. From that experience, here are some crucial factors for startup founders to consider-

Pass On Clarity: The Top-Down Essential

One of the key essentials while ramping up teams to increase release velocity is the transfer of clarity from product teams to sprint teams. When a sprint starts with ambiguity, or doesn’t have enough stories to start with, the sprint delivery ratio gets adversely affected. Vaguely-defined stories and mid-sprint inclusion of tasks slow down the pace; as a result, the sprint teams fail to deliver as planned.

A smooth sprint execution is the outcome of notable contributions from both the product and sprint teams. The product team can play its part by preparing and sharing the roadmap with the sprint team, at least for the quarter, so that the team can plan its deliverables accordingly. On the other hand, the sprint team should keep pushing the product team for a product backlog and groom it on a weekly/biweekly basis to ensure focused delivery.

Automate to Release ‘Bug-Free’

“Efficiency is doing things right; effectiveness is doing the right things.” – Peter Drucker

Automation is an example of the latter. The moment feature development picks up speed, production is highly likely to break unless the right processes are in place. If the product isn’t stable enough to absorb new feature development, your team spends more time fixing issues than building new features, and your engineering velocity comes down.

This is where CI/CD (continuous integration and continuous deployment) comes into the picture. Here, exhaustive unit, integration, and automated test coverage ensures that whatever is shipped doesn’t break the system.
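To make this concrete, here is a minimal sketch of such a CI gate, assuming a pytest-based suite under tests/unit and tests/integration and a hypothetical scripts/deploy.sh; none of these names come from the project itself, and a real pipeline would usually be orchestrated by a CI service such as Jenkins or GitHub Actions.

```python
# A minimal CI gate: every stage must pass before anything is shipped.
# The test paths and deploy script are illustrative assumptions.
import subprocess
import sys

def run_stage(name: str, command: list[str]) -> None:
    """Run one pipeline stage and abort the pipeline on the first failure."""
    print(f"--- {name} ---")
    result = subprocess.run(command)
    if result.returncode != 0:
        sys.exit(f"{name} failed; aborting before deployment.")

if __name__ == "__main__":
    run_stage("unit tests", ["pytest", "tests/unit", "-q"])
    run_stage("integration tests", ["pytest", "tests/integration", "-q"])
    # Deployment runs only if every preceding stage passed.
    run_stage("deploy", ["./scripts/deploy.sh", "staging"])
```

The point is the ordering: deployment is unreachable unless every test stage has passed, so a red build can never ship.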

Don’t Just Build More, Or You’ll Break More

Rework is a big productivity killer and can result from various factors: vaguely-defined stories, lack of dev testing, lack of test coverage, and so on. Rework eats up productivity because it consumes the time and effort of QA engineers in testing and regression, developers in debugging, and release managers in re-releasing. Slowing down a bit can help your team deliver faster and add value; effective speed is always more rewarding than raw speed.

The philosophy of ‘first-time-right’ (FTR) helps all team members align to a common goal: delivering robust and stable code the first time itself. It’s always healthier to spend some extra time on the quality of code than to hurry and then get stuck with rework. Some tried-and-tested ways to improve the FTR ratio are regular backlog grooming, refinement of stories, and regular demos to the product manager. Instead of just gathering requirements, the sprint team should focus on elucidating them.

Structure Your Team for Parallel Sprints

When your startup has a small product team, everyone mostly works on one or two features at the same time (generally applicable to a team of 4-6 members). However, as the expectation grows to deliver multiple features in parallel, it’s highly recommended that you form multiple sub-teams with distinct focus areas. In this manner, every sub-team gets to run its own sprint and define its own roadmap.

Compared to one big team, smaller teams born out of a ‘logical separation’ framework are more effective and yield better results. Individual teams for microservices, different product lines, and various components are a few examples of this approach. During the restructuring, it’s essential to include at least one member from the earlier core team in each of these sub-teams to preserve the DNA. Though cross-team coordination for deliveries entails additional overhead, that’s a justified trade-off.

Track Feature Usage Along with Velocity

User experience is the most vital metric for measuring the success of a new feature release. As you start delivering multiple features at speed, user experience often takes a backseat. When your product has limited features, user interaction stays smooth and predictable. But as you keep releasing new features, there is a high chance of users getting overwhelmed and their experience suffering.

To achieve better user adoption, tracking user engagement alongside feature velocity is the best way forward. Exhaustive user research is one proven approach; other significant ones are rolling out new features initially to selected users via feature flags, A/B testing, and tracking the user journey (via Amplitude or similar analytics tools) after every new release.
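As an illustration, here is a minimal sketch of a deterministic, percentage-based feature flag. The feature name, user ID, and rollout percentage are hypothetical; a production system would usually back this with a dedicated flag service or configuration store rather than hard-coded values.

```python
# Deterministic percentage-based rollout: each user hashes into a stable
# bucket, so they see a consistent experience across sessions.
import hashlib

def is_enabled(feature: str, user_id: str, rollout_percent: int) -> bool:
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    return bucket < rollout_percent

# Start with 5% of users, watch engagement metrics, then widen the rollout.
if is_enabled("new-gantt-view", "user-42", rollout_percent=5):
    print("render the new feature")
else:
    print("render the existing experience")
```

The deterministic hash matters: the same user always lands in the same bucket, so widening the rollout from 5% to 20% only adds users and never flips someone back to the old experience mid-evaluation.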

Don’t Lose Your Core Members

This might be a commonly-ignored aspect, yet it stands firm as a crucial one. Small teams have a nimble structure, need few formal processes, and everyone’s voice is heard. When these teams move up to a state where processes are set up and new engineering and functional members are added, healthy management is the only way to avoid chaos.

As your engineering team scales up, a capable product team is essential to keep the engineering team continuously fed with work. Churn becomes inevitable when team members have no significant work, and no startup would ever want to lose its core members. Here, senior management holds the key to maintaining good relationships with people and understanding the team dynamics well.

On a Final Note

The learnings I’ve shared here come from my experience with multiple startups over the years. I hope they prove useful to startup founders, who already have a lot on their plate, so that they don’t end up reinventing the wheel.

How to Scale an Engineering Team in a Startup?

After working closely with startup founders across stages, I have realized that scaling is one aspect they are often concerned about. Leading engineering teams from the front during this journey led me to conclude that scaling an engineering team can itself become a bottleneck to growth.

I want to share my scaling experience through the example of a leading mobile advertising platform where I had to ensure that engineering did not become a roadblock. We scaled the product to handle growth from 1 million to 80 billion impressions per day, with an engineering team that grew from 2 to over 100 members.

Translate business goals to technology deliverables

I was part of quarterly business planning meetings where we used to discuss the quarterly roadmap and customer feedback. During one such meeting, one of the co-founders spoke about an idea to enable a new revenue stream through display ads in non-internet environments.

Though the complete feature details were not frozen, I could see this feature becoming one of the critical elements of the next release. We anticipated, well in advance, the need to hire additional Android and iOS developers to deliver this functionality. Apart from identifying and acquiring new skill sets, it was essential to adapt the core product to handle such cases.

I feel that if you nail down the art of translating business needs to technology deliverables, you will get extra time for execution.

Polyglot engineers

As a precondition to closing a deal with a premium publisher, we had to develop a feature to cap the frequency of advertising. The core product was written in PHP, and we didn’t have a dedicated PHP developer at the time, nor any time to hire one since this was an on-demand request. So, one of the senior Java developers with a fundamental understanding of PHP took it up as a challenge and delivered the feature on time, which helped us win the premium publisher.

You can have the best recruiters on your side, but even they cannot salvage you in such situations.
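For readers unfamiliar with the feature, here is a minimal sketch of what frequency capping does: limit how often one user sees a given ad within a time window. The original implementation was in PHP; this sketch is in Python, with in-memory state and illustrative limits, whereas a real ad server would keep these counters in a shared store.

```python
# Frequency capping: serve an ad to a user at most N times per window.
import time
from collections import defaultdict, deque

class FrequencyCap:
    def __init__(self, max_impressions: int, window_seconds: int):
        self.max_impressions = max_impressions
        self.window = window_seconds
        # (user_id, ad_id) -> timestamps of recent impressions
        self._seen = defaultdict(deque)

    def should_serve(self, user_id: str, ad_id: str) -> bool:
        now = time.time()
        timestamps = self._seen[(user_id, ad_id)]
        # Evict impressions that have aged out of the window.
        while timestamps and now - timestamps[0] > self.window:
            timestamps.popleft()
        if len(timestamps) >= self.max_impressions:
            return False  # cap reached: skip this ad for this user
        timestamps.append(now)
        return True

# Cap each user at 3 impressions of any given ad per hour.
cap = FrequencyCap(max_impressions=3, window_seconds=3600)
print(cap.should_serve("user-1", "ad-99"))  # True until the cap is hit
```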

Challenge team members

Our attrition rate was zero for 3-4 years. Beyond aligning team members to the product vision and mission, keeping them challenged in their work helped us achieve this feat. There was a strong focus on quality and on adding value to the product, which required thinking beyond just the day-to-day deliverables. Staying on top of technology helped us provide innovative solutions to hard problems and kept team members engaged. Using Big Data technologies to solve analytics problems and building a real-time campaign pacing engine turned out to be game-changers for us. Regularly rotating people within the team also improved productivity and accelerated technological growth.

Top-down induction process

Often, organizations follow a bottom-up approach to induction. In our case, however, we followed a top-down approach in which the leads explained business goals and metrics to new team members and stayed closely involved throughout the induction process. Along with a top-down induction process, an evolving team structure is equally important. We started with two team members and gradually scaled up to form sub-teams, such as the engineering team and the customer focus team.

After scaling an engineering team for this startup, I was able to apply the same guidelines to scale multiple teams for other startups successfully.