PhD and Data Science: Challenges and Opportunities with Startups

This article attempts to summarize the expectation mismatches that PhD graduates may experience while working in the software industry, specifically in startups with a Machine Learning and Artificial Intelligence (AI) focus. Some of the opinions expressed here are based on personal experience and on recent attempts to hire engineers with a PhD degree.

Startups Embrace the Artificial Intelligence Revolution

In recent times, startups have been churning out complex products involving techniques that have mathematical underpinnings. The industry loosely refers to these techniques as AI. Purists, however, would argue that the prevalent techniques in natural language processing (NLP), computer vision, speech processing and image processing, which depend on statistical and machine learning algorithms, are not really AI.

A decade ago, a PhD with an AI background in optimization, machine learning, computer vision, NLP, or pattern recognition was hardly considered for employment in startups. Over the last few years, however, there has been a remarkable change in the way these PhDs are perceived, given the huge demand for Data Scientists in most spheres. For the same reason, a PhD in computer science with a sprinkling of mathematics in their course work or thesis seems to have an edge for the role of a Data Scientist today.

Hiring Demand for PhDs Soars

Given that a lot of startups are working AI into their vision, it is natural that the engineering VPs driving these products wish to hire people who understand these data-driven, machine learning or AI algorithms in the context of Data (images, speech, telemetry, text etc.) and Domain (FinTech, Wireless Networking, BioMedical imaging etc.). Most of the products do not lend themselves to black-box machine learning. The ideas are so unique that one needs to dissect known machine learning algorithms, modify them, create newer techniques, and combine multiple algorithms in order to deliver the final result. As these algorithms have mathematical foundations, people with a PhD degree in mathematical subjects are in huge demand. They are expected not only to understand the inner workings of these algorithms, but also to create new algorithms in the context of the business problem, the data platform and the available data. Further, exploratory work is expected to be completed within time limits, with the end result being software that can go to production.

Despite the requirement for PhDs with mathematical backgrounds, the number hired in startups is still significantly lower, as full-stack developers are preferred over those with a PhD degree. One of the reasons could be the skepticism in the minds of engineering VPs with regard to a PhD’s capability to deliver “productizable real-world solutions”. With so much open-source software available, decision makers pressed for time think that integrating available algorithms into the product is a practical option, and full-stack engineers are adept at it.

However, if open-source software could solve every problem out of the box, every startup would end up building a similar product, thereby cannibalizing one another. Thankfully, many startups have a clear vision to be market leaders and are solving problems where critical thinking and grass-roots innovation are required in order to build the product. Therein lies a huge opportunity for graduates with a PhD, provided they are hands-on.

What does being hands-on entail?

It is the ability to convert theory into working code that powers a product feature. Powering a product feature, however, involves many steps and engineering practices. In startups, the role of a PhD is not limited to finding the solution to a problem. The PhD is expected to (A) build the solution in a working form, (B) adapt the solution to the platform and architecture, (C) work with other engineers to get the solution into production, (D) follow programming best practices, (E) determine the solution approach within a short time span, and (F) find a solution that can be incrementally improved upon.

Note to fresh Graduates

Many fresh graduates, on completing their PhD, believe that their thesis work is what the industry is currently looking for. I believed the same many years ago. However, barring a few cases, the PhD thesis has no direct relevance to the immediate needs of the industry. What perhaps is relevant is the training that one undergoes during the PhD programme. One needs to look at the 5+ years spent on a PhD thesis as a journey involving (a) learning and building domain knowledge, (b) exploration, (c) taking risks, and (d) making mistakes and changing course when one hits a roadblock. Organizations frequently look for experiences that demonstrate the following:

  • ability to identify the heart of a complex problem
  • ability to dissect a problem into smaller parts
  • ability to practically implement a solution for parts that stitch themselves into the whole
  • ability to improvise on a solution
  • ability to apply knowledge from one domain to a different domain
  • ability to build a minimum viable solution as quickly as possible
  • ability to re-skill oneself

PhD hires are expected to display aptitude for incremental product feature rollout pretty much like the product development engineers. Designing a solution for incremental rollout with improving quality and accuracy is as much a skill as it is an attitude. One needs to develop the skill to dissect a mathematical approach into parts such that an approximate solution can be built within the shortest time.

Summary

PhDs have to choose between two worlds. One is an engineering world requiring product development skills and an execution mindset. The other is an academic world that provides research opportunities and a career in teaching.

The expectations in these two worlds are very different. Those wishing to enter the engineering world of startups need to be aware of the expectations in a product development setting. PhDs who want to work as individual contributors have an excellent opportunity for a long career in startups, even if it means moving between startups. They also have an opportunity to grow with the startup and move into leadership/execution roles if interested.

However, caution is prudent. Not all startups can be successful, but working in startups leads to the development of varied skills that other startups prefer in one’s resume!

Is building a startup like making a movie?

I work at Talentica, converting my startup customers’ ideas into products. I am a Senior Development Manager and have the good fortune of working with highly functional teams and amazing clients to create these products. I also love films, and a few years back a group of friends and I decided to try our hand at video production. We agreed as a team that we would work on every element of the process – storyboarding, scripting, directing, filming, editing, mixing, marketing … the works!

Like any two processes that lead to creating something, Product and Video creation have similarities on the surface. Both start with an initial idea, which is then grown into a full-fledged concept; then come planning and execution. Through the journey, feedback from several people is continuously sought and factored back into the execution process. Once the end-product is near completion, marketing activities begin. Eventually, the Product or Video is presented to the end consumers.

I, for one, am a filmmaking noob, though I do have experience with music production; thankfully, my friends were more adept. We worked on a whole bunch of short videos. Eventually, we saw success with a video we called “Indian Nod: Explained”. Through the course of creating videos, we learned that several factors lead to success. Among them, one emerged as extremely crucial – a unified vision.

Let’s take a glance at a famous scene from Tarantino’s movie Inglourious Basterds – the one where a group of undercover intelligence agents ends up playing a drinking card game. For those who haven’t watched the film, the situation is this – the scene is set in a restaurant where the undercover agents are plotting to assassinate Hitler. Because of their shaky German accents, they attract the attention of a Nazi German sergeant who joins the group and subtly coerces them into playing the card game. What is fantastic about this sequence is that it is 15 minutes long and purely dialogue driven. But the tension it arouses puts you on the edge of your seat, biting whatever is left after a quarter hour of nail-biting. The viewers know in their bones that it is not going to end well.

Like any film, this scene had a solid team behind it, starting with the director and writers, the actors, and the on-set crew. Then the footage was worked upon by multiple groups – editing, sound, production, etc. Now, many of these tasks are highly technical. They need a sound understanding of the technology and tools. A film producer can put together the best team for his movie, but the critical factor that brings about success is that each and every person aligns with the final outcome – that ‘tension’ the scene needs to create. If the end-result is understood only by the writers and directors, it will result in failure. I’m sure you’ve come across this – “…the concept was nice, but somehow the movie didn’t come together”. Across all functions of the movie-making process, every player is vital and needs to be wired to that outcome.

This is no different from what leads to successful products. The founders alone can’t be the ones with the vision for their product. The end objective should be sharp and clear in the minds of everyone working to create the product, business and technical folks alike. The vision must cut through the various layers – founders, product owners, UX, developers, QA, dev-ops, sales, marketing, et al. A sound process that enables this is a vital catalyst that ensures a great concept can come together to create a world-class product.

How to prepare for GDPR Compliance

What is GDPR?

The General Data Protection Regulation (GDPR) requires businesses to protect the personal data of citizens of European Union (EU) countries. Organizations that collect and/or process data from the EU region must comply with the regulation by May 25, 2018.

What data are we talking about?

GDPR is here to protect the privacy of Personally Identifiable Information (PII) such as:

  • Name and address of a user
  • Web information like location, IP address, cookie data
  • Biometrics
  • Political opinions
  • Device IDs like IDFA, AAID

Who is affected?

All organizations that store or process personal information about EU citizens, no matter where on the globe they are and even if they have no business presence within the EU, must comply with GDPR.

What if an organization is not GDPR compliant?

If an organization is not GDPR compliant by May 25, 2018, it can face a penalty of up to €20 million or 4% of its global annual turnover, whichever is higher.
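
The "whichever is higher" rule is simple arithmetic. Here is a minimal sketch in Python; the function name `max_gdpr_fine_eur` is made up for illustration, and the result is only the upper bound quoted above, not a prediction of any actual fine.

```python
def max_gdpr_fine_eur(global_annual_turnover_eur: float) -> float:
    """Upper bound of the penalty: EUR 20 million or 4% of turnover, whichever is higher."""
    fixed_cap_eur = 20_000_000
    # 4% of global annual turnover, written as *4/100 to avoid 0.04 rounding noise
    turnover_cap_eur = global_annual_turnover_eur * 4 / 100
    return max(fixed_cap_eur, turnover_cap_eur)


if __name__ == "__main__":
    for turnover in (100_000_000, 1_000_000_000):
        print(turnover, max_gdpr_fine_eur(turnover))
```

With a turnover of €100 million, 4% is only €4 million, so the €20 million figure is the ceiling. At €1 billion in turnover, the 4% figure (€40 million) takes over.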

How do you prepare for GDPR?

The following checklist can help you comply with GDPR:

  • Identify whether your customer’s product deals with EU data.
    NOTE: The common approach that most businesses across industries are taking is to proactively reach out to their entire user base and ask for the users’ explicit consent. This is the easiest and most commonly recommended route. But if, for some reason, this is directly or indirectly detrimental to your business, identify the data you have instead.
  • If EU data is involved, identify the parameters involved, for example device IDs, email IDs and IP addresses.
  • Assess whether the data is really required for the business. If it is not, immediately delete all such data and update your product to stop receiving such information.
  • If any data is essential, find a way to replace it. For example, replace essential PII with unique tokens, or encrypt the data, such that the person cannot be identified from the token or the encrypted value alone.
  • If you have already been storing PII, apply the same replacement to the existing data retrospectively.
  • Ensure GDPR compliance of all the entities involved in your business workflow.
  • In case of data breach, notify your data regulators within 72 hours of the breach.
  • Have a provision in your product to capture user preferences easily. This lets users decide whether they want to know what information your product captures and how it is used, and whether they want your product to delete or restrict processing of their information.
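
The token-replacement step in the checklist can be sketched as follows. This is one possible approach, not necessarily what any given product uses: it assumes a keyed hash (HMAC-SHA256 from Python's standard library) as the token. The names `pseudonymize` and `scrub_record` and the sample field names are hypothetical. The token can only be recomputed by someone holding the secret key, so the key must be stored separately from the data.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a stable token for a personal identifier (assumed scheme: HMAC-SHA256)."""
    # Normalize so that " Alice@Example.com" and "alice@example.com" map to the same token
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


def scrub_record(record: dict, pii_fields: set, secret_key: bytes) -> dict:
    """Copy a record, replacing the listed PII fields with tokens and leaving the rest as-is."""
    return {
        field: pseudonymize(str(value), secret_key) if field in pii_fields else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    key = b"replace-with-a-key-from-your-secret-store"  # placeholder, not a real key
    user = {"email": "alice@example.com", "device_id": "ABC-123", "plan": "pro"}
    print(scrub_record(user, {"email", "device_id"}, key))
```

Because the same input always yields the same token, records can still be joined on the token after the raw identifiers are gone. Whether a scheme like this is sufficient for a particular product is a legal question that the sketch does not answer.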

What has Talentica done for GDPR compliance?

We understand that our customers are from both B2B and B2C scenarios. It is important for us to create tools or techniques for our customers to achieve GDPR compliance. We have been working to ensure all our customers are GDPR compliant.

Production-deployed customers:

For customers whose products are already in production, we perform the following:

  • Record Consent: Before storing any personal user data, we seek the user’s consent via consent forms and APIs.
  • Compliance Audit: We identify whether any users are in the EU region and whether they have provided consent for data sharing.
  • Consent Record Logs: We store an audit trail of when, where and how the consent was given. The logs include the link, text and a screenshot of the consent form.
  • Data Pseudonymization: We replace key personal identifiers such as email, name and contact information with unique pseudonymous tokens so that there is no direct association with any personal information.
  • Data Encryption: As an alternative to data pseudonymization, we encrypt all the personal information. Encryption also helps in unfortunate situations like a data breach or hack, because the encrypted data cannot be used to re-identify a person without the key.
  • Data Export: Our customers’ users can request an export of all data related to them. We notify customers about such requests and extract the required information securely.
  • Data Deletion: If any of our customers’ users ask for their data to be forgotten, we purge the related data. We also remove any pseudonymization mapping for that user.
    NOTE: GDPR requires that you respond to such requests without undue delay and within one month of receipt; this can be extended by two further months for complex or numerous requests.
  • Third-party Compliance: We ensure that our customers’ third-party integration providers are GDPR compliant and that they send us their users’ consent forms. If not, we urge them to comply with GDPR. If nothing works, we disassociate ourselves from such providers.
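
To make the consent audit trail from the list above more concrete, here is a minimal sketch of what one log entry might look like. It is an illustration only: the `ConsentRecord` fields, the `record_consent` helper and the JSON-lines layout are assumptions, not a description of Talentica's actual tooling.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ConsentRecord:
    user_token: str       # pseudonymous user token, not the raw email or name
    given_at: str         # when: ISO 8601 timestamp in UTC
    form_url: str         # where: link to the consent form that was shown
    form_text: str        # how: the exact wording the user agreed to
    screenshot_path: str  # stored screenshot of the consent form


def record_consent(user_token: str, form_url: str, form_text: str,
                   screenshot_path: str, now: Optional[datetime] = None) -> str:
    """Build one audit-trail line (JSON) to append to a consent log."""
    now = now or datetime.now(timezone.utc)
    record = ConsentRecord(user_token, now.isoformat(), form_url, form_text, screenshot_path)
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    print(record_consent("tok_123", "https://example.com/consent",
                         "I agree to the processing of my data.", "consent/tok_123.png"))
```

Keeping each entry as a self-contained line means the log can be appended to without rewriting earlier entries, which suits an audit trail.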

Infrastructure management:

For customers whose infrastructure is maintained by us, we perform the following:

  • Network Isolation: We isolate the infrastructure for EU users on the basis of the GDPR guidelines and set up the servers in the EU region itself.

Conclusion

GDPR is unavoidable for anyone working with data about EU citizens. Taking care of this personal data is a responsibility that we share with our customers. Many of the tools described here are also good practices that we should all be following irrespective of GDPR. By taking these measures, we can all ensure that our customers’ data privacy is well protected.

How to communicate with an OPD Partner?

Having spent 14+ years working exclusively with startups and having been part of the product development life cycle for over 30 startup products, I have observed reluctance among founders to outsource the core engineering of their product. When I tried to dig out the reasons behind this reluctance, most of the feedback pointed towards communication overhead, the pace of development and deviation from the expected outcome. But when I analysed further, I realised that these were only consequences of a communication gap, which is the primary concern.

Best Ways to Select an Outsourced Product Development Partner for a Startup

Having spent a considerable time on the other side of the table, and being evaluated for the role of an outsourced product development partner by 50+ startup founders, I have learned some of the clever ways to hire the best suited Outsourced Product Development Team for a startup.

This blog post is an outcome of all the experience I have gained from my decades of working with some of the best minds in the industry. So, here are the Best Ways to Select an Outsourced Product Development Partner:

Top 4 Questions To Answer Before Product Scaling!

On a recent trip to Europe, a prospect meeting reinforced a notion that I held to be important anyway! While most founders are quite confident in their product development capabilities, when it comes to scaling many of them are fundamentally unclear over how to go about it.

I have come across many entrepreneurs who believe that scale is something they need to factor in at the very beginning. Be it their MVP or a major milestone release, I see most of them scurrying over the product scaling quandary from day 1. Unfortunately, this leads most of them to fail.

Top 5 Reasons Why Startups Fail

I’m no VC, but I’ve been running a company that helps startups build products for the last 15 years. We’ve worked with 100+ startups during this time – some exited through profitable acquisitions while some shut shop. But in the process, they’ve given me a few interesting insights. Some of these should be valuable to entrepreneurs starting off with their own new ventures.