How Can We Reduce the Risk of Machine Learning Projects?

The overall risk of machine learning projects in a business can be relatively high because they tend to be long and complex. Before embarking on building a machine learning solution, you need to decide what business problem you are trying to solve and how machine learning fits into the solution. A thought experiment helps here: imagine you have a magic black box that provides a perfect prediction of whatever needs to be predicted in order to solve the business problem. Then answer the following questions, which help identify and manage the risks.

Is this the right problem to solve – right now?

This is by far the most important question, and it lies squarely in the product management and business area. To answer it effectively, strong communication must be established between the data science team and the product and business teams. On one hand, it is important to have a process that enables ideas generated in the engineering world to be validated quickly with customers. This avoids situations where a product feature is developed on the premise of “Wouldn’t it be cool if…” and there turns out to be no demand for it. On the other hand, there needs to be an efficient way to prototype and validate solutions for inbound ideas and requests from customers.

If we solved the problem, what would be the value?

The value is the net difference between the benefit of having the feature and the cost of building and maintaining it. While it may be hard to estimate either of these quantities accurately, even a rough idea of the financial or other impact should provide guidance on whether the benefit is worth the risk of investing in the project. For example, when a product feature is built without an estimate of either what it would cost to build and maintain or what revenue impact it is expected to have, the company is exposed to potentially large opportunity and financial risks.
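To make the arithmetic concrete, here is a minimal sketch of a back-of-the-envelope value estimate. It assumes that benefit and cost can each be bracketed by a rough low/high range; the names Estimate and net_value_range and all the numbers are hypothetical, made up for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A rough low/high range for a quantity, in the same currency units."""
    low: float
    high: float


def net_value_range(yearly_benefit: Estimate, build_cost: Estimate,
                    yearly_maintenance: Estimate, years: int) -> Estimate:
    """Return the pessimistic and optimistic net value over `years`."""
    total_cost_low = build_cost.low + yearly_maintenance.low * years
    total_cost_high = build_cost.high + yearly_maintenance.high * years
    # Pessimistic case: lowest benefit against highest cost, and vice versa.
    return Estimate(
        low=yearly_benefit.low * years - total_cost_high,
        high=yearly_benefit.high * years - total_cost_low,
    )


# Hypothetical numbers, not taken from any real project.
value = net_value_range(
    yearly_benefit=Estimate(200_000, 500_000),
    build_cost=Estimate(150_000, 300_000),
    yearly_maintenance=Estimate(50_000, 100_000),
    years=2,
)
```

With these made-up inputs the range runs from -100,000 to 750,000. A range that straddles zero is a signal that the estimates need tightening, for example through customer discovery, before committing to the build.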

Is machine learning the right tool to solve the problem?

Building a production machine learning system is hard and can be expensive. In many cases, an outcome that is good enough to solve the business problem can be achieved with simpler methods that are much easier to implement. During a panel discussion at H2O World 2015, Monica Rogati made this point beautifully: “My favorite data science algorithm is division because you can actually get very far with just division…” Realizing that machine learning is not the goal in itself, and is not necessarily the appropriate tool, can sometimes be disappointing to data scientists. However, the satisfaction of solving a real problem and having a business impact easily outweighs this disappointment.
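As an illustration of how far division alone can go, the sketch below uses a historical rate per segment as the prediction. The scenario (conversion rate by traffic source), the function name rate_by_segment and the data are all hypothetical; the point is only that such a baseline is cheap to build and gives any later model a number to beat.

```python
from collections import defaultdict


def rate_by_segment(rows):
    """Compute conversions / visits per segment.

    `rows` is an iterable of (segment, visits, conversions) tuples.
    """
    visits = defaultdict(int)
    conversions = defaultdict(int)
    for segment, n_visits, n_conversions in rows:
        visits[segment] += n_visits
        conversions[segment] += n_conversions
    # Division is the whole "model": the historical rate is the prediction.
    return {s: conversions[s] / visits[s] for s in visits if visits[s] > 0}


# Hypothetical data, made up for illustration only.
history = [
    ("email", 1000, 50),
    ("email", 500, 25),
    ("search", 2000, 40),
]
rates = rate_by_segment(history)
```

If a baseline like this already solves the business problem well enough, the case for a full machine learning system has to rest on the additional value it would bring.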

What are the constraints?

Any project has its constraints, and machine learning systems are no exception. The starting point is the available people and their skills. If the existing skills do not match the task at hand, the project will depend heavily on the ability to train, hire, or outsource in order to fill the gaps. More on this in my post “Four machine learning skills of a successful AI team.”

Further, if machine learning models have to run in a production environment, the data scientists who build the models need to understand upfront the existing production environment architecture, the technology stack, and the requirements for scale, which typically limit the choice of programming language and algorithms that are acceptable in production. Scalability of machine learning systems is a large topic in itself, and I leave it for another blog post.

Summary

By solving the right problem, understanding the value of the solution, being confident that machine learning is the right tool to solve the problem, and defining the constraints upfront, we drastically reduce the overall risk of machine learning projects. Answering the questions above helps avoid wasted funds, time, and effort, as well as frustration across the organization. Not all questions can be readily answered, and some discovery with customers and proof-of-concept projects may be needed. While this may look like unnecessary extra work, the resulting clarity about the project is well worth the investment.

Copyright (c) 2018-2020 Sergei Izrailev. All opinions are my own.