GAINS BLOG

Blog: Successful ML Projects Require Narrowly Targeted Data Translations of High-Level Business Objectives

Or Successful ML Projects are Narrowly Targeted

By Dr. Chandler Johnson


From the board room to the lab, it’s common to hear the question, ‘Why aren’t more machine learning (ML) projects successful?’ As companies invest more in ML, it’s a question that should be asked often. As a data scientist, I think a big part of the answer is that people don’t scope projects narrowly enough: a clear business objective is not necessarily a sufficiently refined data project. Let me explain.

Many businesses use ML to try to solve colossal C-suite problems. The issue with this practice is that these are business objectives rather than data problems. ML is good at answering little questions like ‘What’s the lead time?’ but can’t answer ‘How can you squeeze two weeks of inventory out of the business?’ Achieving results from ML requires translating a business problem into a data representation. Only then can ML succeed and deliver a meaningful answer.

Why Most Machine Learning Projects Fail

One reason ML projects often fail is that somebody, typically a person in the C-suite, comes up with a strategic question, while the data representation of that problem is much more granular. If you want machine learning or AI models to deliver value, you must ask extremely targeted, simple questions where the outcome is observed often, where each row/observation probably doesn’t matter much on its own, but where the outcome across many rows matters a lot. That kind of data lends itself to ML projects in supply chain that are likely to succeed. Some of the lowest-hanging fruit are projects related to parameterization or data quality.

To achieve a better likelihood of a successful ML project, there are two primary steps:

Step #1: Translate Business Objectives into Data Questions

If your strategic goal is reducing inventory, you must consider the data representation of your company’s inventory problems. To use machine learning to minimize inventory or increase service levels, you first need to identify the related data problem. The first issue that must be addressed is data quality. If you look at 17 million demand transactions, some of the data are likely wrong. We’ll first want to address the bad data, and ML can help there. We can use ML to flag bad rows, estimate missing data, or perform other narrow tasks.
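As a rough illustration of what “flagging bad rows” can look like, here is a minimal sketch. It uses a simple robust outlier rule (a modified z-score based on the median absolute deviation) rather than a trained ML model, and the sample quantities and the 3.5 threshold are hypothetical values chosen for illustration only.

```python
from statistics import median

def flag_suspect_rows(quantities, threshold=3.5):
    """Return indices of rows whose quantity is missing, negative, or a robust outlier.

    `quantities` is a hypothetical column of demand quantities; None marks a missing value.
    """
    observed = [q for q in quantities if q is not None]
    med = median(observed)
    # Median absolute deviation: a spread measure that a few extreme rows cannot distort.
    mad = median(abs(q - med) for q in observed)

    flagged = []
    for i, q in enumerate(quantities):
        if q is None or q < 0:
            flagged.append(i)  # missing or impossible value
            continue
        # Modified z-score; 0.6745 makes the MAD comparable to a standard deviation.
        if mad > 0 and 0.6745 * abs(q - med) / mad > threshold:
            flagged.append(i)  # far from the typical quantity
    return flagged

# Hypothetical demand quantities for one item.
demand = [10, 12, 11, None, 9, 10000, 13, -4, 11]
print(flag_suspect_rows(demand))  # prints [3, 5, 7]
```

A rule like this is also a useful status quo: a learned data-quality model should flag bad rows better than it does.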

Then, for your inventory reduction objective, you should proceed to a question about warehouse inventories. Suppose you have six warehouses and a dataset where each warehouse constitutes one row. If you ask that dataset, “How do I reduce inventory?”, then even if the question were well-formed for ML (it’s not), you’ve only got six rows: you need more data. ML models are going to fail.

On the other hand, you could achieve the inventory-reducing business objective by asking a much narrower question: “what’s the lead time for my next order?” For that predictive task, you may have millions of observed training rows. And if you can better predict the lead time, you can better control safety stock and reduce inventory. Voila: the business objective is achieved by asking ML a narrow question with a lot of observed training data. Maybe you’ve squeezed two or three days out of inventory. Then you’ve accomplished the business objective of reducing inventory using data-driven prediction.
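To make the link between lead-time prediction and inventory concrete, here is a minimal sketch using one common textbook safety stock formula, z * sqrt(L * sd_demand^2 + mean_demand^2 * sd_lead_time^2). This is an assumption for illustration, not necessarily the calculation any particular planning system uses, and all numbers below are hypothetical.

```python
from math import sqrt

def safety_stock(z, avg_daily_demand, sd_daily_demand,
                 avg_lead_time_days, sd_lead_time_days):
    """Textbook safety stock with uncertainty in both demand and lead time."""
    demand_variance = avg_lead_time_days * sd_daily_demand ** 2
    lead_time_variance = avg_daily_demand ** 2 * sd_lead_time_days ** 2
    return z * sqrt(demand_variance + lead_time_variance)

# Hypothetical item: 100 units/day on average, 30-day average lead time.
# Assume better lead-time predictions halve lead-time uncertainty (10 -> 5 days).
before = safety_stock(z=1.65, avg_daily_demand=100, sd_daily_demand=20,
                      avg_lead_time_days=30, sd_lead_time_days=10)
after = safety_stock(z=1.65, avg_daily_demand=100, sd_daily_demand=20,
                     avg_lead_time_days=30, sd_lead_time_days=5)

days_freed = (before - after) / 100  # expressed in days of average demand
print(f"Safety stock before: {before:.0f} units, after: {after:.0f} units")
print(f"Inventory freed: about {days_freed:.1f} days of demand")
```

Under these assumptions, the narrow prediction task shows up directly in the inventory number the C-suite cares about.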

Unfortunately, there is a shortage of professionals with the necessary skills for this step. There are very skilled people in C-suites and some gifted data analysts. However, redefining a C-suite business objective as a data representation, a matrix where the rows and the columns are defined, is something that not many people can do. That’s why most of these ML projects fail. Some ML projects fail because of data quality or other factors, but many fail because there’s never a translation from the business objective to a data representation.

Step #2: Aim for Improvement, not Perfection

For the second step, success requires using the ML model’s insights to add value. Assessing the success of an ML project is also an exercise in translation but in reverse. The data problem’s results must feed back into resolving the overarching business objective. In other words, the ML predictions must inform a recurring process that generates the observations underlying C-Suite performance metrics. If ML is going to influence inventory by predicting lead time, ML’s lead-time predictions need to be fed back into the ordering system.

To measure the performance of the ML model, you have to go back to the business objective level and translate the findings with the business questions in mind. In other words, you must know, ‘What does success look like?’ You need to take the data representation and develop high-level performance metrics that everybody agrees on.

However, a word of caution: measure the model’s success not against perfection but against what it competes with. For example, if you looked at lead time results and saw that the ML estimates fell short of the actual values, you may be tempted to call the exercise unsuccessful. But that’s not the correct yardstick for judging success. You want to look for improvement with ML, not perfection. To assess success, don’t ask if a model is perfect; ask if it beats the status quo process on relevant KPIs.

If lead times were predicted to be 100 days based on your historical analyses, 70 days according to the ML model, and the actual was 40, the ML prediction might seem pretty bad. But it was much better than the alternative! It may not be good, but it was better (and will improve over time). It’s essential to measure ML outcomes on improvement, not perfection; if it’s better than the existing performance, it’s likely worthwhile.
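The arithmetic behind that example can be sketched in a few lines. The function name is hypothetical, and absolute error is just one reasonable choice of metric.

```python
def relative_improvement(actual, baseline_prediction, ml_prediction):
    """Fraction by which the ML prediction reduces absolute error versus the baseline."""
    baseline_error = abs(baseline_prediction - actual)
    ml_error = abs(ml_prediction - actual)
    return 1 - ml_error / baseline_error

actual_days = 40
historical_estimate = 100   # status quo prediction
ml_estimate = 70            # ML model prediction

print(abs(historical_estimate - actual_days))  # prints 60
print(abs(ml_estimate - actual_days))          # prints 30
print(relative_improvement(actual_days, historical_estimate, ml_estimate))  # prints 0.5
```

The ML estimate is still 30 days off, but it halves the error of the status quo estimate.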

Defining an ML Question, A Day-in-the-Life Example

When starting an ML project, I encourage you to translate the business objectives into data problems. Write down one row and tell me what that one row looks like: what are the values in each field, and what are we trying to “do” for that row? Next, tell me the current process against which the ML process will compete. No ML model is a greenfield exercise: it’s competing against something, even if that something is just “no action” or “treat everything equally.”
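As a sketch of what “writing down one row” might look like for the lead-time question, consider the following. Every field name and value here is a hypothetical example, not a real schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class PurchaseOrderRow:
    """One observation: a single purchase order line (hypothetical fields)."""
    supplier_id: str
    item_id: str
    order_date: date
    order_quantity: int
    # Status quo estimate the model competes against (e.g. the supplier's quoted lead time).
    quoted_lead_time_days: int
    # What we are trying to predict for this row; unknown until the goods arrive.
    actual_lead_time_days: Optional[int] = None

example = PurchaseOrderRow(
    supplier_id="S-001",
    item_id="ITEM-42",
    order_date=date(2023, 1, 15),
    order_quantity=500,
    quoted_lead_time_days=45,
)
print(example)
```

If you cannot fill in a row like this, including the target field and the competing status quo estimate, the question is probably not yet a data problem.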

For example, if you’re in marketing and trying to predict churn, you want an ML model to predict which accounts will churn and which won’t. The most common non-ML process is either targeting everyone or targeting no one. These baseline processes are equivalent to models that “predict everyone will churn” or “predict no one will churn,” respectively. Explicitly identify the status quo for comparison, because ML can’t compete against perfection; it can only compete against the current process.
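The two baseline “models” can be written down explicitly and scored with the same metric as the ML model. The labels and model predictions below are made-up toy data, and accuracy is used only for simplicity; in practice the comparison should use whichever KPI the business has agreed on.

```python
def accuracy(y_true, y_pred):
    """Share of accounts where the prediction matches what actually happened."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Toy data: 1 = account churned, 0 = account stayed (hypothetical).
churned = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

# Status quo processes expressed as models.
predict_everyone = [1] * len(churned)   # "target everyone"
predict_no_one = [0] * len(churned)     # "target no one"

# Hypothetical ML model output: catches both churners, with one false alarm.
model_predictions = [1, 0, 0, 0, 1, 0, 1, 0, 0, 0]

print(accuracy(churned, predict_everyone))   # prints 0.2
print(accuracy(churned, predict_no_one))     # prints 0.8
print(accuracy(churned, model_predictions))  # prints 0.9
```

The model is not perfect, but the relevant question is whether it beats both baselines.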

After running the model and translating the results back into strategic goals, the discussion with the C-Suite and other leaders changes from “Why is the model only 99.5% accurate, and why did ‘Customer A’ leave?” to how the model reduced overall churn by 20 percent compared to the status quo. Not the perfect result, but a better result.

Next Steps

If you are considering using ML to solve some of your supply chain challenges, I hope you’ll consider these steps and start a journey that brings ML and AI insights into your decision-making. At GAINS, I lead a team of data scientists to translate business objectives into data representations and build models to address these business objectives via their data representations. It’s our passion, and GAINS customers benefit from the ML insights we deliver. To learn more, check out our GAINS Labs or contact us.


Chandler Johnson – GAINS Chief Scientific Officer

Affiliated with GAINS for two decades, Chandler joined the company in 2020 to guide the application of visionary methods like AI and machine learning to improve GAINS customers’ supply chain performance. Chandler is also an Associate Professor in BI Norwegian Business School’s Department of Strategy and Entrepreneurship. Chandler teaches machine learning at the executive, EMBA, and MSc levels.

Chandler holds a Sociology Ph.D. from Stanford, a Statistics MS from Stanford, a Social Science MA from the University of Chicago, and an Economics BA from the University of Washington. He has extensive operations consulting experience and began his career managing Beijing operations for a Shanghai startup.
