I hereby solemnly declare the field of Machine Decisioning (MD) to counter our industry’s single-minded fixation on Machine “Learning” (ML). We have been exploiting one trick (the neural net) for too long; it is time to explore the new, the old, the unhyped…

To help you appreciate where I am coming from, let’s examine the system view of a car, where the engine is the (machine “learning”) model and the rest of the car is machine decisioning.

Or look at the four quadrants of sequential decisioning space:

Do you see any notion of a model? No, nada. The model is but a low-level implementation detail in the grand scheme of decision making.

Impact vs Model

Models by themselves cannot deliver impact. What lies between a model and its impact is an activation path, which varies in nature and length. Machine Decisioning’s goal is to shorten and/or automate part or all of that activation path.

This essay was originally written as the nation reflected on what went wrong in the Texas mass shooting. Even if we had a 100% accurate mass shooter identification model, could it have prevented or reduced the loss of life? I’m not confident, for several reasons:

  • The username of the identified mass shooter needs to be enriched. Who is he? Where does he live? Which car does he drive, or may he be driving right now? Privacy laws may prevent social media companies from carrying out the very enrichment process that may save lives.
  • Humans on the activation path may make the wrong decision. Even though one brave young girl called 911 multiple times to report that students were still alive, the situation was still classified as a “barricaded suspect” rather than an “active shooter”. Forty-five minutes of police inactivity ensued from that wrong decision!
  • Communication between 911 and the police decision maker may be broken.
  • Stakeholders are siloed: social media companies, police, schools…
  • All decisions and actions need to be carried out in real time. Every second counts!

Models Beyond Machine “Learning”

Beyond the Machine “Learning” wonderland, there are other wonderful models whose activation paths are generally shorter than those of machine “learning” models.

Inference vs Prediction

Sometimes we build models to understand the why beneath the already seen (inference). Sometimes we build models to anticipate the how of the unseen (prediction).

Prediction vs Forecasting

Considering the relationship between the seen and the unseen, prediction is spatial interpolation while forecasting is temporal extrapolation. Prediction normally applies to individual observations, while forecasting normally applies to population statistics (sales, GDP…).

“Learning”

If you wonder why I put quotation marks around “learning”: I honestly don’t think machine “learning” models can learn. You may call the parameter updates during model training “learning”, but the moment training is done, those trained parameters are fixed and the trained model loses the capability to “learn”. Hence “Machine Learned Models” is probably a better name. Without proper monitoring and tune-ups, the only thing a trained model does in production is, well, drift.

Model vs Method

Machine decisioning methods are ways to make and effectuate decisions. While models are functional units mapping inputs to outputs, methods are computational procedures or human-in-the-loop processes. As such, models are typically used as components of machine decisioning methods. As an example, a ridge regression model may be used to estimate the reward for the arms in a multi-armed bandit.
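To make the model-inside-a-method idea concrete, here is a minimal sketch, not a production implementation. It assumes each arm keeps its own ridge regression over a small context vector, and the method simply picks the arm with the highest estimated reward. The names `RidgeArm`, `choose_arm` and `solve` are hypothetical, and the exploration step a real bandit needs is deliberately left out.

```python
def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


class RidgeArm:
    """The model: ridge regression from context features to reward, for one arm."""

    def __init__(self, n_features, ridge=1.0):
        # A = ridge * I + sum of x x^T, b = sum of reward * x
        self.A = [[ridge if i == j else 0.0 for j in range(n_features)]
                  for i in range(n_features)]
        self.b = [0.0] * n_features

    def update(self, x, reward):
        for i, xi in enumerate(x):
            self.b[i] += reward * xi
            for j, xj in enumerate(x):
                self.A[i][j] += xi * xj

    def estimate(self, x):
        w = solve(self.A, self.b)  # ridge solution w = A^-1 b
        return sum(wi * xi for wi, xi in zip(w, x))


def choose_arm(arms, x):
    """The method: pick the arm whose model estimates the highest reward."""
    return max(arms, key=lambda name: arms[name].estimate(x))
```

Note how the ridge model could be swapped for any other reward estimator without touching `choose_arm`: the model is a component, the method is the loop around it.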

Looking at schematic diagrams of a few machine decisioning methods, one common theme jumps out: they all have loops! The loops either optimize the overall reward across sequential decision steps or continuously improve (learn) the model over time.

High vs Low Level

Machine decisioning methods operate on a higher level than machine learned models. While machine learned models can be swapped out for better or more efficient models, machine decisioning methods stay relatively stable. I hope you agree with me now that true learning happens in the machine decisioning processes.

ML Engineering vs ML Modeling

AI/ML models create wonders and their impact is only growing. But so is the hubris in the AI/ML modeling community, especially toward the people who “support” them. I’d like to argue that the engineering (ML Engineering) is as important as, if not more important than, the science (ML Modeling).

ML Engineering is more than the last mile

The relationship between ML Modeling and ML Engineering is not that of chicken and egg (50-50) or last mile (90-10?); rather, it’s the engine vs the rest of the system. I leave it to you to judge which contributes more to generating impact.

ML Engineering is more important than ML Modeling

If you’re still not convinced about the importance of ML engineering, hear me out:

  • A car without an engine is still useful, but an engine without a car is useless. After all, push carts were in use before the era of the automobile and are still used widely today. The ingenious invention of the wheel provides a more efficient way of moving than walking, and the platform provides more capacity than carrying by hand.

    A multi-armed bandit (MAB) recommender system that relies on no fancy “wide and deep learning” model but on a simple heuristic of “90% of the time recommend the best performing arm so far and 10% of the time recommend a random arm” is still very useful: it enables the digitization of channels, it allows the automation of item distribution, and it offers a varied experience by alternating between “exploitation” and “exploration”.
  • ML Modeling could be automated away by ML Engineering. With automatically collected ground truth and model experimentation, manual model training would not be necessary in some use cases. When that day comes, we will not be talking about “models” but rather about “Model Factories”, which automatically churn out thousands of highly localized and personalized models for each combination of customer, channel, merchandise and geographical region. With an active learning process in place, we don’t need an initially well-trained model; rather, we start with a random monkey model and let active learning do its magic and continuously improve it… Can’t wait!
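The “90% best arm, 10% random arm” heuristic from the first point above is usually called epsilon-greedy, and it fits in a few lines. The sketch below is illustrative only; the class name `EpsilonGreedyRecommender` and its methods are my own hypothetical names, and the reward is assumed to be a single number such as a click (1) or no click (0).

```python
import random


class EpsilonGreedyRecommender:
    """Recommend the best arm so far most of the time, a random arm otherwise."""

    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon  # share of recommendations spent exploring
        self.rng = random.Random(seed)
        self.pulls = {arm: 0 for arm in self.arms}
        self.mean_reward = {arm: 0.0 for arm in self.arms}

    def recommend(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)  # exploration
        # exploitation: the arm with the highest observed mean reward
        return max(self.arms, key=lambda arm: self.mean_reward[arm])

    def update(self, arm, reward):
        # incremental running mean, so no reward history needs to be stored
        self.pulls[arm] += 1
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / self.pulls[arm]
```

In use, the system calls `recommend()`, shows the item, observes the outcome and feeds it back through `update()`. That feedback loop, not any model, is where the improvement over time comes from.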

ML Modeling is the new hardware

And ML Engineering is the software. It’s software that is eating the world, not hardware. There must have been days in the past when hardware was the cutting edge, and I feel that the deep learning folks nowadays are turning their trade into hardware:

If I remove the caption of the top diagram, can you tell whether they are logic gates or the LSTM and GRU units widely used in recurrent neural networks (RNNs)?

For those of you who are pondering your career path, please keep the parallels below in mind:

Also keep in mind what happened to hardware such as network routers (answer: software-implemented routers). Since deep learning is already a software-implemented brain, I can’t imagine what software will eventually replace it (software-implemented “hardware”). Maybe symbolic AI (with deep learning being computational AI)?

Summary

By coining the term “machine decisioning”, which is but a fancier name for “machine learning engineering”, I hope we as an industry look beyond machine learning models by:

  • Looking around, into other areas such as time series analysis in forecasting and linear/mixed-integer optimization in operations research.
  • Looking up, into higher-level methods that adapt to new observations and continuously learn.
  • Looking back, at the body of meta-heuristics we tried before: genetic algorithms, simulated annealing, GRASP… The new tricks we learned today (embeddings, attention…) may rejuvenate the old heuristics.
  • Not looking down on the “plumbing” (a.k.a. engineering) around the models, which activates the models’ impact. Those pipelines that automatically backfill missing data. That business rule system which enables humans to filter down the items to be ranked by the recommender. The event-driven architecture that reacts and responds in real time… They are as vital as the machine learned model!

All in all, do focus on the system, rather than a model!

References:

  1. StackExchange Q&A on difference between inference and prediction
  2. Active learning literature survey, Burr Settles
  3. Online learning, a comprehensive survey