Machine Learning Ubiquity and Our Surprisingly Modelable Universe
Mike Procopio, Senior Staff Software Engineer, Google
The year is 2020. AI is all the rage and data is the new oil. Recent machine learning-based results are not only compelling, but have also shown us the expansive range of applied machine learning (ML) methods: synthesized natural language models, speech recognition, game playing, emotive art and music, business systems, insurance, and of course, self-driving cars (to name just a few).
As a Senior Staff Software Engineer at Google and a Gradient Ventures Resident, I’m passionate about using machine intelligence to create positive user experiences and impact for good. My role permits me to participate in and witness many of the advances happening in the AI space, with a special focus on practical, applied ML in production systems learned from my time at Google.
This blog series maps out the approach and general considerations for deploying practical ML solutions in real-world, production applications. But before we get there, let’s first explore the reasons behind ML’s recent success and impact across so many domains.
Four key trends: Data, Algorithms, Compute, and Libraries
We’re seeing four key trends aligning in the current ML landscape which together drive the increasing ubiquity and success of ML systems:
Data. Data is required for successful ML models. Over time it has become increasingly structured, standardized, and accessible through standard APIs and data repositories. The historical problem of manual data labels (annotations) has been addressed through robust tooling and crowd-sourced hand-labelling pipelines, with innovations in data management and pre-labeling (data bootstrapping). When labeled/annotated data remains scarce or expensive, new methods can yield sufficient performance with fewer training examples.
Algorithms. Our universe is surprisingly modelable (discussed more below). ML algorithms—in particular, deep neural networks—appear particularly well suited to modeling the compositional and symmetric nature of much of our universe and the systems we design. Critical advances in the training phase for neural networks allow compelling predictive performance when paired with large amounts of data and computational power.
Computational availability (“Compute”). Large-scale data used for deep learning creates unprecedented computational demands for model training, which can now be met through cloud-enabled parallel computation. Training and evaluation methods themselves are increasingly parallelizable. Computation itself has seen massive throughput improvements through the use of special-purpose processor units such as GPUs and increasingly, TPUs (Tensor Processing Units).
ML code libraries and services. ML methods have become widely accessible through higher-level learning frameworks such as TensorFlow, PyTorch and Keras, which encapsulate and abstract learning algorithms and processes. For example, TensorFlow now natively supports rich object detection. Increasingly, these frameworks also provide system components to more easily support automated training pipelines, data management, and serving of trained models. More than just algorithm repos, these are end-to-end solutions spanning data management, model training, and production model deployment.
We see that data, algorithms, compute, and library support have aligned to empower impactful ML capabilities. Meanwhile, we’re upleveling our skill and engineering proficiency in this space, catalyzed by the maturation of these tools, free online courses, and coursework in more formal degree programs.
All of the above would be moot if we lived in a random, difficult-to-model world. But of course, that is not the case: our universe is not random, nor are the higher-level behaviors we observe in it. In fact, much of our universe lends itself to being modeled in a straightforward manner. All this is to say that there is signal in the data. Where there is signal, models can be trained; and where there are trained models, predictions can be made.
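To make "signal" concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything from a production system: the synthetic data, the helper names `fit_line` and `r_squared`, and the choice of a straight-line model. It fits the same simple model twice, once to data generated by an underlying rule plus noise and once to the same values with the pairing shuffled away, and compares how much of the variance each fit explains.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical "modelable" world: y follows a simple rule plus some noise.
xs = [i / 10 for i in range(200)]
ys_signal = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

# Same values, but shuffled so the link between x and y is destroyed.
ys_noise = ys_signal[:]
rng.shuffle(ys_noise)

slope, intercept = fit_line(xs, ys_signal)
r2_signal = r_squared(xs, ys_signal, slope, intercept)

noise_slope, noise_intercept = fit_line(xs, ys_noise)
r2_noise = r_squared(xs, ys_noise, noise_slope, noise_intercept)

print(f"with signal:    R^2 = {r2_signal:.3f}")
print(f"without signal: R^2 = {r2_noise:.3f}")
```

With the rule intact, the line explains nearly all of the variance; after shuffling, it explains almost none. The model and the fitting procedure are identical in both cases, so the difference comes entirely from whether the data carries signal.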
Our Surprisingly Modelable Universe
As physicists have long known, our universe obeys non-random laws of physics. There are underlying models, or functions, that drive the behavior of most observable phenomena. What we generally aim to model through ML are higher-level derivatives of these functions. These functions and the natural phenomena they generate frequently exhibit properties such as compositionality, symmetry, and locality, which deep neural networks (DNNs) appear particularly well suited to model (and this we believe largely explains DNNs’ success on so many problems).
These fundamental functions ultimately generate higher level patterns in our day-to-day life that we’re more familiar with: how we humans drive, the ways we navigate around a mobile app, the dynamics that govern actuarial / insurance risk, the way objects appear in a 2D image as they rotate in 3D space, the documents that a user is likely to next open in a cloud storage system, etc.
Higher-Level Systems are Generally Modelable
Not only is our universe modelable, but by definition, when we create higher-level systems, we generally do so intentionally in a way that is predictable. For example, when designing a mobile app for users, there are implicit rules for the user interface that are generally followed (icons, UI controls, button locations, etc.). We call these Human Interface Guidelines, and they are specifically designed so that previously learned models, i.e., how we have learned to use software, can apply with minimal new training data. We describe such a conformant app as “easy to use” or as having an “intuitive UI,” which is really just to say, “this app conforms to my learned model”. And because of this underlying structure, we can train models to effectively navigate through an app towards some goal. Such models are useful in automated software testing.
In a fascinating parallel, self-driving cars are successful for the same reason: driving is modelable. Carefully crafted rules of the road (road markings, constant radius curves, slope, lane width, standardized signage, etc.) allow a driver to be successful on a new road or in a new traffic situation. In other words, drivers’ previously learned models generalize to new situations (and most of us think our personal learned models are naturally superior!). On those occasions where a traffic situation deviates from known patterns, we still take action (make a prediction) and hope our model generalizes reasonably well. In time, algorithms will outperform humans here, taking more optimal actions in adverse scenarios.
If we look beyond self-driving cars, game playing, and other headline-grabbing applications, the above reasoning applies to myriad applications like industrial automation, manufacturing, server system efficiency, speech recognition, software testing, medical imaging and anomaly detection, energy modeling, battery optimization, insurance processes—the list goes on.
The above aims to illustrate why ML applies so well to so many domains: the domains themselves are naturally modelable. Many aren’t even what we would call “hard” problems. Our universe, and the systems we design, are naturally modelable.
ML Brings Value
The final consideration in understanding why ML is becoming ubiquitous is that such systems are additive, i.e., they bring value. Developing them is worth the effort and, if done correctly, can make business sense. In fact, high-performing ML can often enable new capabilities or even entirely new businesses. At the same time, it can improve the user experience of an existing product or improve the operations and efficiency of an existing business.
So how do we capitalize on these trends? How do we actually go about manifesting an ML capability that benefits users and businesses?
Intelligence is hard—but not impossible
We know that doing simple tasks with ML methods is often easy, yet deploying practical, impactful ML-based methods that benefit users or a system is much harder. But it’s not magic; it can be done. To do it, traditional scientific and engineering methods must be employed. For example, the diagram below outlines the architecture of the Google Drive Quick Access file prediction system shown on the homepage of Google Drive:
System Architecture of the Drive Quick Access ML system, as presented publicly by the author at Google Cloud Next 2017.
As illustrated above, the actual learning portion of a successful ML feature is a small part of a much larger production system. The learning component is itself a system that must be developed, maintained and measured just like any other component in any other system. ML methods can be developed and deployed successfully, and the project risk managed, through careful application of classic scientific and software engineering methods.
A practical roadmap for integrating ML into production systems
Inspired by the concept that a successful ML system has many moving parts, this blog series will map out the approach and general considerations for deploying practical ML solutions in real-world, production applications. Future posts will describe a practical roadmap for doing this, proceeding generally as follows:
- AI Principles and AI Bias. Explore and adopt AI principles as they apply to the domain, and carefully evaluate potential sources of bias in your AI/ML methods.
- Understand the problem and its data, inventorying and visualizing the feature data (signals, inputs) that are available. Perform high-level experiments to gauge what signals may be useful, i.e., predictive in the domain. Start to develop an intuition for data cost, since some input features are more expensive than others and they will have to be available and maintained long-term.
- Metrics and Dashboards. Determine and implement metrics early on; get cross-functional team buy-in. Metrics should comprise business, scientific, and user metrics. Once established, create dashboards and automate reporting.
- Establish Baseline Algorithms that can be compared against. Then create an evaluation pipeline and set up and automate experiments. Begin to develop an intuition for what is “good enough” in the domain, as returns may be diminishing.
- Understand that ML is just one part in a larger production system. A successful ML integration with a real-world system is a larger task than just building a one-off model.
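The baseline and evaluation steps above might be sketched as follows. This is a hypothetical toy, not the Drive Quick Access system: the task (guess the next document a user opens from their recent history), the two baseline rules, the example data, and every function name are assumptions made purely for illustration.

```python
from collections import Counter


def most_recent(history):
    """Baseline 1: predict the document that was opened last."""
    return history[-1]


def most_frequent(history):
    """Baseline 2: predict the document opened most often in the history."""
    return Counter(history).most_common(1)[0][0]


def accuracy(predict, examples):
    """Top-1 accuracy of a predictor over (history, actual_next) pairs."""
    hits = sum(1 for history, actual in examples if predict(history) == actual)
    return hits / len(examples)


# Hypothetical evaluation set: each entry is (recent opens, next open).
examples = [
    (["report", "budget", "report"], "report"),
    (["notes", "notes", "slides"], "notes"),
    (["plan", "memo", "memo"], "memo"),
    (["draft", "draft", "final"], "final"),
    (["spec", "spec", "todo"], "spec"),
]

# One shared evaluation function keeps every candidate comparable.
baselines = {"most_recent": most_recent, "most_frequent": most_frequent}
results = {name: accuracy(fn, examples) for name, fn in baselines.items()}

for name, score in results.items():
    print(f"{name}: {score:.2f}")
```

Any learned model can then be passed to the same `accuracy` function, which makes it easy to tell whether added complexity actually beats a trivial rule and whether further gains are still worth pursuing.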
Next time, we’ll do a deep dive into AI principles, which are important to keep front and center as you embark on your ML project. In parallel we’ll explore sources of AI bias, their implications, and ways to address them in your AI system.
Thank you for reading! I welcome your feedback on my thoughts here, as well as ideas to incorporate into future posts. Feel free to reach out to me at firstname.lastname@example.org.
Until next time!
Senior Staff Software Engineer, Google
Engineering Resident, Gradient Ventures