In the previous two articles, I outlined the two phases of adopting machine and deep learning. In the figure below, we show the steps that companies typically evolve through while adopting artificial intelligence (AI), machine learning (ML) and deep learning (DL) solutions. As shown in the figure, the third step, the subject of this article, is where machine learning and deep learning components are deployed for real – in products, solutions and services. Although this is where customers start to experience the real benefits of AI in your product, it is also where problems might come back to haunt you.
Whenever companies start to deploy a new technology, there is initially trepidation, and everyone wants to ensure that the technology works well before deployment. The challenge is that it is virtually impossible to validate a machine learning system pre-deployment and be confident that all edge cases and input combinations are accounted for and will not lead to unwanted outcomes.
The validation problem is exacerbated by what philosopher and cognitive scientist Daniel Dennett refers to as the post-intelligent design era. Traditionally, every system put in the hands of users was created by a product designer, or a team of designers, who architected it to do precisely what it was intended to do. In this case, humans provide the “intelligent design,” and when things go wrong, the original design team investigates the system to figure out why it didn’t work as planned.
In the case of machine and deep learning, we build the models, but we don’t really know why they generate the results they do. These systems are the outcome of evolutionary processes, active during model training, that cause the models to gravitate toward specific configurations and weights. Not understanding how a machine learning/deep learning model works is inherent to the approach: for tasks such as image recognition and speech recognition, humans have never managed to hand-design systems that generate satisfactory results. However, the very fact that we don’t really know why a system works may be at the root of situations where a model suddenly behaves very differently from what we expected.
The precautionary principle would then dictate that we should not use machine learning/deep learning models in critical deployments. However, I feel that this is an overly simplistic viewpoint, as these models have the potential to provide enormous benefit to mankind. A vast range of repetitive and tedious tasks currently conducted by humans can finally be automated with AI, which is powerful. Although many are concerned about high levels of unemployment among low-skilled workers, and I recognize that society will need a transition period, history has shown us that every wave of automation has created a broad range of new jobs consisting of far more exciting and intellectually stimulating tasks.
When deploying machine learning/deep learning models in parts of our systems where performance is critical to the success of the system, our research has shown that companies experience major engineering challenges. You can find more details in the papers referenced below and in an earlier article, but a high-level summary of the challenges that the companies in our case studies have experienced includes:
- Securing training data of sufficient quantity and quality to build confidence in trained models before deployment
- Scaling machine learning/deep learning deployments in systems with high volumes of users, transactions or events
- Reproducing results, ranging from training runs to consistency across multiple model versions
- Debugging deep learning models, as it is generally hard to pinpoint where problems originate
- Meeting non-functional requirements such as latency and throughput
- Maintaining machine and deep learning models once developed, as GPU architectures and deep learning libraries evolve rapidly
- Monitoring and logging model behavior, especially recognizing the difference between bugs and features
- Avoiding unintended feedback loops between the system and its users, which easily arise when users adjust their behavior to the system, the system learns the new behavior, users adapt again, and so on
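The last challenge, the feedback loop, can be made concrete with a toy simulation. The sketch below is entirely hypothetical (plain Python, no ML library): a two-item recommender shows whichever item it currently believes is more popular, users can only click what they are shown, and clicks feed back into the popularity estimate. Even though users have identical true preference for both items, exposure bias makes the estimates run away from each other.

```python
import random

def simulate_feedback_loop(rounds=200, seed=0):
    """Two-item recommender: it shows the item it believes is more
    popular; users can only click what is shown; each click feeds back
    into the popularity estimate. With equal true preference, exposure
    bias still drives the estimates apart - a runaway feedback loop."""
    rng = random.Random(seed)
    clicks = [1, 1]                    # popularity estimates (pseudo-counts)
    for _ in range(rounds):
        shown = 0 if clicks[0] >= clicks[1] else 1
        # Users like both items equally (50% click rate), but only the
        # shown item can ever be clicked.
        if rng.random() < 0.5:
            clicks[shown] += 1
    return clicks

counts = simulate_feedback_loop()
# One item accumulates virtually all clicks despite identical preference.
```

The system's own output shapes the data it learns from next, which is exactly why such loops are hard to spot from offline validation alone.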
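The reproducibility challenge can likewise be illustrated with a minimal, hypothetical training loop (plain Python, names and model invented for illustration): when a single seed controls data generation and weight initialization, two runs produce bit-identical models, while a different seed yields a different one.

```python
import random

def train_tiny_model(seed, steps=100):
    """Fit y ~ w*x with SGD on noisy synthetic data (true w = 3.0).
    The seed controls both the data stream and the weight init, so a
    given seed reproduces the exact same trained weight."""
    rng = random.Random(seed)
    w = rng.uniform(-1, 1)                 # random weight initialization
    for _ in range(steps):
        x = rng.uniform(-1, 1)
        y = 3.0 * x + rng.gauss(0, 0.1)    # noisy observation of 3.0 * x
        grad = 2 * (w * x - y) * x         # d/dw of the squared error
        w -= 0.1 * grad
    return w

# Same seed -> bit-identical model; different seed -> a different model.
assert train_tiny_model(seed=42) == train_tiny_model(seed=42)
assert train_tiny_model(seed=42) != train_tiny_model(seed=7)
```

In real frameworks this discipline is much harder to enforce, since parallelism and GPU kernels introduce non-determinism that a single seed does not capture.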
Machine and deep learning represent one of the most exciting and beneficial technologies for mankind in our current day and age. These systems are exemplars of a “post-intelligent design” era and require us to learn to compensate for their specifics, especially when deploying them in critical contexts. In addition, companies experience significant engineering challenges when deploying machine learning/deep learning, although there are innovative companies, such as Peltarion, that provide solutions to many of these challenges. Mankind, industry, your company and you personally already benefit greatly from AI/machine learning/deep learning, and we’ve barely scratched the surface of these technologies’ potential.
Let’s make sure we capitalize on this enormous opportunity!
This article was originally published on janbosch.com
- 01/ Lucy Ellen Lwakatare, Aiswarya Raj, Jan Bosch, Helena Holmström Olsson and Ivica Crnkovic, “A taxonomy of software engineering challenges for machine learning systems: An empirical investigation” — XP 2019 (forthcoming), 2019
- 02/ Anders Arpteg, Björn Brinne, Luka Crnkovic-Friis, and Jan Bosch, “Software Engineering Challenges of Deep Learning” — 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pp. 50–59. IEEE, 2018