Engineering/technical debt
The latest AI techniques based on deep neural networks have achieved remarkable performance in the last five to seven years. However, the tooling and infrastructure needed to support these techniques are still immature, and few people have the technical competence to deal with the whole range of data and software engineering issues. In medicine especially, AI solutions will often face problems related to limited data and variable data quality. Predictive models will need to be retrained when new data comes in, and teams will have to keep a close eye on changes in data-generation practices and other real-world issues that may cause the data distributions to drift over time. If several data sources are used to train models, additional types of “data dependencies” are introduced, and these are seldom documented or explicitly handled.
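As a rough illustration of what watching for drift can look like in practice, the sketch below compares one numeric feature in the training data against the same feature in newly arrived data, using the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distributions). The function names and the 0.3 threshold are assumptions made for this example, not values taken from any particular platform; a real threshold would have to be tuned per feature.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical distributions, 1.0 means no overlap.
    """
    ref = sorted(reference)
    cur = sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is always reached at one of the observed values.
    for x in set(ref) | set(cur):
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def has_drifted(reference, current, threshold=0.3):
    """Flag drift when the KS statistic exceeds a (hypothetical) threshold."""
    return ks_statistic(reference, current) > threshold
```

A check like this could run each time a new batch of data arrives, with a flagged feature prompting a manual review or a retraining run. It only looks at one feature at a time, so it would not catch changes in how features relate to each other.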
In medical applications, transfer learning (using a pre-trained model and adapting it to one’s specific use case) is often applied. This introduces a “model dependency”: the underlying model may need to be retrained or have its configuration changed over time. The large amount of “glue code” typically needed to hold an AI solution together, combined with potential model and data dependencies, makes it very difficult to run integration tests on the whole system and to make sure that the solution is working properly at any given time.
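One simple way to make a model dependency explicit is to record a fingerprint of the pre-trained model at the time the adapted model is trained, and to compare against it later. The sketch below is a minimal example under that assumption; `fingerprint` and `base_model_changed` are hypothetical helper names, and the weights are represented as raw bytes for simplicity.

```python
import hashlib
import json


def fingerprint(weights: bytes, config: dict) -> str:
    """Hash the base model's weights and configuration into one hex digest."""
    digest = hashlib.sha256()
    digest.update(weights)
    # sort_keys makes the digest independent of dictionary key order.
    digest.update(json.dumps(config, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


def base_model_changed(recorded: str, weights: bytes, config: dict) -> bool:
    """Return True if the base model no longer matches the recorded fingerprint."""
    return fingerprint(weights, config) != recorded
```

Storing the recorded fingerprint next to the adapted model lets an integration test fail loudly when the underlying model has been retrained or reconfigured, instead of the change going unnoticed.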
An operational AI platform such as the one we are building at Peltarion, handling the entire modeling process including software dependencies, data and experiment versioning as well as deployment, has the potential to solve many of these engineering and technical debt issues.