Tips to Avoid Medical Device AI Pitfalls

Author: Thor Tronrud

Roughly 80% of human brain volume is made up of the neocortex, a set of highly interconnected layers of neurons that controls our higher brain functions, such as perception, cognition, motor control, and language. This massively parallel architecture was the initial inspiration for Deep Learning – a neural network paradigm that excels at pattern recognition and classification.

Most interestingly, neural networks are universal approximators – meaning that a network with at least one hidden layer and an appropriate activation function can approximate any continuous function in finite-dimensional space to any desired non-zero error, as long as it has enough neurons. In other words, a neural network could theoretically reproduce anything that can be represented as a mapping from an input to an output. As vague as that sounds, it encompasses an enormous range of tasks, from facial recognition to stock market prediction to high-speed calculation of atomic-level energetics, and many more!
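
To make this concrete, here is a minimal sketch (using scikit-learn on synthetic data, purely as an illustration – the library, layer size, and solver are assumptions, not recommendations) of a single-hidden-layer network approximating a smooth one-dimensional function simply by having enough neurons:

    # Minimal sketch of universal approximation: one hidden layer fitting sin(x).
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    x = rng.uniform(-np.pi, np.pi, size=(2000, 1))  # inputs sampled across the domain
    y = np.sin(x).ravel()                           # the function we want to approximate

    # One hidden layer with a tanh activation and enough neurons to fit the curve.
    net = MLPRegressor(hidden_layer_sizes=(64,), activation="tanh",
                       solver="lbfgs", max_iter=5000, random_state=0)
    net.fit(x, y)

    x_test = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
    max_err = np.max(np.abs(net.predict(x_test) - np.sin(x_test).ravel()))
    print(f"Maximum error over the training domain: {max_err:.4f}")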

Thanks to this incredible flexibility, and a boom in computing power over the past few decades, neural networks and deep learning are finding widespread use as data analysis tools in nearly every industry, including medical devices. Recently, the FDA published guidance on the use of Artificial Intelligence and Machine Learning, acknowledging that its traditional paradigm of medical device regulation is ill-suited for these technologies. The guidance describes a new foundation for pre-market review of devices that gain ML-driven software modifications.

Researchers and innovators can hardly be blamed for wanting to incorporate these technologies into their projects. The ability of such methods to digest and analyze the reams of data now being collected is, at the bare minimum, incredibly exciting. It also can’t be overstated how much incorporating neural network technologies into both old and new medical devices can captivate investors, who are necessary to fund innovative development.

Unanticipated Risks

These exciting approaches come with potential risks. One of the biggest advantages of neural networks is their ability to take high-dimensional data as input and produce output in exactly as many dimensions as desired. Problems arise, however, when these networks are also expected to share the researchers’ priors about which parameters matter. A bevy of ML tools have been thrown at COVID-19 data, with little success. The failings typically lie with inappropriate data collection, reduction, and normalization.
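
To give one concrete (and hypothetical) example of what inappropriate normalization can look like in practice: fitting the normalization statistics on the full dataset before splitting off a test set quietly leaks test-set information into training. A minimal sketch, assuming a scikit-learn workflow:

    # Minimal sketch of a common normalization pitfall: the scaler must be fitted
    # on the training split only, never on data the model will later be tested on.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 10))            # stand-in for real measurements
    y = (X[:, 0] > 0.0).astype(int)

    # Leaky: the scaler has already seen the future test rows.
    X_leaky = StandardScaler().fit_transform(X)
    X_tr_leaky, X_te_leaky, y_tr, y_te = train_test_split(X_leaky, y, random_state=0)

    # Correct: split first, fit the scaler on the training data, then apply it.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    scaler = StandardScaler().fit(X_tr)
    X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)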

While it is mathematically possible for a network to be trained to ignore irrelevant input information, in practice this is incredibly difficult, and well-meaning researchers looking to build a tool for what seems like a relatively simple task can accidentally end up training a network to do something completely different. The first step of data analysis begins long before the first line of code is written, when researchers decide on the architecture of the network and which inputs to provide to the model. Ironically, once the data have been properly reduced, a more classical algorithm can usually solve the same problem with much lower complexity and in a significantly more transparent manner.
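
As an illustration (on synthetic data, with hypothetical feature choices standing in for real domain knowledge), once the inputs are reduced to the handful that actually carry signal, a transparent classical model can often match the network it was meant to replace:

    # Minimal sketch: a network fed 50 mostly-irrelevant inputs versus a plain
    # logistic regression on the 3 inputs that actually matter.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 50))                            # 50 candidate inputs
    y = (X[:, 0] + 0.5 * X[:, 1] - X[:, 2] > 0).astype(int)    # only 3 carry signal

    nn = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
    print("NN, all 50 inputs:       ", cross_val_score(nn, X, y, cv=5).mean())

    # "Data reduction": in practice these columns are chosen by domain expertise.
    lr = LogisticRegression()
    print("LogReg, 3 reduced inputs:", cross_val_score(lr, X[:, :3], y, cv=5).mean())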

Training

The next big hurdle for neural networks lies with training. There has been a cascade of corporate embarrassments brought about by accidentally neglecting appropriate racial representation in training data sets. Training a network on the entire range of possible inputs is incredibly important, and often forgotten. While these methods are amazing interpolators, if the segmentation the network learns under the hood doesn’t describe the true behaviour of the data outside the bounds of the training set, any extrapolation will be meaningless, at best.
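
A minimal sketch of the extrapolation problem (synthetic data, illustrative settings): a network trained only on part of the input range looks excellent inside that range and falls apart outside it:

    # Minimal sketch: train only on x in [0, 5], then ask for a prediction at x = 9.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    x_train = rng.uniform(0.0, 5.0, size=(2000, 1))   # training data covers 0..5 only
    y_train = np.sin(x_train).ravel()

    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
    net.fit(x_train, y_train)

    for x in (2.5, 9.0):                              # inside vs. outside the range
        pred = net.predict(np.array([[x]]))[0]
        print(f"x = {x}: predicted {pred:+.2f}, true {np.sin(x):+.2f}")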

Checklist

If you are considering a neural network to solve a problem, here are several simple questions that, answered honestly, will almost certainly improve your results:

  • If you have a team, does it include experts who can critique the results of the network? Does it include people who can critique the implementation of the network?
  • Can the problem be simplified by using fewer inputs?
    • You’re wrong, it can.
  • Once the problem has been reduced appropriately, can a classical algorithm perform the analysis you wanted the NN for?
    • You’re probably wrong, and there’s almost certainly something that can, but it may be non-obvious enough to make NNs an easier choice in the short term.
  • Is your training set complete? Does it encompass the entirety of possible data that the network might encounter?
    • You guessed it – you’re probably wrong. You’ll want to quadruple-check to avoid many incredibly embarrassing allegations in pop-sci articles.
      • Does your training set include people of a different race/skin tone/eye colour than you?
      • Even if you don’t think that’s relevant, you might want to include them anyway.
  • Finally, look beyond the default “accuracy” output. What is your recall? What is your precision? (A quick sketch follows this list.)
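
As a quick sketch of why that last point matters (hypothetical numbers, scikit-learn metrics): on imbalanced data, a model that simply predicts “healthy” for everyone can post an impressive accuracy while its recall for the condition of interest is zero.

    # Minimal sketch: accuracy looks great, precision and recall reveal the problem.
    import numpy as np
    from sklearn.metrics import accuracy_score, precision_score, recall_score

    y_true = np.array([0] * 95 + [1] * 5)   # 5% prevalence of the positive class
    y_pred = np.zeros(100, dtype=int)       # a "model" that never flags anyone

    print("accuracy: ", accuracy_score(y_true, y_pred))                    # 0.95
    print("precision:", precision_score(y_true, y_pred, zero_division=0))  # 0.0
    print("recall:   ", recall_score(y_true, y_pred))                      # 0.0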

Image: © Can Stock Photo / peshkova

Thorold Tronrud is a Software Engineer at StarFish Medical. He is completing his astrophysics PhD thesis on the use of neural networks and machine learning methods to identify accreted stars in the Milky Way disc.