Toyota, SOUP, and Medical Device Development
Toyota’s recent $1.5 million jury-awarded loss in an Oklahoma court illustrates that it’s nearly impossible to write software without some third-party code creeping in. IEC 62304:2006 calls this “Software of Unknown Provenance,” or SOUP: software with unknown safety-related characteristics, or developed under an unknown methodology. I know the situation very well. Operating systems, code libraries supporting the CPU, and even artifacts created by the compiler have all led to chunks of code in my medical device applications that I didn’t write and don’t know for certain are safe.
How do I avoid this unknown code? Here’s what I used to do: write my firmware from scratch in assembly language. Even then I relied on the assembler to map my code correctly to machine instructions and on the processor to be bug-free. I was so close to the metal that it’s unlikely there were any hidden surprises, at least by the time I had tested and debugged.
Although I once prided myself on writing clear and well-structured assembly, I don’t expect I’ll ever do that again. There’s a huge advantage to writing in a higher-level language, and the market expects a level of sophistication in user interfaces and connectivity that’s not feasible to code from scratch.
So, how do I prove to myself and to the regulatory bodies that my code is safe when I only write a fraction of it?
Some say using a commercial set of tools and libraries helps; others feel widely used and well-tested open-source solutions are the best answer. Either way, my code base is largely developed by someone else, and I feel like I’m relying a bit too heavily on faith. Neither approach alone is the answer.
What can I do to ensure my code is safe? First, let’s define safe. For this discussion, safe code is code that has a high probability of correctly performing its risk-of-harm mitigations. This assumes I’ve done a good job of identifying potential hazards. Using that definition, I’ll rephrase the question.
How can I ensure that the probability of failure of my hazard mitigations is exceedingly low? Here are some strategies to gain confidence that my mitigations will not fail:
If I can, I lock down the particular versions of the development environment and all the libraries I have chosen. Don’t underestimate the chances of bugs creeping in with a new version of a compiler or library. If I’ve spent the bulk of my development time with a particular tool chain, I’ve also been building confidence and experience with it. Benefiting from the fruit of all that testing is one way to reduce development time and cost.
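One way to make that lock-down stick is to have the build itself refuse an unexpected toolchain. Here is a minimal sketch assuming a GCC-based cross compiler; the version numbers are placeholders for whatever toolchain was actually validated, and the predefined macros shown are specific to GCC.

```c
/* Hypothetical sketch: pin the validated toolchain at compile time so an
 * accidental compiler upgrade fails the build instead of silently changing
 * the generated code. The "9.3" below is an illustrative placeholder. */
#if !defined(__GNUC__)
#error "Firmware must be built with the validated GCC cross compiler."
#elif (__GNUC__ != 9) || (__GNUC_MINOR__ != 3)
#error "Unexpected GCC version: rebuild with the validated 9.3 toolchain."
#endif
```

The same idea extends to library versions and build scripts: anything the validated binary depends on gets a check that fails loudly when it drifts.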
I protect the scope and priority of critical variables and methods. I encapsulate that code in its own process or thread if appropriate. I keep intermediate and state variables private so they can’t be interfered with by other code. If variables contain particularly important data, then redundant storage and error checking may be good strategies. If necessary, I block interrupts and threads during critical code sections. I use watchdog timers to ensure code is serviced sufficiently frequently. One option is to run code on its own processor. It’s important to clearly define critical code as separate software items in the architecture: this allows a reduced level of testing for non-critical code and helps create the isolation required.
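As a concrete illustration of encapsulation plus redundant storage, here is a minimal C sketch. It is an assumption-laden example, not a prescription: the variable name, the complement-based redundancy, and the `disable_interrupts()`, `enable_interrupts()`, and `enter_safe_state()` calls stand in for whatever the target platform and the risk analysis actually provide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Platform-specific hooks; assumed to exist for this sketch. */
extern void disable_interrupts(void);
extern void enable_interrupts(void);
extern void enter_safe_state(void);

/* The critical value is private to this module and stored twice:
 * once directly and once as its bitwise complement. */
static volatile uint16_t s_dose_limit;       /* primary copy         */
static volatile uint16_t s_dose_limit_inv;   /* redundant complement */

void dose_limit_set(uint16_t value)
{
    disable_interrupts();                /* keep the two copies coherent */
    s_dose_limit     = value;
    s_dose_limit_inv = (uint16_t)~value;
    enable_interrupts();
}

bool dose_limit_get(uint16_t *out)
{
    uint16_t value, check;

    disable_interrupts();
    value = s_dose_limit;
    check = s_dose_limit_inv;
    enable_interrupts();

    if ((uint16_t)~value != check) {     /* corruption detected        */
        enter_safe_state();              /* fail safe, per mitigation  */
        return false;
    }
    *out = value;
    return true;
}
```

Because every access goes through these two functions, the rest of the code base has no way to reach the raw storage, and the cross-check runs on every read.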
I trap for all known edge cases. This includes the obvious steps of bounds-checking arrays and protecting variables against overflow and underflow. Consider adding data quality metrics and defining what actions to take when metrics don’t meet their passing thresholds. Be aware that the quality checks then become part of your mitigation and must be verified themselves. Also keep in mind that it may not be the failure of a particular computation that leads to an incorrect result; it could be another thread, interrupt, process, or the OS itself (if you’re using one) interfering with memory or CPU time. This becomes even more likely as you push your processor to the limits of memory or speed.
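The sketch below shows what this can look like for a simple averaging routine. The function name, the sensor limits, and the 90% quality threshold are all illustrative assumptions; the point is the pattern of rejecting malformed input, widening the accumulator so it cannot overflow, and gating the output on a data quality metric.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SAMPLES  64u     /* illustrative buffer limit       */
#define SENSOR_MIN   100     /* illustrative plausibility range */
#define SENSOR_MAX   4000

bool average_pressure(const int16_t *samples, size_t count, int16_t *avg_out)
{
    if (samples == NULL || count == 0u || count > MAX_SAMPLES) {
        return false;                       /* reject malformed input */
    }

    int32_t sum = 0;                        /* wide enough: 64 * 32767 */
    size_t  in_range = 0u;

    for (size_t i = 0; i < count; i++) {
        if (samples[i] >= SENSOR_MIN && samples[i] <= SENSOR_MAX) {
            in_range++;
        }
        sum += samples[i];
    }

    /* Quality gate: require at least 90% plausible samples; otherwise the
     * caller falls back to its defined mitigation (e.g. last good value). */
    if (in_range * 10u < count * 9u) {
        return false;
    }

    *avg_out = (int16_t)(sum / (int32_t)count);
    return true;
}
```

Note that the `false` return paths are themselves part of the mitigation, so the behaviour they trigger has to be specified and verified just like the happy path.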
Ideally, I do tons of testing. The whole point is to adequately measure the probability that a mitigation will fail and show it to be sufficiently low. A goal is to test the complete set of possible inputs. Frequently this is not possible due to the vast extent of the input space, such as in ultrasound systems or any technology involving the solution of ill-posed problems (in a mathematical sense). Here I am forced to test as wide a swath of the input space as I can, then hope I haven’t left any significant dirty corners untested. I often do this with synthetic data; however, that data is only as good as my understanding of how the real data varies, so I also use plenty of real datasets to try to catch any gaps. Performing this testing at a unit-test level, although convenient, often does not exercise the code in its natural habitat. Still, with careful thought and planning this may suffice. Another option is to build the test vectors into the final program.
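As a sketch of that last option, a power-on self-test can run the shipped mitigation code against known input/output pairs before the device enters normal operation. This example reuses the hypothetical `average_pressure()` routine from the previous sketch; the vectors themselves are placeholders for values derived from verified datasets.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The mitigation under test, from the earlier sketch. */
extern bool average_pressure(const int16_t *samples, size_t count,
                             int16_t *avg_out);

typedef struct {
    int16_t input[4];
    size_t  count;
    bool    expect_ok;
    int16_t expect_avg;
} test_vector_t;

static const test_vector_t k_vectors[] = {
    { { 1000, 1000, 1000, 1000 }, 4u, true,  1000 },  /* nominal          */
    { { 5000, 5000, 5000, 5000 }, 4u, false,    0 },  /* all out of range */
};

bool power_on_self_test(void)
{
    for (size_t i = 0; i < sizeof k_vectors / sizeof k_vectors[0]; i++) {
        int16_t avg = 0;
        bool ok = average_pressure(k_vectors[i].input,
                                   k_vectors[i].count, &avg);

        if (ok != k_vectors[i].expect_ok) {
            return false;                  /* mitigation misbehaves      */
        }
        if (ok && avg != k_vectors[i].expect_avg) {
            return false;
        }
    }
    return true;                           /* safe to enter normal mode  */
}
```

Built-in vectors like these exercise the exact binary that ships, on the exact hardware it runs on, which closes some of the gap left by unit testing on a development machine.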
Ultimately, and this adds a bit of a paradox to the whole process, sometimes a better, safer solution is fewer mitigations. Every additional line of code I add while attempting to mitigate all the known and foreseeable risks is another line of code to debug and de-risk. This not only adds more opportunity for errors but also distracts me from ensuring the rest of the code is risk-free. If the probability of a risk leading to harm is very low, then it may be better to pay attention to those that have a higher probability instead.
These are a few of the strategies that work and that I use regularly depending on the circumstances. I would be interested to hear about others used by readers.
Kenneth MacCallum, PEng, is a Principal Engineering Physicist at Starfish Medical. He works on Medical Device Development and prefers his SOUP as a meal instead of a medical device component.
Image: Carlogos.org