Before any new diagnostic, therapeutic, or technology can be used in the practice of healthcare, it has to be deemed safe and effective by the relevant regulatory bodies. The US, the EU, Japan, and others have robust requirements for such approvals, and this has historically been the case for drugs and medical devices. But an open question now is what level of evidence should be required for the approval of medical algorithms. After all, unlike drugs and medical devices, these algorithms change over time as they are exposed to more data and to feedback on their output. Given this ongoing change, how often should they be resubmitted for review and approval? Should regulators require large-scale, real-world trials to approve these algorithms, similar to those required for biopharmaceuticals or medical devices?
So far, regulatory bodies such as the FDA have required only smaller, retrospective studies showing that an algorithm works as promised in the narrow area it was trained on and that its results approach human-level accuracy. This allows developers to start marketing their algorithms to clinicians. It does not mean that insurers will pay for them (most will not, since there is not yet evidence that they improve patient outcomes in real-world settings or lower the cost of care) or that clinicians will consider the evidence sufficient to buy and use them.
This is both a driver and a barrier. By allowing clearance without requiring large-scale trials to show real-world efficacy, regulators are allowing models to be launched, observed, and improved in real-world settings. But it also acts as a barrier: the medical community does not necessarily see FDA clearance as the signal to start using a model, since the evidence required falls far below what is normally expected before a new innovation enters practice.
In my discussions with experts, there is a wide range of opinions on this approach. Some have pointed out that the FDA is looking at safety and effectiveness and will not require evidence of patient outcomes, which allows the medical community to start using these technologies and generating evidence for their use. This approach makes sense for a technology that changes over time with use: creating high barriers to initial approval is of limited value when the product will not have the same efficacy or safety profile later on. Others argued that the FDA's approach will have negative long-term consequences for these technologies: it does not establish high standards for AI-based products, and that can lead to issues that slow their adoption.
For example, given the low evidentiary bar for approval, each health system and its experts will have to decide for themselves whether a specific product has enough evidence behind it to justify using it. This can lead to more fragmented uptake and slower long-term adoption. Post-market surveillance of these technologies will be essential, since approved models may not have been trained on enough data, or on sufficiently diverse data, to ensure optimal real-world performance. Payers, too, will review products on a case-by-case basis, since FDA clearance does not necessarily confer universal confidence on all stakeholders.