What Digital Health Startups Need to Know About the EU AI Act
Max Mamoyco, Ilya Pavlov
On March 13, 2024, the European Union (EU) Artificial Intelligence (AI) Act was approved by an overwhelming majority of European Parliamentarians.
The act is a global first: the first comprehensive law to regulate AI systems. It sets binding rules for AI development and use, backed by substantial fines and an extensive list of obligations for companies working with AI. Notably, it applies not only to European firms but also to companies outside the EU whose AI systems are used in Europe.
But what does this mean for healthcare companies developing AI products?
How businesses should navigate AI compliance
Now that the law has been adopted, AI mishaps that used to be merely embarrassing headlines become a legal liability. Companies developing and deploying high-risk AI systems will bear a particularly heavy regulatory burden. However, regardless of the risk category, it's wise for all AI-involved companies to implement AI compliance measures.
AI compliance is comprehensive, internally implemented oversight of AI systems that ensures they adhere to legal requirements and ethical norms throughout their development, deployment, and use.
AI compliance should encompass:
- Regulatory assessment to guarantee compliance with laws during the development, deployment, and use of AI systems. This includes not only compliance with AI laws but also, for example, adherence to General Data Protection Regulation (GDPR) rules and copyright norms.
- Ethical standards, including fairness, transparency, accountability, and respect for users' privacy. Ethical compliance involves identifying biases in AI systems, preventing privacy breaches, and minimizing other ethical risks.
The EU AI Act introduces a risk categorization
The act sorts AI systems into four risk categories: unacceptable, high, limited, and minimal. The category a system falls into determines which requirements must be met.
Unacceptable risk: This category includes, for example, biometric identification and patient categorization, social rating systems, or even voice-enabled applications that encourage risky behavior. It's worth double-checking your model to make sure it doesn't, for instance, recommend exercises that are inappropriate for a patient because it overlooks secondary health factors such as hernias.
High risk: This category could include medical devices, applications that manage chronic conditions or deliver treatment, and robotic surgery.
Requirements for providers of high-risk AI systems
A: High-risk AI providers must establish a risk management system throughout the high-risk AI system’s lifecycle. For instance, in the development phase of a medical diagnostic AI system, potential risks could include misdiagnosis leading to patient harm. To address this, developers could implement rigorous testing procedures, conduct extensive clinical trials, and collaborate with medical professionals to ensure the accuracy and reliability of the AI system.
Throughout deployment and operation, continuous monitoring and feedback loops could be established to promptly detect and address any emerging risks or performance issues.
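As a rough illustration (not a requirement from the act), the Python sketch below shows one piece such a feedback loop might contain: a periodic check that compares the system's recent real-world accuracy against the accuracy measured during clinical validation and flags drift for escalation. The baseline value, tolerance, and metric are assumptions made for the example.

```python
# Minimal sketch of a post-deployment monitoring check for a diagnostic model.
# The 0.92 baseline, the drift tolerance, and the escalation message are
# illustrative assumptions, not values prescribed by the EU AI Act.
from dataclasses import dataclass
from statistics import mean


@dataclass
class MonitoringReport:
    window_accuracy: float
    baseline_accuracy: float
    drift_detected: bool


def check_performance_drift(
    recent_outcomes: list[bool],      # True = diagnosis later confirmed correct
    baseline_accuracy: float = 0.92,  # accuracy measured during clinical validation
    max_drop: float = 0.05,           # tolerated drop before escalation
) -> MonitoringReport:
    """Compare recent real-world accuracy against the validated baseline."""
    window_accuracy = mean(recent_outcomes) if recent_outcomes else 0.0
    drift = (baseline_accuracy - window_accuracy) > max_drop
    return MonitoringReport(window_accuracy, baseline_accuracy, drift)


if __name__ == "__main__":
    report = check_performance_drift([True] * 80 + [False] * 20)
    if report.drift_detected:
        print(f"Escalate to the risk owner: accuracy fell to {report.window_accuracy:.2f}")
    else:
        print(f"Within tolerance: accuracy {report.window_accuracy:.2f}")
```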
B: Providers must ensure data governance by verifying that training, validation, and testing datasets are relevant, sufficiently representative, and — to the best extent possible — free of errors and complete for their intended purpose.
For example, in the development of a digital health app designed to diagnose skin conditions using AI, ensuring data governance involves verifying that the training, validation, and testing datasets include diverse and comprehensive images of various skin conditions across different demographics.
The datasets should be sourced from reputable sources, such as medical databases or clinics, to ensure accuracy and relevance. Additionally, the datasets should be carefully curated to eliminate errors and inconsistencies, such as mislabeled images or poor image quality, which could affect the AI's ability to effectively diagnose skin conditions. This rigorous data governance process helps to improve the app's accuracy and reliability in providing diagnostic recommendations to users.
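To make the idea more concrete, here is a minimal Python sketch of an automated dataset audit a team might run before training: it scans a dataset manifest for missing labels and underrepresented demographic groups. The field names and the 5% threshold are illustrative assumptions, not figures from the act.

```python
# Rough sketch of a dataset audit for a skin-condition classifier.
# Field names ("skin_tone", "label") and the minimum-share threshold are
# assumptions made for this example.
from collections import Counter

MIN_GROUP_SHARE = 0.05  # each skin-tone group should make up at least 5% of the data


def audit_manifest(records: list[dict]) -> list[str]:
    """Return a list of human-readable data governance findings."""
    findings = []

    unlabeled = [r for r in records if not r.get("label")]
    if unlabeled:
        findings.append(f"{len(unlabeled)} images are missing a diagnostic label")

    groups = Counter(r.get("skin_tone", "unknown") for r in records)
    total = len(records)
    for group, count in groups.items():
        if count / total < MIN_GROUP_SHARE:
            findings.append(f"group '{group}' is underrepresented ({count}/{total} images)")

    return findings


if __name__ == "__main__":
    sample = [
        {"image": "img01.png", "label": "eczema", "skin_tone": "III"},
        {"image": "img02.png", "label": "", "skin_tone": "VI"},
        {"image": "img03.png", "label": "psoriasis", "skin_tone": "III"},
    ]
    for finding in audit_manifest(sample):
        print("FINDING:", finding)
```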
C: They must develop technical documentation to demonstrate compliance and provide authorities with the necessary information for assessing that compliance.
The technical documentation could include:
- System architecture: Detailed diagrams and descriptions of the app's infrastructure, including how user data is collected, processed, and stored securely.
- Data handling policies: Documentation outlining how user data is handled throughout the app's lifecycle, including data collection methods, encryption protocols, and data retention policies.
- AI algorithms: Descriptions of the AI algorithms used in the app to analyze user data and generate exercise recommendations, including information on how the algorithms were trained and validated.
- Privacy and security measures: Documentation detailing the app's privacy and security features, such as user consent mechanisms, access controls, and measures to prevent unauthorized access or data breaches.
- Compliance with regulations: Evidence of compliance with relevant regulations and standards, such as Health Insurance Portability and Accountability Act (HIPAA) in the United States or General Data Protection Regulation (GDPR) in the EU, including any certifications or audits conducted to verify compliance.
D: High-risk AI system providers must incorporate record-keeping capabilities into the high-risk AI system to automatically log relevant events for identifying national-level risks and substantial modifications throughout the system’s lifecycle.
To incorporate record-keeping capabilities, the system could automatically log events such as the following (a brief code sketch follows the list):
- Patient interactions: Recording each instance where the AI system provides a diagnosis or recommendation based on patient input or medical data.
- System updates: Logging any updates or modifications made to the AI algorithms or software to improve performance or address issues.
- Diagnostic outcomes: Documenting the outcomes of each diagnosis provided by the AI system, including whether the diagnosis was accurate or if further testing or intervention was required.
- Adverse events: Noting any instances where the AI system's diagnosis or recommendation led to adverse outcomes or patient harm.
- System performance: Keeping track of the AI system's performance metrics, such as accuracy rates, false positive/negative rates, and response times, over time.
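A minimal sketch of how such structured event logging could be implemented with Python's standard logging module is shown below. The event names and fields mirror the list above but are otherwise illustrative assumptions; the act does not prescribe a specific log format.

```python
# Minimal sketch of structured event logging for a diagnostic AI system.
# Event names and fields are illustrative assumptions.
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(filename="audit_trail.log", level=logging.INFO)
logger = logging.getLogger("diagnostic_ai.audit")


def log_event(event_type: str, **details) -> None:
    """Append a timestamped, machine-readable record to the audit trail."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        **details,
    }
    logger.info(json.dumps(record))


# Example events mirroring the list above
log_event("diagnosis_issued", case_id="C-1042", finding="suspected melanoma", confidence=0.87)
log_event("model_updated", old_version="2.3.1", new_version="2.4.0", reason="recalibration")
log_event("adverse_event", case_id="C-0991", description="false negative confirmed by biopsy")
```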
E: Providers must give downstream deployers clear instructions for use to facilitate their compliance.
For instance, if the AI system is designed to assist radiologists in interpreting medical imaging scans, the instructions for use might include:
- Step-by-step procedures for accessing and logging into the AI system securely, including authentication requirements and user access levels based on roles and responsibilities.
- Guidelines for inputting patient data and uploading medical imaging scans into the AI system's interface, ensuring compliance with data privacy regulations such as HIPAA or GDPR.
- Instructions on how to interpret and validate the AI system's output, including criteria for assessing the system's confidence level in its predictions and recommendations.
- Protocols for documenting the use of the AI system in patient medical records, including any relevant findings, recommendations, or alerts generated by the system.
- Recommendations for integrating the AI system into existing clinical workflows and decision-making processes, minimizing disruptions, and ensuring seamless collaboration between healthcare professionals and the AI technology.
- Training resources and materials to educate healthcare providers on the capabilities, limitations, and potential risks associated with using the AI system, emphasizing the importance of ongoing education and skill development.
F: Providers must implement robust mechanisms for human oversight and intervention to ensure that AI systems do not replace human judgment in critical medical decisions.
For instance, consider a high-risk AI system designed to assist healthcare providers in diagnosing skin cancer from dermatology images. To ensure human oversight and intervention, the following mechanisms could be implemented (a short code sketch of the first mechanism follows the list):
- Decision support alerts: The AI system could be programmed to flag cases where its diagnostic confidence falls below a certain threshold or where the diagnosis is inconclusive. In such cases, the system would prompt the healthcare provider to review the AI's findings and exercise their clinical judgment.
- Second opinion review: The AI system could offer the option for healthcare providers to request a second opinion from a human specialist or a panel of experts in cases of uncertainty or disagreement between the AI's diagnosis and the provider's initial assessment.
- Audit trails and logging: The AI system could maintain detailed audit trails of its decision-making process, including the rationale behind each diagnostic recommendation, the input data used for analysis, and any adjustments made by human reviewers. This information would be logged for review and verification by healthcare professionals.
- Emergency override functionality: In urgent or life-threatening situations where immediate action is required, the AI system could include an emergency override function that allows healthcare providers to bypass the AI's recommendations and make decisions based on their clinical judgment.
- Continuous monitoring and feedback: The AI system could incorporate mechanisms for ongoing monitoring and feedback, where healthcare providers can report discrepancies, errors, or adverse outcomes encountered during the use of the system. This feedback loop would facilitate continuous improvement and refinement of the AI's algorithms.
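As a small illustration, the first mechanism (decision support alerts) could be sketched roughly as follows. The 0.80 confidence threshold and the field names are assumptions for the example; in practice the threshold would come from clinical validation.

```python
# Sketch of a decision-support gate that routes low-confidence AI findings
# to a human reviewer. The 0.80 threshold is an illustrative assumption.
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.80


@dataclass
class Finding:
    case_id: str
    suggested_diagnosis: str
    confidence: float


def triage(finding: Finding) -> str:
    """Decide whether a finding can be shown directly or needs human review first."""
    if finding.confidence < REVIEW_THRESHOLD:
        return (f"{finding.case_id}: confidence {finding.confidence:.2f} is below threshold, "
                "route to clinician review before showing the result")
    return (f"{finding.case_id}: show result with confidence {finding.confidence:.2f}, "
            "clinician still confirms before charting")


print(triage(Finding("C-2201", "benign nevus", 0.95)))
print(triage(Finding("C-2202", "melanoma", 0.61)))
```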
G: Providers must design the high-risk AI system to achieve appropriate levels of accuracy, robustness, and cybersecurity.
In the context of developing a high-risk AI system for diagnosing cardiovascular diseases in digital health applications, achieving appropriate levels of accuracy, robustness, and cybersecurity is paramount to ensure patient safety and data integrity. Here's how this could be implemented (a brief encryption sketch follows the list):
- Accuracy: The AI system should undergo rigorous validation and testing processes using diverse and representative datasets of cardiac images, patient records, and clinical outcomes. Continuous refinement and optimization of the AI algorithms should be conducted to improve diagnostic accuracy over time. For example, the system could achieve a high level of accuracy by leveraging deep learning techniques trained on a large dataset of echocardiograms, electrocardiograms, and other cardiac imaging modalities.
- Robustness: The AI system should be designed to perform reliably under various real-world conditions and scenarios, including differences in patient demographics, imaging quality, and disease manifestations. Robustness can be achieved by incorporating techniques such as data augmentation, model ensembling, and adversarial training to enhance the system's resilience to noise, artifacts, and uncertainties in input data. Additionally, the system should include fail-safe mechanisms and error handling procedures to mitigate the impact of unexpected failures or malfunctions.
- Cybersecurity: Protecting patient data and ensuring the confidentiality, integrity, and availability of healthcare information is critical for the safe and secure operation of an AI system. Robust cybersecurity measures should be implemented to safeguard against unauthorized access, data breaches, and cyber threats. This may include encryption of sensitive data both at rest and in transit, implementation of access controls and authentication mechanisms, and regular security audits and penetration testing.
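As one narrow illustration of the cybersecurity point, the sketch below encrypts a patient record at rest with Fernet symmetric encryption from the third-party cryptography package. Key handling is deliberately simplified for the example; a real deployment would load keys from a managed secret store and layer this with access controls and transport encryption.

```python
# Sketch of encrypting patient data at rest with Fernet symmetric encryption
# from the "cryptography" package (pip install cryptography).
# Key handling is simplified for illustration; a production system would fetch
# the key from a key management service, never generate it inline.
import json
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in production: load from a managed secret store
cipher = Fernet(key)

record = {"patient_id": "P-314", "finding": "left ventricular hypertrophy"}
token = cipher.encrypt(json.dumps(record).encode("utf-8"))

# Only holders of the key can recover the record
restored = json.loads(cipher.decrypt(token).decode("utf-8"))
assert restored == record
print("encrypted bytes:", token[:16], "...")
```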
H: Providers must establish a quality management system to ensure adherence to regulatory requirements.
- Document control: Manage all development documents for accuracy and accessibility.
- Change management: Rigorously assess and approve system changes to maintain safety and efficacy.
- Risk management: Identify, assess, and mitigate risks throughout the system's lifecycle.
- Training: Provide staff training on software development, quality principles, and regulations.
- Audits: Conduct regular internal and external audits to ensure compliance and continuous improvement.
An example of how the final action plan might look.
Limited risk requirements
The use of AI is considered limited risk when the main obligation is transparency. For example, chatbot users must now be told that they are interacting with a machine, not a real nurse.
Content created by AI must be appropriately labeled as artificially generated. The same requirement applies to audio and video content created using deepfakes. Therefore, simply indicate to the user in any available way that they are currently interacting with AI.
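In code, the chatbot disclosure can be as simple as the hypothetical sketch below; the wording of the notice is an example, not text prescribed by the act.

```python
# Minimal sketch of an AI-disclosure wrapper for a health chatbot.
# The notice wording is an illustrative example.
AI_DISCLOSURE = "You are chatting with an automated assistant, not a nurse."


def reply_with_disclosure(bot_answer: str, first_message: bool) -> str:
    """Prepend the AI disclosure to the first reply of every conversation."""
    if first_message:
        return f"{AI_DISCLOSURE}\n\n{bot_answer}"
    return bot_answer


print(reply_with_disclosure("Your appointment is confirmed for Tuesday.", first_message=True))
```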
Minimal or absent risk requirements
The law allows for the free use of AI with minimal risk. This includes, for example, video games with AI support (ie, gamification not related to treatment in the app) or spam filters. Minimal risk is unregulated.
Additional responsibilities are specifically outlined for providers of General Purpose AI (GPAI). A GPAI model is an AI model, typically trained on large amounts of data (often using self-supervision), that is sufficiently versatile to competently perform a wide range of tasks. The most well-known example is ChatGPT.
How does GPAI differ from regular AI?
Example of an app using regular AI: An X-ray diagnostic application assists doctors in analyzing X-ray images, helping to detect certain pathologies or diseases such as lung cancer or osteoarthritis. It uses conventional machine learning models, such as convolutional neural networks, to train for anomaly detection in X-rays.
Example of an app using GPAI: A health and wellness management application provides personalized recommendations for a healthy lifestyle and helps manage chronic conditions. It utilizes a GPAI model to analyze user data, such as medical history, physical activity, sleep, and nutrition indicators, as well as user feedback, to offer individualized recommendations and support health and well-being.
Both of these examples demonstrate the application of artificial intelligence in healthcare, but the GPAI model application offers greater flexibility and the ability to provide personalized recommendations and support.
Put simply, traditional AI is adept at recognizing patterns, whereas generative AI shines in creating patterns. While traditional AI can analyze data and report its findings, generative AI can leverage the same data to generate entirely novel outputs.
What requirements come with the use of GPAI?
- All GPAI model providers must furnish technical documentation and usage instructions for their models, comply with copyright law, and publicly describe the content used to train the AI.
- Providers of GPAI models released under free and open licenses need only comply with copyright law and publish a summary of the training data, unless their models pose systemic risks.
- All providers of GPAI models that pose systemic risks, whether open or closed, must also conduct model evaluations, monitor and report serious incidents involving their models, and ensure adequate cybersecurity protection.
Failure to comply with the prescribed legal norms may result in fines of up to €35 million or 7% of global annual turnover, whichever is higher.
Conclusion
To implement a risk-oriented approach, the law establishes the following rules:
- Prohibits AI systems that pose unacceptable risks. Such systems may only be permitted for use in exceptional circumstances, for law enforcement purposes by court order (eg, real-time facial recognition to search for a missing child).
- Identifies a list of AI systems with high levels of risk and sets clear requirements for such systems and the companies that develop and deploy them.
- Requires compliance assessments before introducing AI systems with high levels of risk into operation or bringing them to market.
- Specifies transparency requirements for AI systems with limited risk and completely deregulates AI with minimal risk.