Transforming electronic case reports with generative AI: Unlocking faster public health responses
For years, public health agencies have relied on paper-based case report forms to supplement the electronic laboratory reports (ELRs) they receive for reportable diseases. While ELRs provide positive test results, the accompanying case reports give public health agencies critical clinical, demographic, and risk factor data needed for effective disease investigation and response.
However, the sheer volume of COVID-19 cases quickly overwhelmed this manual, paper-based process. Prior to the pandemic, the Office of the National Coordinator for Health IT (ONC) and the Centers for Disease Control and Prevention (CDC) developed standards for an electronic case report (eCR) form that could be automatically sent to public health agencies from providers’ electronic health records (EHRs).
In response to the COVID-19 data crisis, the CDC launched the eCR Now initiative to accelerate eCR adoption across the country. As a result, public health agencies now receive a flood of detailed eCR data for COVID-19 and other reportable conditions. While this wealth of information is invaluable, the sheer volume overwhelms public health systems and staff.
Amazon Bedrock is a fully managed service on Amazon Web Services (AWS). Using foundation models (FMs) available on Amazon Bedrock can help public health systems manage the volume by automatically extracting the key public health actionable data from eCRs. This empowers agencies to focus their limited resources on the most critical disease investigation and response activities.
Amazon Bedrock makes FMs from leading artificial intelligence (AI) startups and AWS available through an API, allowing you to choose from a wide range of FMs to find the model best suited for your use case. With the Amazon Bedrock serverless experience, you can get started quickly, privately customize FMs with your own data, and integrate and deploy them into your applications using the AWS tools without having to manage any infrastructure.
This post will explore how Amazon Bedrock and other AWS services can strengthen public health surveillance and response, even as the volume of eCR data continues to grow.
The challenge
While eCRs offer public health agencies more comprehensive data, jurisdictions quickly find themselves inundated with information they can’t effectively process. A single reportable condition can generate multiple eCRs over time based on the trigger codes within the electronic health record (EHR) system.
Each eCR contains extensive data beyond public health needs, such as medical procedures, medications, and laboratory tests. The sheer volume and size of eCRs make it challenging for agencies to utilize the information despite CDC efforts to enable human-readable formats.
Additionally, key public health data is inconsistently formatted across eCRs, with no consistent tags or fields for concepts like travel history or symptom onset. eCRs also include irrelevant sensitive information that public health agencies do not have the authority to collect, like mental health diagnoses and related medications.
A generative AI–based approach
Rather than transforming eCR data into a standardized format, AWS proposes a different approach. Public health agencies know the key elements they need from an eCR, such as whether a hospitalized COVID-19 patient is on a vasopressor, whether a Hepatitis A case involves a food handler, or whether a syphilis case involves a person of childbearing age without a negative pregnancy test or appropriate treatment.
Instead of reading through the entire eCR to identify the data elements, the suggestion is to use generative AI. By using prompt engineering, agencies can ask specific questions and generate a table of actionable data elements to prioritize cases or import into surveillance systems.
In addition to utilizing the information within the eCR, public health agencies can incorporate Knowledge Bases for Amazon Bedrock to enhance the insights. For example, when asking about a syphilis case, the system could not only extract relevant data from the eCR but also check against a knowledge base on appropriate syphilis treatment to determine if the patient received the proper treatment for syphilis. This approach allows agencies to prioritize cases for follow-up, like persons of childbearing age with syphilis who have not had a pregnancy test or appropriate treatment.
Technical details and implementation
The proposed generative AI-based approach to streamline eCR data extraction and utilization involves the following steps, as shown in Figure 1.
- Submit the eCR document with a natural language prompt specifying the data elements to extract, such as hospitalization status or appropriate treatment, to the Anthropic Claude 3 Haiku model on Amazon Bedrock.
- The Anthropic Claude 3 Haiku offered model by Amazon Bedrock then processes the eCR and extracts the requested data elements, presenting the information in a structured format.
- The extracted data is validated against the information stored in Knowledge Bases for Amazon Bedrock, which contains relevant organizational documents and guidelines, to ensure accuracy and completeness.
- If the extracted data does not align with the knowledge base, the system will trigger a notification using Amazon Simple Notification Service (Amazon SNS) to alert the public health agency.
- The validated data elements are seamlessly integrated into the agency’s disease surveillance systems, enabling efficient case prioritization, contact tracing, and intervention implementation.
This end-to-end workflow, orchestrated using AWS Step Functions, calls to Amazon Bedrock, and knowledge bases are made using microservices on AWS Lambda. The organization’s document information is converted into embeddings and stored in Amazon OpenSearch Service vector databases, enabling efficient retrieval and validation of the extracted data. By using this workflow, public health agencies can streamline the process of extracting and utilizing critical data from eCRs.
The incorporation of knowledge bases and data validation further enhances the reliability and trustworthiness of the insights generated from eCR data, enabling agencies to focus on the specific information they need without having to parse through the vast amounts of data in each eCR document.
Note that at the time of writing this post, we are using the Anthropic Claude 3 Haiku model. However, you can apply this same solution with the latest models being released by Anthropic, as the approach is not limited to a specific model.
Watch this short video for a demonstration of how you can use Amazon Bedrock to process eCRs and extract key data elements using a simple natural language prompt.
Conclusion
While the eCR initiative holds great promise, the current challenges in extracting actionable insights from these unstructured documents are hindering the ability of public health professionals to effectively monitor, investigate, and respond to reportable diseases.
By using the power of generative AI to extract the specific data elements that public health authorities require, we have the opportunity to transform the way that eCR data is consumed and utilized. This approach not only streamlines workflows but also ensures that public health agencies can focus their limited resources on the most critical cases and interventions.
If you’re a public health leader grappling with the challenges of eCR data, we encourage you to explore how generative AI can be applied to your workflows. Learn more about how Amazon Bedrock can help with other public health use cases, such as scalable intelligent document processing, and contact us to learn more about how our solutions can help your agency unlock the full potential of electronic case reporting.
Together, we can build a more resilient and responsive public health infrastructure that is prepared to meet the evolving needs of our communities.
Read related stories on the AWS Public Sector Blog:
link