Retrieval-augmented generation for generative artificial intelligence in health care
