NLM DIR Seminar Schedule

Scheduled Seminars on Jan. 13, 2026

Speaker: Qiao Jin
PI/Lab: Zhiyong Lu
Time: 11 a.m.
Presentation Title: Language Modeling for Medical Calculations
Location: Hybrid. In-person: Building 38A/B2N14 NCBI Library or Meeting Link

Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.

Abstract:

Medical calculators are evidence-based tools that provide quantitative decision support in disease diagnosis, risk stratification, treatment planning, and prognosis prediction. While large language models (LLMs) have shown state-of-the-art performance in medical question answering and clinical trial matching, their capabilities for medical calculation remain unclear. To address this, we first present benchmarks for evaluating the capabilities of LLMs in medical calculator recommendation (MedQA-Calc, AMIA 2025) and usage (MedCalc-Bench, NeurIPS 2024). We find that off-the-shelf LLMs do not possess sufficient knowledge of medical calculators and are prone to various calculation errors. To bridge this gap, we introduce AgentMD (Nat Commun 2025), an AI agent capable of curating and applying medical calculators. Using the medical literature, AgentMD curates over 2,000 medical calculators, which achieve over 85% accuracy on expert quality checks and a pass rate of over 90% on unit tests. Augmented by these self-curated calculators and a code executor, AgentMD substantially surpasses off-the-shelf GPT-4 on risk prediction. Results on 698 real-world emergency department notes confirm that AgentMD accurately computes medical risks at the individual level. Moreover, AgentMD can provide population-level insights for institutional risk management. In summary, this trilogy of work presents first-of-its-kind benchmarks and methodologies in language modeling for medical calculations, laying the research foundations for this field.