NLM DIR Seminar Schedule

Scheduled Seminars on Dec. 16, 2025

Speaker: Sarvesh Soni
PI/Lab: Dina Demner-Fushman
Time: 11 a.m.
Presentation Title: ArchEHR-QA: A Dataset and Shared Task for Grounded Question Answering from Electronic Health Records
Location: Hybrid
In-person: Building 38A/B2N14 NCBI Library or Meeting Link

Contact NLMDIRSeminarScheduling@mail.nih.gov with questions about this seminar.

Abstract:

Drafting responses to patient questions, many of which concern the patients' own medical records, is a major and growing source of clinician burden. Yet most question answering (QA) research focuses on clinician information needs or relies on general health resources, and rarely links answers back to specific evidence in the electronic health record (EHR). In this talk, I will present ArchEHR-QA, a novel benchmark dataset designed to study grounded, patient-specific QA from EHRs. The dataset aligns real patient questions from public forums with discharge summaries from the MIMIC-III/IV clinical databases. Each of the 134 cases includes a patient question, a clinician-interpreted question, a curated note excerpt with sentence-level relevance labels, and a clinician-authored answer that explicitly cites supporting sentences, along with clinical specialty tags.

I will then give an overview of the ArchEHR-QA 2025 shared task, hosted at the ACL 2025 BioNLP Workshop. Given the patient question, clinician question, and note excerpt, participating systems generated text answers with explicit citations to specific note sentences. Our evaluation framework measured both factuality (correct citation of clinical evidence) and relevance (answer quality). We received 75 system submissions from 29 international teams, spanning retrieval-augmented pipelines, prompt-only large language models, and adapted models. I will summarize common modeling strategies and discuss implications for using LLMs to draft responses to patient questions.
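
To make the idea of citation-based factuality concrete, here is a minimal sketch of one way such a score could be computed: precision, recall, and F1 between the note sentences an answer cites and the sentences labeled relevant. The abstract does not specify the scoring formula, so the function name `citation_f1`, the use of F1, and the integer sentence IDs are illustrative assumptions, not the shared task's official scorer.

```python
def citation_f1(cited_ids, relevant_ids):
    """Compare cited sentence IDs with sentences labeled relevant.

    Hypothetical helper for illustration only; the official
    ArchEHR-QA scoring may differ.
    Returns (precision, recall, f1).
    """
    cited = set(cited_ids)
    relevant = set(relevant_ids)

    # Convention assumed here: nothing to cite and nothing cited is a perfect match.
    if not cited and not relevant:
        return 1.0, 1.0, 1.0

    true_positives = len(cited & relevant)
    precision = true_positives / len(cited) if cited else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Example: the answer cites sentences 1, 2, and 5; sentences 1-4 are labeled relevant.
precision, recall, f1 = citation_f1([1, 2, 5], [1, 2, 3, 4])
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this example two of the three cited sentences are relevant and two of the four relevant sentences are cited, so the score penalizes both the spurious citation and the missing evidence.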