NLP-AI4Health: Second Workshop on Integrating NLP and AI for Multilingual and Patient-Centric Healthcare Communication

On 23rd December, 2025 at IJCNLP-AACL 2025

Venue - Mumbai, India (20th-24th December, 2025)

About NLP-AI4Health 2025

NLP-AI4Health 2025 is a focused workshop on how Natural Language Processing (NLP) and Artificial Intelligence (AI) can improve multilingual and patient-centered healthcare communication. Organized by CMC Vellore and IIIT-Hyderabad (LTRC), the workshop provides a platform to advance inclusive, accessible, and culturally aware language technologies for healthcare. We invite research contributions on topics such as low-resource medical NLP, simplification of clinical documents, speech-based tools for Electronic Health Records (EHRs), and ethically aligned, culturally sensitive AI systems. A key feature of this year’s workshop is a shared task on Multilingual Health Question Answering, which encourages the development of inclusive and robust QA systems across Indian languages. Held alongside IJCNLP-AACL 2025, the event will include keynote talks, paper presentations, and collaborative discussions, bringing together researchers, clinicians, developers, and policy experts to co-create practical and impactful language solutions for healthcare in diverse linguistic settings.

Objectives

The primary goal of this workshop is to explore how advances in Natural Language Processing (NLP), Computational Linguistics (CL), and Artificial Intelligence (AI) can be leveraged to address communication challenges in multilingual and resource-constrained healthcare environments.

Understand communication challenges in multilingual and low-resource healthcare settings.

Build tools to simplify and translate patient documents like consent forms and discharge summaries.

Improve speech-based systems for clinical documentation and multilingual EHR entry.

Promote collaboration between healthcare professionals and language technology researchers.

Organize a shared task on health question answering in Indian languages to support low-resource NLP research.

Call for Papers

We invite submissions of original research papers, position papers, and system demonstrations that address challenges at the intersection of language technologies and healthcare communication, particularly in multilingual and low-resource settings. Submissions may describe completed work, ongoing projects, preliminary results, or innovative ideas. Relevant topics include, but are not limited to:

1. NLP for multilingual and low-resource healthcare applications
2. Translation and simplification of patient documents (e.g., consent forms, summaries)
3. Speech and voice technologies for clinical data capture
4. Language model development for underrepresented languages in health domains
5. Adapting and fine-tuning LLMs for healthcare-specific use
6. Interfaces for patients with low literacy or special accessibility needs
7. NLP tools that support culturally sensitive patient care and education
8. Ethical and equitable AI use in healthcare language technologies
9. Real-world case studies on multilingual healthcare NLP deployments
10. Shared task systems, datasets, and evaluation methodologies

Workshop Timeline

  • First Call for Papers: July 22, 2025
  • Second Call for Papers: August 22, 2025
  • Third Call for Papers: September 22, 2025
  • Submission Deadline: September 29, 2025
  • ARR Commitment Deadline: October 27, 2025
  • Notification of Acceptance: November 3, 2025
  • Camera-ready Papers Due: November 11, 2025
  • Proceedings Due: December 1, 2025
  • Workshop: December 23, 2025

Submission Guidelines

Please use the ACL 2025 style template for your submission. The submission should be anonymized for double-blind review.

Page limits:

  • 4 pages for short papers (excluding references and appendices)
  • 8 pages for full papers (excluding references and appendices)

Accepted papers will be published in the ACL Anthology as part of the IJCNLP-AACL 2025 workshop proceedings.

Shared Task on Patient-Centric Question Answering

Multilingual Health Question Answering for Head and Neck Cancer and Cystic Fibrosis

Task Overview

This shared task challenges participants to build models that can generate concise summaries and answer patient-centric questions based on natural, multi-turn dialogues related to Head and Neck Cancer (HNC) and Cystic Fibrosis. The dialogues are real-world conversations between patients and healthcare providers, focusing on providing reliable, understandable information to patients and caregivers.

Sample Dataset


The NLP4Health Dataset is part of the Shared Task on Patient-Centric Question Answering. Key features include:

  • Organized by patient scenarios with multiple consultation instances per case
  • Multiple file formats including conversation transcripts, structured summaries, and QA pairs
  • Two participation tracks: Closed Task (provided data only) or Open Task (external resources allowed)

Data Description

Training Set:

  • 50K validated dialogues in English, Hindi, Telugu, Tamil, Bangla, Gujarati, Kannada and Dogri on HNC and Cystic Fibrosis
  • Synthetically generated dialogues validated by humans for medical accuracy and cultural appropriateness
  • Each dialogue includes:
    • A multi-turn conversation between patients and healthcare providers
    • A summary capturing the main information points
    • 4-5 question-answer pairs derived from the dialogue

Test Set:

  • 5K unseen dialogues for which participants must generate summaries and answers to new questions
  • Gold summaries and answers validated by medical experts will be used for evaluation

Task Objectives

Participants are asked to build models with fewer than 3 billion parameters that can:

  • Generate informative and accurate summaries of multi-turn healthcare dialogues
  • Answer patient-centric questions based on the dialogue content with high factual correctness and clarity
  • Support cross-lingual input-output: The model should be capable of taking input dialogues in one language and generating summaries and answers in a different requested target language from the supported set
  • Participants may use any available external data for training (open task) or must strictly use the provided training data (closed task)
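
As a rough aid for the 3 billion parameter limit above, the following is a minimal sketch of a budget check. It works on a list of parameter tensor shapes, so it runs without any deep learning library. The names `PARAM_LIMIT`, `count_parameters`, and `within_budget` are illustrative, and the toy shapes are hypothetical. How the organizers count tied or shared weights is not stated here, so treat the result as an estimate.

```python
from math import prod
from typing import Iterable, Sequence

# Limit stated in the task description: fewer than 3 billion parameters.
PARAM_LIMIT = 3_000_000_000


def count_parameters(shapes: Iterable[Sequence[int]]) -> int:
    """Sum the number of elements over all parameter tensor shapes."""
    return sum(prod(shape) for shape in shapes)


def within_budget(shapes: Iterable[Sequence[int]], limit: int = PARAM_LIMIT) -> bool:
    """Return True when the total parameter count is strictly below the limit."""
    return count_parameters(shapes) < limit


# Hypothetical toy model: an embedding table plus one linear layer with bias.
toy_shapes = [(32_000, 512), (512, 512), (512,)]
print(count_parameters(toy_shapes))
print(within_budget(toy_shapes))
```

With a PyTorch model, the same function could be called as `count_parameters(p.shape for p in model.parameters())`.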

Evaluation

Models will be evaluated on:

Automatic Metrics:

  • ROUGE, BLEU, BERTScore for summaries
  • Exact Match (EM), F1 score for QA pairs
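
For orientation, Exact Match and token-level F1 can be sketched as below. This follows the common extractive-QA convention of comparing normalized token lists. The normalization steps (NFKC, casefolding, punctuation removal, whitespace tokenization) are assumptions, and the official evaluation scripts may normalize and tokenize differently, especially for Indian-language text.

```python
import unicodedata
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Unicode-normalize, casefold, drop punctuation, split on whitespace."""
    text = unicodedata.normalize("NFKC", text).casefold()
    # Replace punctuation with spaces so that "yes." and "yes" compare equal.
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )
    return cleaned.split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token lists are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall against the reference."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Yes.", "yes"))
print(token_f1("drink more water", "drink water"))
```

In the second call, two of the three predicted tokens appear in the two-token reference, so precision is 2/3 and recall is 1.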

Human Evaluation:

  • Medical experts will assess factual accuracy, completeness, and helpfulness of generated summaries and answers
  • Human evaluation will be conducted for the question answering task on the English and translated test data

Submission Guidelines

Participants should submit:

  • Generated summaries for all test dialogues
  • Answers to all test questions
  • A resource link (e.g., GitHub or Hugging Face) for the model, which must have fewer than 3B parameters

Submit results on Codabench, via the shared task website, before the deadline.

Data Usage and Access

  • The training, validation, and test data will be released in standard JSON format containing dialogues, summaries, and QA pairs
  • The data is licensed for the shared task purposes only
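
Until the data is released, the exact JSON schema is unknown. The sketch below shows one way a record with a dialogue, a summary, and QA pairs could be loaded. The field names `dialogue`, `summary`, `qa_pairs`, `question`, and `answer` are hypothetical placeholders, as is the sample record, and should be replaced with the names used in the released files.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class QAPair:
    question: str
    answer: str


@dataclass
class DialogueRecord:
    dialogue: List[str]      # one string per conversation turn (assumed layout)
    summary: str
    qa_pairs: List[QAPair]


def parse_record(raw: str) -> DialogueRecord:
    """Parse one JSON object into a typed record; raises KeyError on missing fields."""
    obj = json.loads(raw)
    return DialogueRecord(
        dialogue=list(obj["dialogue"]),
        summary=obj["summary"],
        qa_pairs=[QAPair(q["question"], q["answer"]) for q in obj["qa_pairs"]],
    )


# Placeholder record, not taken from the released data.
sample = json.dumps({
    "dialogue": ["Patient: <turn 1>", "Provider: <turn 2>"],
    "summary": "<summary text>",
    "qa_pairs": [{"question": "<question>", "answer": "<answer>"}],
})
record = parse_record(sample)
print(len(record.dialogue), len(record.qa_pairs))
```

Failing loudly on a missing field is a deliberate choice here, so that schema mismatches surface at load time instead of during training.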

Baselines and Resources

  • Baseline models and evaluation scripts will be provided upon data release
  • Suggested toolkits and example code will be made available

Timeline

1. First Call for Participation (Registration): August 01, 2025
2. Second Call for Participation (Registration): August 28, 2025
3. Release of Sample Submission Dataset: September 04, 2025
4. Release of Training Dataset: September 25, 2025
5. Release of Test Dataset: October 04, 2025
6. System Submission Deadline: October 10, 2025
7. Result Announcement: October 17, 2025
8. Final Papers Submission Deadline: October 24, 2025
9. Notification of acceptance/writing papers for shared task: November 03, 2025
10. Camera Ready Papers Due: November 11, 2025
11. Pre-recorded Video Due: November 21, 2025
12. Workshop Date: December 23, 2025

Important: Registrations received after the test data release will not be accepted.

Invited Speakers

Tanmoy Chakraborty, IIT Delhi

Workshop Schedule (Tentative)

1. Opening Ceremony & Welcome Address

Setting the stage for the workshop

[10:00 am to 10:30 am]

2. Keynote Address 1

[10:30 am to 11:20 am]

3. Research Paper Presentations

Workshop papers and shared task submissions - 6 presentations (15 minutes each)

[11:30 am to 1:00 pm]

4. Lunch Break

[1:00 pm to 2:00 pm]

5. Keynote Address 2

[2:00 pm to 2:45 pm]

6. Panel Discussion

Invited system demonstration/industry talk on AI for Healthcare

[2:45 pm to 3:30 pm]

7. Interactive Poster Session & High Tea

[3:30 pm to 4:30 pm]

Expected Outcomes

  • By the end of the workshop, participants will have a deeper understanding of the current challenges and opportunities in deploying language technologies in healthcare.
  • We expect the workshop to help build an interdisciplinary community of experts who can work together on practical solutions to healthcare communication challenges using NLP, AI, and other language technologies.
  • The workshop will provide potential directions for future research and collaboration.

Program Committee Members

  • Miguel Angel Rios Gaoana, University of Vienna
  • Vincent Briva Iglesias, School of Applied Language and Intercultural Studies (SALIS), Dublin City University
  • Sara Vecchiato, Università degli Studi di Udine
  • Sneha Mithun, Tata Memorial Hospital, Mumbai
  • Ashish Kumar Jha, Tata Memorial Hospital, Mumbai
  • Dr. Dilip Abraham, Christian Medical College, Vellore
  • Sivakumar Balasubramanian, Christian Medical College, Vellore
  • Sonish Sivarajkumar, School of Computing and Information, University of Pittsburgh
  • Asif Ekbal, Department of Computer Science and Engineering, IIT Patna

Organizers

  • Arun Zechariah, CMC Vellore
  • Balukrishna S, CMC Vellore
  • Dipti Misra Sharma, IIIT Hyderabad
  • Hannah Thomas, CMC Vellore
  • Joy Mammen, CMC Vellore
  • Parameswari Krishnamurthy, IIIT Hyderabad
  • Vandan Mujadia, IIIT Hyderabad