Ngram Expands AI Dataset Offerings with Descriptive Question Answering for Healthcare

New open-source medchat-qa-descriptive dataset enables AI models to provide comprehensive answers to complex medical queries.

San Francisco, CA, March 22, 2024 -- Ngram, a pioneering generative AI company in the life sciences industry, today announced the release of its medchat-qa-descriptive dataset on Hugging Face. This new dataset complements Ngram's previously launched medchat-qa dataset by providing AI models with the capability to answer complex, open-ended medical questions that require detailed, descriptive responses.

The medchat-qa-descriptive dataset contains thousands of real-world queries from healthcare professionals (HCPs) across a broad range of medical topics. Unlike factoid questions with straightforward answers, these queries demand comprehensive explanations and insights drawn from scientific literature.

"While our initial medchat-qa dataset revolutionized how AI systems handle factual medical queries, there remained a significant gap in addressing more nuanced inquiries that require deeper reasoning and context," said Anish Muppalaneni, CEO and co-founder of Ngram. "With medchat-qa-descriptive, we are empowering AI models to provide rich, substantive responses that truly meet the needs of healthcare professionals and their patients."

As Devadutta Ghat, CTO and co-founder of Ngram, explained, "Developing AI capable of understanding and responding to descriptive medical questions is a monumental challenge that requires extensive training data. By open-sourcing this new dataset, we aim to accelerate research in this critical area and drive the creation of AI assistants capable of engaging in intelligent dialogue with clinicians."

The medchat-qa-descriptive dataset features several key attributes:

- Over 1,000 open-ended questions from HCPs across more than 50 therapeutic areas.
- Detailed, multi-paragraph descriptive answers compiled from authoritative medical sources.
- Diverse query topics, including disease mechanisms, treatment rationales, and patient education.
- Freely available under an open-source license on the Hugging Face dataset repository.
- Available at this URL:

By combining the new descriptive dataset with the existing medchat-qa factoid dataset, Ngram provides a comprehensive resource for evaluating AI models' ability to understand and respond to the full spectrum of medical inquiries. "The ability to provide substantive, context-rich information is crucial for healthcare professionals making treatment decisions," Anish Muppalaneni noted. "With our expanded dataset offerings, we are equipping AI to be an invaluable assistant that can quickly surface relevant insights from vast medical knowledge bases."

The release of medchat-qa-descriptive reinforces Ngram's commitment to accelerating AI innovation that enhances access to medical information and improves patient outcomes. To learn more about the new dataset and Ngram's mission, visit

About Ngram
Ngram is dedicated to empowering life sciences organizations with generative AI technology. The company's solutions help medical affairs and pharmacovigilance teams instantly access pertinent information, reducing response times from days to seconds. Founded in 2022 by Anish Muppalaneni and Devadutta Ghat, Ngram has received support from prominent investors in digital health and enterprise AI. For more information, visit

Media Contact
Anish Muppalaneni
CEO, Ngram Inc