数据资源Chinese conversational medical QA textChinese medical conversational QA datasetLarge-scale Chinese medical CQA dataset; see official repository开放访问 CMCQA:中文医学会话问答数据集
CMCQA is a large Chinese medical conversational question-answering dataset released with knowledge-grounded medical dialogue research. It supports medical conversation QA, knowledge-grounded response generation, and evaluation of Chinese medical dialogue systems.
数据资源Chinese medical question-answer textChinese medical QA corpusAbout 26 million medical QA pairs开放访问 Huatuo-26M:大规模中文医学问答数据集
Huatuo-26M is a large-scale Chinese medical question-answering dataset with about 26 million QA pairs collected for medical language modeling and medical dialogue research. It is suitable for Chinese medical LLM pretraining, fine-tuning, and QA system development.
数据资源Chinese consultation dialogue text with medical entity annotationsChinese medical dialogue generation datasetEntity-annotated dialogue dataset; see official repository开放访问 MedDG:实体中心中文医学对话生成数据集
MedDG is an entity-centric Chinese medical consultation dataset with domain entity annotations for medical dialogue generation. It supports entity-aware response generation, medical consultation modeling, and dialogue systems that ground responses in clinical concepts.