AI4Meder
返回数据资源列表
数据资源Chinese medical question-answer textChinese medical QA corpusAbout 26 million medical QA pairs开放访问

Huatuo-26M:大规模中文医学问答数据集

Huatuo-26M is a large-scale Chinese medical question-answering dataset with about 26 million QA pairs collected for medical language modeling and medical dialogue research. It is suitable for Chinese medical LLM pretraining, fine-tuning, and QA system development.

数据集默认配图 - 临床语言智能

数据资源详情

数据模态
Chinese medical question-answer text
资源类别
Chinese medical QA corpus
数据规模
About 26 million medical QA pairs
许可协议
Dataset-specific terms; see official repository
访问方式
开放访问
适用任务
medical QA training、medical language model fine-tuning、medical dialogue pretraining
来源
FreedomIntelligence Huatuo-26M GitHub