Large language model for interview-based depression diagnosis: an empirical study

胡鹏博; 李宏; 李星宇; 李春晓; 周熠

doi:10.52396/JUSTC-2023-0088

Abstract

The automatic diagnosis of depression plays a crucial role in preventing the deterioration of depression symptoms. The interview-based method is the most widely adopted technique in depression diagnosis. However, the collected conversation data are limited in size, and the sample distributions of different participants usually differ drastically. These factors make it very challenging to build a sound deep learning model for automatic depression diagnosis. Recently, large language models have demonstrated impressive capabilities and achieved human-level performance in various tasks under zero-shot and few-shot scenarios. This sheds new light on the development of AI solutions for domain-specific tasks with limited data. In this paper, we propose a two-stage approach that exploits ChatGPT, currently the most capable and cost-effective language model, to diagnose depression from interview-based data. Specifically, in the first stage, we use ChatGPT to summarize the raw dialogue sample, thereby facilitating the extraction of depression-related information. In the second stage, we use ChatGPT to classify the summarized data to predict the depression state of the sample. Our method achieves approximately 76% accuracy with a text-only modality on the DAIC-WOZ dataset. In addition, our method outperforms the state-of-the-art model by 6.2% on the D⁴ dataset. Our work highlights the potential of using large language models for interview-based depression diagnosis.
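The two-stage idea described in the abstract (summarize first, then classify the summary) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the prompt wording, the `DEPRESSED` / `NOT_DEPRESSED` labels, and the `llm` callable (any function that maps a prompt string to a reply string) are hypothetical and are not taken from the paper.

```python
from typing import Callable

# Hypothetical prompt templates; the paper's actual prompts are not shown here.
SUMMARY_PROMPT = (
    "Summarize the following interview, keeping details relevant to the "
    "participant's mood, sleep, and daily functioning.\n\n{dialogue}"
)
CLASSIFY_PROMPT = (
    "Based on this interview summary, answer with exactly one label, "
    "DEPRESSED or NOT_DEPRESSED.\n\n{summary}"
)


def parse_label(answer: str) -> bool:
    """Map the model's reply to a boolean; True means 'depressed'."""
    text = answer.strip().upper()
    # Check the negative label first, because it contains the positive one.
    if "NOT_DEPRESSED" in text:
        return False
    if "DEPRESSED" in text:
        return True
    raise ValueError(f"unrecognized label in reply: {answer!r}")


def diagnose(dialogue: str, llm: Callable[[str], str]) -> bool:
    """Stage 1 condenses the raw dialogue; stage 2 classifies the summary."""
    summary = llm(SUMMARY_PROMPT.format(dialogue=dialogue))
    answer = llm(CLASSIFY_PROMPT.format(summary=summary))
    return parse_label(answer)
```

In practice `llm` would wrap a call to a chat model such as ChatGPT; keeping it as a plain callable lets the pipeline be exercised with a stub before any API is involved.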
