

Exploration of augmented prompting methods for information extraction using large language models


Abstract: Information extraction (IE) aims to automatically identify and extract specific information from raw text. Despite the abundance of solutions based on fine-tuning pretrained language models, IE in few-shot and zero-shot scenarios remains highly challenging due to the scarcity of training data. Large language models (LLMs), on the other hand, can generalize well to unseen tasks from few-shot demonstrations or even zero-shot instructions, and have demonstrated impressive capabilities across a wide range of natural language understanding and generation tasks. Nevertheless, it remains unclear whether this effectiveness carries over to IE, where the target tasks involve specialized schemas and rather abstract entity and relation concepts. In this paper, we first examine the validity of LLMs in executing IE tasks with an established prompting strategy, and further propose several augmented prompting methods: the structured fundamental prompt (SFP), the structured interactive reasoning prompt (SIRP), and the voting-enabled structured interactive reasoning prompt (VESIRP). The experimental results demonstrate that while direct prompting yields inferior performance, the proposed augmented prompting methods significantly improve extraction accuracy, achieving performance comparable to or even better than that of state-of-the-art methods that require large-scale training samples (e.g., on zero-shot FewNERD and FewNERD-INTRA). This study represents a systematic exploration of employing instruction-following LLMs for IE. It not only establishes a performance benchmark for this novel paradigm but, more importantly, validates a practical technical pathway through the proposed prompt enhancement methods, offering a viable solution for efficient IE in low-resource settings.
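The structured-prompt idea summarized in the abstract can be sketched roughly as follows. This is a minimal illustration assuming a flat list of entity types and a JSON output format; the function name, instruction wording, and schema below are hypothetical, not the paper's actual SFP template.

```python
# A minimal, hypothetical sketch of a structured prompt for zero-shot NER.
# The schema wording and output format are illustrative assumptions; the
# paper's actual SFP template may differ.

def build_structured_prompt(text: str, entity_types: list[str]) -> str:
    """Assemble a schema-guided extraction prompt for an LLM."""
    schema = ", ".join(entity_types)
    return (
        "You are an information extraction system.\n"
        f"Entity types: {schema}\n"
        "Extract every entity mention from the text below and return a JSON "
        'list of objects of the form {"span": ..., "type": ...}.\n'
        "Return [] if no entity is present.\n\n"
        f"Text: {text}"
    )

prompt = build_structured_prompt(
    "Alan Turing worked at Bletchley Park.",
    ["person", "organization", "location"],
)
print(prompt.splitlines()[1])  # -> Entity types: person, organization, location
```

Constraining the output to a fixed schema and a machine-parseable format is what makes the extraction step checkable, which the interactive-reasoning and voting variants can then build on.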

     

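The voting mechanism behind VESIRP can be illustrated with a simple majority vote over several sampled extraction runs. The function below is a hypothetical sketch, not the paper's implementation: it keeps a (span, type) prediction only when at least half of the runs propose it, filtering out predictions that appear in only one sample.

```python
# Hypothetical sketch of majority voting over repeated extraction runs,
# in the spirit of the voting-enabled prompting (VESIRP) described above.
from collections import Counter

def vote(extractions: list[list[tuple[str, str]]], threshold: float = 0.5):
    """Keep (span, type) pairs proposed by at least `threshold` of the runs."""
    counts = Counter(pair for run in extractions for pair in set(run))
    quorum = threshold * len(extractions)
    return sorted(pair for pair, c in counts.items() if c >= quorum)

# Three sampled runs disagree on one entity's type; voting resolves it.
runs = [
    [("Alan Turing", "person"), ("Bletchley Park", "location")],
    [("Alan Turing", "person"), ("Bletchley Park", "organization")],
    [("Alan Turing", "person"), ("Bletchley Park", "location")],
]
print(vote(runs))  # -> [('Alan Turing', 'person'), ('Bletchley Park', 'location')]
```

The threshold is a free parameter: a higher quorum trades recall for precision, which matters in zero-shot settings where individual samples are noisy.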