GPT-RE: In-context Learning for Relation Extraction using Large Language Models

Abstract

In spite of the potential for ground-breaking achievements offered by large language models (LLMs) (e.g., GPT-3) via in-context learning (ICL), they still lag significantly behind fully supervised baselines (e.g., fine-tuned BERT) in relation extraction (RE). This is due to two major shortcomings of ICL for RE: (1) the low relevance regarding entity and relation of existing sentence-level demonstration retrieval approaches for ICL; and (2) the lack of explanation of input-label mappings in demonstrations, which limits ICL effectiveness. In this paper, we propose GPT-RE to address these issues by (1) incorporating task-aware representations into demonstration retrieval and (2) enriching demonstrations with gold label-induced reasoning logic. We evaluate GPT-RE on four widely used RE datasets and observe that it improves not only over existing GPT-3 baselines but also over fully supervised baselines (see Figure 1 of the paper). Specifically, GPT-RE achieves state-of-the-art performance on the SemEval and SciERC datasets, and competitive performance on the TACRED and ACE05 datasets. In addition, a critical issue of LLMs revealed by previous work, namely the strong inclination to wrongly classify NULL examples into pre-defined labels, is substantially alleviated by our method; we support this with an empirical analysis.
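The two components can be pictured with a short sketch. The following Python is a minimal illustration under stated assumptions, not the authors' implementation: the encoder stand-in, the function names, and the `clue` field are all hypothetical. It shows (1) retrieving demonstrations in an entity-marked, task-aware embedding space rather than plain sentence space, and (2) enriching each retrieved demonstration with gold label-induced reasoning before building the ICL prompt.

```python
import numpy as np

def mark_entities(sentence: str, head: str, tail: str) -> str:
    """Wrap the entity pair in markers so the embedding is task-aware,
    reflecting the entities and relation rather than surface similarity."""
    return (sentence.replace(head, f"<e1> {head} </e1>", 1)
                    .replace(tail, f"<e2> {tail} </e2>", 1))

def embed(text: str) -> np.ndarray:
    """Stand-in for any sentence encoder (e.g., a fine-tuned BERT);
    a deterministic random vector keeps this sketch self-contained."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(768)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve_demonstrations(query: dict, pool: list[dict], k: int = 3) -> list[dict]:
    """Pick the k training examples nearest to the query in the
    entity-marked (task-aware) embedding space."""
    q = embed(mark_entities(query["sentence"], query["head"], query["tail"]))
    return sorted(
        pool,
        key=lambda ex: -cosine(q, embed(mark_entities(ex["sentence"], ex["head"], ex["tail"]))),
    )[:k]

def format_demo(ex: dict) -> str:
    """Enrich a demonstration with gold label-induced reasoning so the
    prompt explains *why* the input maps to the label."""
    return (f"Sentence: {ex['sentence']}\n"
            f"Reasoning: '{ex['head']}' relates to '{ex['tail']}' as "
            f"'{ex['label']}' because {ex['clue']}\n"
            f"Relation: {ex['label']}")

# Usage: assemble the ICL prompt from retrieved, reasoning-enriched demos.
pool = [{"sentence": "Marie Curie was born in Warsaw.",
         "head": "Marie Curie", "tail": "Warsaw",
         "label": "per:city_of_birth",
         "clue": "the sentence states her birthplace explicitly."}]
query = {"sentence": "Alan Turing was born in London.",
         "head": "Alan Turing", "tail": "London"}
demos = retrieve_demonstrations(query, pool)
prompt = ("\n\n".join(format_demo(d) for d in demos)
          + f"\n\nSentence: {query['sentence']}\nRelation:")
print(prompt)
```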

Industrial Applications and Fields of Use

Large Language Models, Information Extraction

Researchers

Name	Course	Laboratory	Position/Year
WAN Zhen	Intelligence Science and Technology Course	Language Media Processing	1st-year doctoral student
Sadao Kurohashi	Intelligence Science and Technology Course	Language Media Processing	Professor
CHENG Fei	Intelligence Science and Technology Course	Language Media Processing	Assistant Professor

Web Site

https://arxiv.org/abs/2305.02105