Nov 17 – 20, 2025
Bled, Slovenia
Europe/Ljubljana timezone

Matching meaning: Evaluating ChatGPT’s ability to assign corpus examples to dictionary senses of polysemous sound-related verbs

Nov 19, 2025, 3:00 PM
30m
Zrak hall

Speaker

Sylwia Wojciechowska

Description

ONLINE PRESENTATION

A major change in dictionary exemplification was brought about by the arrival of corpus data, which replaced lexicographer-made examples with authentic ones from real spoken and written discourse. Monolingual English learners’ dictionaries (MELDs) prefer a third type of example, the corpus-based one: a corpus sentence from which unnecessarily complex vocabulary and structures, as well as unclear content, have been removed. However, apart from the corpus-based examples that follow the definitions of senses, online MELDs include sections of unmodified corpus examples, which are usually placed at the bottom of entries and are not matched with any senses.

The paper explores the corpus examples sections accompanying polysemous sound-related verbs and uses ChatGPT-4 to match the corpus examples with the senses already distinguished in the respective dictionary entries. The verbs were selected from the twelve strongest and forty-four strong synonym matches of the verb 'sound' in the sense “produce noise” on Thesaurus.com. Apart from its basic, literal meaning, each of these verbs has one or more figurative, metaphorical meanings, e.g. echo “to repeat opinions in agreement” and resonate “to receive a sympathetic response”. Learners’ dictionaries were chosen for analysis because exemplification is particularly important in them. The selected MELDs are Longman Dictionary of Contemporary English (LDOCE), Cambridge Advanced Learner’s Dictionary (CALD) and Collins Dictionary (Collins), as they all have sections dedicated to corpus examples. CALD and Collins explicitly inform the user that the examples have been automatically selected and that the editors therefore take no responsibility for possible sensitive content or mismatches with the entry word.

The present study demonstrates that ChatGPT is successful at separating literal from metaphorical examples of sound-related verbs. This is unsurprising, since recent research indicates that Large Language Models (LLMs) are capable of identifying and interpreting polysemy and metaphor (e.g. Bond et al. 2024 and Lin et al. 2024). The performance of ChatGPT is then tested on a more challenging task: matching corpus examples with the senses that already exist in each of the analysed dictionaries. The prompts include the numbered senses listed in the dictionaries under a given headword, together with their definitions and accompanying examples, which serve as models for ChatGPT.
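The prompt structure described above can be illustrated with a minimal Python sketch. It is not the prompt used in the study: the `Sense` class, the `build_matching_prompt` function, the instruction wording and all sample sentences are hypothetical, and the literal definition of 'resonate' is a placeholder rather than a quotation from any dictionary. Only the gloss “to receive a sympathetic response” comes from the abstract.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Sense:
    """One numbered dictionary sense with its definition and model examples."""
    number: int
    definition: str
    model_examples: list[str] = field(default_factory=list)


def build_matching_prompt(headword: str, senses: list[Sense],
                          corpus_examples: list[str]) -> str:
    """Assemble a plain-text prompt asking a model to match examples to senses."""
    lines = [f'Headword: "{headword}"', "", "Dictionary senses:"]
    for sense in senses:
        lines.append(f"{sense.number}. {sense.definition}")
        for example in sense.model_examples:
            lines.append(f"   Model example: {example}")
    lines += ["", "Corpus examples to classify:"]
    for index, example in enumerate(corpus_examples, start=1):
        lines.append(f"[{index}] {example}")
    # Hypothetical instruction wording; the study's actual wording is not given.
    lines += ["", "For each corpus example, give the number of the sense it "
                  "illustrates, or 'none' if no listed sense fits."]
    return "\n".join(lines)


# Invented sample data, for illustration only.
senses = [
    Sense(1, "to produce a deep, continuing sound (placeholder definition)",
          ["The bell resonated through the valley."]),
    Sense(2, "to receive a sympathetic response",
          ["Her speech resonated with younger voters."]),
]
corpus_examples = [
    "The low notes resonate inside the wooden body of the instrument.",
    "The film's message resonated with audiences.",
]
prompt = build_matching_prompt("resonate", senses, corpus_examples)
print(prompt)
```

The resulting string could be sent to a chat model through whatever interface is available; keeping the senses numbered makes the model's answers easy to map back to the dictionary entry.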

The corpus examples sections in the dictionaries tend to be rather lengthy, especially in CALD; at the entry for 'resonate', for instance, they contain 104 examples. Assigning corpus examples to separate senses would therefore be drudgery for human lexicographers. In online dictionaries, such corpus examples can be placed in expandable boxes below the corpus-based examples, a practice that Oxford Advanced Learner’s Dictionary already follows for corpus-based examples. It was found that ChatGPT sometimes admits that it cannot assign any corpus example to a sense because no example demonstrates it. Such cases will be scrutinised, and ChatGPT will be asked to generate the missing examples, a task at which, as Lew (2023) observes, it is not particularly impressive.
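Finding the senses to which no corpus example has been assigned is simple bookkeeping, sketched below in Python. The function name `group_by_sense` and the data layout (a mapping from example number to sense number, with `None` for an example the model could not place) are assumptions made for illustration; they do not describe the procedure used in the study.

```python
from __future__ import annotations

from typing import Optional


def group_by_sense(assignments: dict[int, Optional[int]],
                   sense_numbers: list[int]):
    """Group example ids by assigned sense and report the gaps.

    Returns (grouped, unassigned, uncovered):
      grouped    - sense number -> list of example ids assigned to it
      unassigned - example ids given no listed sense
      uncovered  - sense numbers that received no example at all
    """
    grouped: dict[int, list[int]] = {number: [] for number in sense_numbers}
    unassigned: list[int] = []
    for example_id, sense in sorted(assignments.items()):
        if sense in grouped:
            grouped[sense].append(example_id)
        else:
            unassigned.append(example_id)
    uncovered = [number for number, ids in grouped.items() if not ids]
    return grouped, unassigned, uncovered


# Invented assignments: example id -> sense number (None = no sense fits).
assignments = {1: 1, 2: 2, 3: 1, 4: None}
grouped, unassigned, uncovered = group_by_sense(assignments, [1, 2, 3])
print(grouped, unassigned, uncovered)
```

The senses listed in `uncovered` are the ones for which an example would have to be found elsewhere or generated.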
