Nov 17 – 20, 2025
Bled, Slovenia
Europe/Ljubljana timezone

LLMs and Lexicography at the Dutch Language Institute

Nov 19, 2025, 9:00 AM
1h
Arnold hall

Speakers

Carole Tiberius, Jesse de Does

Description

The Dutch Language Institute (INT) has a long tradition of compiling historical and contemporary dictionaries and other types of lexicographic databases, mainly for Dutch but also for some other languages with a relation to Dutch. Lexicographic work at the institute is computer-supported, but there is still a great deal of manual work involved. INT is therefore exploring how new technologies (including LLMs) can be used to optimise different parts of the lexicographic work without compromising data quality and reliability. After a brief overview of various pilot studies conducted at the institute, we will take a closer look at how we can make the implementation of Hanks’ Corpus Pattern Analysis procedure (as used in the Woordcombinaties project) more intelligent. In this way, we hope ultimately to realise Patrick Hanks’ vision that “it seems likely that a large part of the work that is currently being carried out by hand will be automated in the not-too-distant future” (Hanks 2013: 247).
