22–24 Jun 2023
Yonsei University
Asia/Seoul timezone

The ROI of AI for lexicography

22 Jun 2023, 16:00
1h
Choe Yeong Hall

Choe Yeong Hall

Speaker

Erin McKean

Description

Large Language Models (LLMs) are being used for many language-based tasks, including translation, summarization and paraphrasing, sentiment analysis, and for content-generation tasks, such as code generation, answering search queries in natural language, and to power chatbots in customer service and other domains. Since much modern lexicography is based on investigation and analysis of large-scale corpora similar to the corpora used to train LLMs, we hypothesize that LLMs could be used for typical lexicographic tasks. A commercially-available LLM API (OpenAI’s ChatGPT gpt-3.5-turbo) was used to complete typical lexicographic tasks, such as headword expansion, phrase and form finding, and creation of definitions and examples. The results showed that the output of this LLM is not up to the standard of human editorial work, requiring significant oversight because of errors and “hallucinations” (the tendency of LLMs to invent facts). In addition, the externalities of LLM use, including concerns about environmental impact and replication of bias, add to the overall cost.

Primary author

Co-author

Presentation materials

There are no materials yet.