Speakers
Description
This paper presents two tasks in which large language models (LLMs), Gemini-2.0-flash and GPT-4o, were used to generate distractors (i.e., incorrect options) for synonym and collocation questions in a language game. The lexical data for both tasks was sourced from the Digital Dictionary Database of Slovene (DDDS). Prompts were first tested on a sample dataset with both models, and the better-performing model was selected for each task: Gemini-2.0-flash for synonyms and GPT-4o for collocations. Evaluation showed strong performance from both models, with over 80% of the generated distractors rated as appropriate. In the synonym task, common issues included non-existent or rare words and legitimate synonyms; in the collocation task, they included common collocations and distractors that improperly altered the collocational structure. Additional filtering of the data was required to ensure game readiness. Future plans include using LLMs to produce data for other games, as well as in the preparation of lexicographic data in the DDDS.
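The additional filtering mentioned above could, for the synonym task, take a shape like the following minimal sketch. All names and data here are illustrative assumptions (toy English words standing in for Slovene DDDS entries), not the authors' actual pipeline: it drops candidate distractors that are non-existent (out-of-vocabulary) words, legitimate synonyms of the target, or the target word itself.

```python
# Hypothetical post-filter for LLM-generated synonym distractors.
# The vocabulary and synonym sets are illustrative stand-ins, not DDDS data.

def filter_distractors(candidates, target, vocabulary, synonyms):
    """Keep candidates that exist in the vocabulary, are not the target
    word itself, and are not legitimate synonyms of the target."""
    kept = []
    for cand in candidates:
        if cand == target:
            continue  # the correct answer cannot serve as a distractor
        if cand not in vocabulary:
            continue  # drop non-existent / out-of-vocabulary words
        if cand in synonyms.get(target, set()):
            continue  # drop legitimate synonyms (they would be correct answers)
        kept.append(cand)
    return kept

# Toy usage: "zorply" is a non-word, "glad" is a real synonym of "happy"
vocab = {"happy", "sad", "glad", "blue"}
syns = {"happy": {"glad"}}
print(filter_distractors(["glad", "sad", "zorply", "happy"], "happy", vocab, syns))
# → ['sad']
```

A comparable rule-based pass for the collocation task would instead check candidates against a list of attested collocations and verify that the collocational structure is preserved.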