8–12 Oct 2024
Hotel Croatia
Europe/Warsaw timezone

Enhancing Japanese Lexical Networks Using Large Language Models: Extracting Synonyms and Antonyms with GPT-4o

12 Oct 2024, 10:30
30m
Ragusa Hall (Hotel Croatia)

Speakers

Dragana Špica, Benedikt Perak

Description

This study presents an approach to building and enriching Japanese lexical networks with large language models (LLMs), in particular GPT-4o, using data from the Vocabulary Database for Reading Japanese to cover a range of proficiency levels. Through this process, we extracted 137,870 synonym relations and 54,324 antonym relations, forming a network of 104,427 nodes. A portion of the dataset was manually evaluated to assess the accuracy of the extracted synonym relations, yielding an average score of 4.08 out of 5. Our findings indicate that graph-based methods improve transparency and interpretability, making it possible to visualize intricate semantic structures and to update the network continuously. The study emphasizes the synergy between AI-driven data generation and traditional lexicographic expertise, offering a scalable and adaptable framework for diverse linguistic applications, with implications for computational linguistics and NLP technologies.
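
As a rough illustration of the kind of graph structure the abstract describes, the sketch below stores words as nodes and typed relations (synonym, antonym) as undirected edges. It is a minimal sketch, not the authors' implementation: the class name `LexicalNetwork`, its methods, the assumption that both relations are symmetric, and the example words are all hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


class LexicalNetwork:
    """Hypothetical undirected graph whose edges carry a relation type."""

    def __init__(self):
        # node -> relation type -> set of neighbouring nodes
        self._adj = defaultdict(lambda: defaultdict(set))

    def add_relation(self, a, b, relation):
        """Add an undirected edge of the given type between two words."""
        if a == b:
            raise ValueError("a word cannot be related to itself")
        # Assumption: synonymy and antonymy are treated as symmetric.
        self._adj[a][relation].add(b)
        self._adj[b][relation].add(a)

    def neighbours(self, word, relation):
        """Return the words linked to `word` by `relation` (empty set if none)."""
        if word not in self._adj:
            return set()
        return set(self._adj[word].get(relation, set()))

    def node_count(self):
        return len(self._adj)

    def edge_count(self, relation):
        # Each undirected edge is stored twice, once per endpoint.
        total = sum(len(rels.get(relation, ())) for rels in self._adj.values())
        return total // 2


# Illustrative entries only; these pairs are not drawn from the study's data.
net = LexicalNetwork()
net.add_relation("大きい", "巨大", "synonym")
net.add_relation("大きい", "小さい", "antonym")
print(net.node_count(), net.edge_count("synonym"), net.edge_count("antonym"))
```

Keeping the relation type on the edge rather than in separate graphs lets node and per-relation edge counts, like those reported above, be read off a single structure.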
