22–24 Jun 2023
Yonsei University
Asia/Seoul timezone

Identifying uncommon usages in common words with the same Chinese characters: A quantitative analysis on entities of Trilateral Common Vocabulary Dictionary

23 Jun 2023, 11:00
30m
Kwak Joung-Hwan Challenge Hall

Speaker

Li Fei

Description

Owing to the long-lasting influence of the Sinosphere, the languages of China, Japan, and Korea share a large number of lexical items derived from common Chinese origins. Many of these words are semantically similar because the characters that compose them often come from the same sources. Although the dictionary definitions of these cross-lingual near-synonyms overlap substantially, their actual usage may vary considerably from one linguistic system to another. To shed light on these context-dependent variations, this paper conducts a quantitative comparative analysis of more than 300 common Chinese-character entities from the Trilateral Common Vocabulary Dictionary and thousands of related concordances from Aihub's CJK parallel corpora, using AI language models for the Dependency Parsing and Semantic Textual Similarity tasks. The findings show that many of these words written with the same Chinese characters have syntactically and semantically uncommon usages depending on their respective linguistic contexts, reflecting the ongoing expansion and diversification of Chinese-character vocabulary across different language systems. To improve comprehension and broaden the dictionary's audience, this paper therefore suggests incorporating these uncommon usages into the existing definitions of commonly used Chinese-character words.
