POSTER
Writing dictionary entries is not only a time-consuming but also an expensive process, owing to the highly specialized knowledge and experience required of the lexicographer. To facilitate the task of compiling the Danish monolingual dictionary DDO (ordnet.dk/ddo), we aim to establish an automatic assistant based on applied language technology (e.g. n-gram analysis and word embeddings)...
DEMO
CJVT igre (https://igre.cjvt.si/) is a new digital platform offering word games designed to foster lexical awareness and engagement with standard Slovene. Developed by the Centre for Language Resources and Technologies at the University of Ljubljana, the portal currently hosts three games (Cvetka, Besedolov, and Vezalka), with two more in development. Each game utilizes curated lexical...
POSTER
The representation of medical adjectives in Croatian general dictionaries reveals significant inconsistencies, reflected in uneven lemma inclusion, ambiguous or absent domain labels, and limited definitional precision. This paper analyzes the 80 most frequent adjectives, based on corpus data from the Croatian Medical Corpus (CMC) (Kocijan, Kurolt & Mijić, 2020), in the three major...
POSTER
This paper presents a novel approach to exploring derivational families within the framework of Intelligent Lexicography, using the ŠKOLARAC corpus: a collection of Croatian school essays written by L1 learners (native-speaking students) in grades 5 through 8 and enriched with metadata such as gender, grade level, and region. By combining rule-based linguistic processing in NooJ, a...
POSTER
Taking seriously the common construction grammar statement that “it’s constructions all the way down” (Goldberg, 2006: 18), the Hungarian Constructicon aims to encompass the widest possible range of constructions. As it is a dictionary-based constructicon, it naturally contains what a dictionary can provide: from morphemes to words, and to partially schematic multiword constructions...
POSTER
The lack of normative resources for the Croatian language has prompted the development of a novel resource that would not only compile normative data for Croatian but also focus on an underrepresented group of linguistic units: figurative multi-word expressions (MWEs). Thus, the creation of a normative database for figurative MWEs in Croatian is a significant step in the right...
POSTER
The objective of the research is to develop a technology for converting specialized dictionary text into a website with a developed user interface.
The object of the study was the “Dictionary of Ukrainian biological terminology” (7,342 entries and about 26,000 terms in Ukrainian, Russian and English), which contains definitions, term polysemy, synonymy, stress marks for Slavic languages,...
POSTER
As part of the COST Action CA21167 Universality, Diversity and Idiosyncrasy in Language Technology (UniDive), the ELEXIS-WSD Parallel Sense-Annotated Corpus (Martelli et al., 2021; Čibej et al., 2025) is being expanded to include subcorpora in additional languages, among them Croatian, as well as new annotation layers. Each language subcorpus of ELEXIS-WSD contains the same 2,024...
POSTER
In this paper, we provide a comprehensive overview of the way in which the morpho-syntactic properties of multiword expressions are represented in lexical resources to support downstream Natural Language Processing applications. Starting from an up-to-date and comprehensive survey of the existing lexica dedicated to multiword expressions and containing their syntactic description,...
POSTER
The Information and Communication Technologies (ICT) field has evolved rapidly in recent decades. Thus, to describe new devices, activities, and concepts that appear yearly, a vast number of terms are created primarily in English, while other languages rely on secondary term formation (STF) for ICT end-users (ETSI Guide, 2022). Systematic secondary rendering and dissemination...