Conveners
Parallel Sessions: Session 3. Specialized Lexicography – Terminology and Terminography
- Pamela Faber
Parallel Sessions: Session 2. Bi- and Multilingual Lexicography
- Lana Hudeček
Parallel Sessions: Session 1. Dictionary Use
- Iztok Kosem
Parallel Sessions: Session 3. Semantics
- Boris Kern
Parallel Sessions: Session 1. Lexicography and Semantics
- Rute Costa
Parallel Sessions: Session 2. Bi- and Multilingual Lexicography
- Tinatin Margalitadze
Parallel Sessions: Session 1. Lexicography and Language Technologies
- Robert Lew
Parallel Sessions: Session 2. Lexicography and Neologisms
- Kristina Koppel
Parallel Sessions: Session 3. Dictionary (in) Use
- Carole Tiberius
Parallel Sessions: Session 1: Lexicographic Projects Reports
- Ilan Kernerman
Parallel Sessions: Session 3: ASHDRA Award Holders and Online Presentations
- Julie Moore
Parallel Sessions: Session 2: Lexicography and Language Technologies (Slavic languages)
- Ana Ostroški Anić
Parallel Sessions: Session 3. Lexicographic Projects Reports
- Ana Salgado
Parallel Sessions: Session 2. Lexicography and Semantics
- Hans C. Boas
Parallel Sessions: Session 1. Lexicography and Large Language Models
- Philipp Stöckle
Parallel Sessions: Session 1. Lexicography and Large Language Models
- Anas Fahad Khan
Parallel Sessions: Session 2. Lexicography and Semantics
- Špela Antloga
Parallel Sessions: Session 3. Specialized Lexicography
- Pilar León Araúz
Parallel Sessions: Session 3. Specialized Lexicography – Terminology and Terminography
- Milica Mihaljević
Parallel Sessions: Session 1. Lexicography and Semantics
- Christian-Emil Ore
Parallel Sessions: Session 2. Lexicography and Language Technologies
- Ivana Brač
Parallel Sessions: Session 3. Metaphor, Phraseology
- Polona Gantar
Parallel Sessions: Session 2. Semantics and Dictionaries
- Andrea Abel
Parallel Sessions: Session 1. Lexicography and Language Technologies
- Besim Kabashi
Parallel Sessions: Session 3. Historical Lexicography
- Theodorus Fransen
Parallel Sessions: Session 1. Gender and Age in Dictionaries
- Lynne Bowker
Parallel Sessions: Session 2. Lexicography and Semantics
- Ivana Filipović Petrović (Croatian Academy of Sciences and Arts)
Parallel Sessions: Session 2. Lexicography and Grammar
- Annette Klosa-Kückelhaus
Parallel Sessions: Session 1. Dictionary Writing Systems and Lexicographic Tools
- Miloš Jakubíček
Parallel Sessions: Session 3. Lexicographic and Lexicological Projects
- Tanara Zingano Kuhn
Sign language lexicography, a nascent subfield, remains relatively unexplored, primarily owing to the unique attributes of sign languages (McKee & Vale, 2017). The scarcity of sign language dictionaries is attributed to linguistic, financial, and social challenges (Vacalopoulou, 2020), with limited resources available since the pioneering Dictionary of American Sign Language on Linguistic...
The COVID-19 pandemic has impacted numerous sectors at different levels and has imposed a radical change in the pace of life in societies across the globe. A partially technical vocabulary related to COVID-19 quickly became part of everyday life, introduced mainly by news and official bodies. To describe the characteristics of the terminology being disseminated in Brazil, the project Study and...
Pashto is an Eastern Iranian language spoken in Afghanistan, Pakistan and by a large diaspora community across the globe. It is one of two official languages of Afghanistan and a regional official language in Pakistan's Khyber Pakhtunkhwa Province. With about 15 million native speakers in Afghanistan and 30 million in Pakistan, it is the second-most spoken Iranian language after...
Portuguese is the official language of nine countries and one territory. However, given the socio-historical contexts of these countries, its functional status varies greatly. In Brazil, Portugal, and São Tomé and Príncipe (Hagemeijer, 2018), Portuguese is the mother tongue for the majority of the population. In Angola and Mozambique, it is the majority vehicular language, typically as a...
We would like to introduce the results of the ELDI project (Electronic Lexical Database of Indo-Iranian Languages, Pilot module: Persian), launched in August 2020. One of the aims of the project was promoting the use of technologies in teaching languages. A website and a mobile application with the Persian–Czech dictionary were developed as the main planned results of the project. A new...
The objective of this study is to investigate how learners of Italian as a second or foreign language search for new meanings in online Italian dictionaries. Using eye-tracking technology, we carried out experiments inviting users to do exercises on "combinations of words" while they consulted various dictionaries, including De Mauro – Internazionale and Garzanti Linguistica. Results should...
This paper presents an innovative lexicographic approach embedded within an online resource currently under development, ALMA: Linguistic Multimedia Atlas of Bio/Cultural Food Diversity. ALMA serves a dual purpose: firstly, to showcase linguistic diversity through culinary practices, and secondly, to scrutinise food marketing strategies through the analysis of language and paralanguage on...
A discussion of technical and editorial considerations in producing a 1,800-page hardback dictionary containing 30,000 headwords from an online database of 48,000 headwords. The Concise English-Irish Dictionary (CEID), published in 2020 and the first major English-Irish dictionary published in print form since the 1950s, is a 1,800-page hardback dictionary containing 30,000 headwords and 80,000...
The recent development of the Curriculum for Teaching Greek as a Heritage Language: A Framework for Teachers underscored the need for a dictionary to serve as supplementary material during the curriculum's implementation at Greek Community schools in the USA. This presentation aims to introduce the Greek Heritage Language Learners' Illustrated Lexicon (Helix), an online, bilingual, illustrated...
Although the "Circular Economy" has been widely discussed in the media for years, general dictionaries still do not provide the relevant definitions and/or collocations. We show by examining dictionary definitions that many salient words used in this field have undergone varying degrees of semantic broadening in the general language. Current terminological needs often dictate more precise...
The paper presents a New Serbian-Russian dictionary and the main principles of its development. We use the most recent explanatory dictionary of Serbian, published by the Serbian Academy of Sciences and Arts in 2018, as a starting point. However, we refine both the word list and the entry structure to meet the requirements of a bilingual edition. We consult text corpora of modern Serbian to...
In this article, we return to a classic lexicographical topic and address some aspects involved in the practice of defining. Digital developments have increasingly required dictionary definitions to operate independently of others if they are to be utilised in new contexts, possibly even detached from the original dictionary presentation. We examine two types of definitions where the problem...
In 2023, the Institute of the Estonian Language, in collaboration with the Center for Applied Anthropology of Estonia, conducted a user experience survey aimed at understanding the habits, needs, and attitudes of users of the language portal Sõnaveeb ("Word Web") and preparing for the publication of the Dictionary of Standard Estonian (DSE) in 2025. This paper addresses prescriptive and...
It may seem obvious to state that tracing the history of a language involves consulting lexicographical works of all kinds, but the truth is that specialized lexicographical compilations, i.e., those referring to the specialized languages of a particular field of knowledge, have not always been duly considered in the diachronic study of language. In this contribution, we aim to present a...
The research approach to semantic development in first language acquisition (FLA) remains predominantly enclosed in traditional lexicographic terms and notions (usually simplified). This viewpoint doesn't adequately span the lexical system's complexity and fails to present the mechanisms and processes involved in its development. Since FrameNet possesses standardized methods and...
Some (prescriptive) dictionaries do not include recently borrowed lexemes, while other, descriptive ones treat them like older words or ("native") neologisms formed within the given language. The question of inclusion/exclusion is especially relevant in cases where a "native" neologism in a language and a newly borrowed word are in fact (near-)synonyms; compare, for example, German downloaden –...
The paper presents a project devised by Georgian and Hungarian lexicographers which aims at improving dictionary use skills and dictionary culture in Georgia and Hungary. The project is based on previous experience, studies and findings of its authors at Ilia State University (Georgia) and Károli Gáspár University of the Reformed Church in Hungary. The feedback gathered from theoretical...
The European Network on Lexical Innovation (ENEOLI, CA22126 – www.cost.eu/actions/CA22126/, October 2023 – October 2027) is a COST Action seeking to address the lack of comprehensive, multilingual, and globally focused research on neology. As of July 2024, 252 members from 48 different countries have been participating in the Action. The main goal is to establish a network of researchers...
In this paper, we report on our development of a multi-level analysis framework that allows us to assess AI-generated lexicographic texts on both a quantitative and qualitative level and compare them with human-written texts. We approach this problem through a systematic and fine-grained evaluation, using dictionary articles created by human subjects with the help of ChatGPT as an example....
The paper outlines one of the results of a project dedicated to one of the endangered Kartvelian languages, Megrelian. Providing data collection and documentation through fieldwork implemented in Samegrelo (Georgia), the project aims to comprehensively document the Megrelian language and encompasses the development of the annotated corpus, sketch grammar, and a bilingual...
DANTE (Database of Analysed Texts of English) was initially developed in the years 2008–2010 (Atkins, Kilgarriff & Rundell, 2010) by a lexicographic team led by Sue Atkins, Adam Kilgarriff, Valerie Grundy and Michael Rundell. It was commissioned by Foras na Gaeilge, a governmental agency promoting the use of the Irish language, for the purposes of the development of the New English...
Czech Dictionary Express has been introduced as a project of a semiautomatically made dictionary of the Czech language. The Dictionary Express method (formerly known as rapid dictionaries) has been used for several different languages. In this paper, we analyse the automatic and manual tools used in Czech Dictionary Express and inspect the statistical and qualitative data such tools provide....
In 1911, the Berlin missionary Karl Heinrich Julius Endemann published his dictionary of the Sotho language, Wörterbuch der Sotho Sprache (1911). This dictionary faced scholarly neglect due to its rare combination of source and target languages, i.e., Sotho and German respectively, and also its missionary focus. Obsolete orthography, high user skill demands, and a lack of alignment...
The objective of this paper is to illustrate, through the examination of sample entries, the methodology employed in the creation of a prospective pilot corpus-based dictionary of Serbian as a second language, drawing on advancements applied in other similar projects for different languages (e.g., François et al., 2014; François et al., 2016; Klemen et al., 2023). While Serbian is spoken as the...
Spoken language is the prerequisite of written standard languages for living language communities. Yet written sources dominate lexicographic description of standard languages, and awareness of dictionaries that specifically source speech seems limited. In Norsk Ordbok (The Norwegian Dictionary), and in the Language Collections on which the dictionary is based, oral materials are perceived as...
Good dictionary examples are hard to come by. Despite corpora growing larger and larger, lexicographers still have difficulties in finding good candidate sentences for exemplifying how the dictionary headwords are used in context. There are automatic methods available to address this time-consuming task. One such method is GDEX, a feature of the Sketch Engine tool (Kilgarriff et al., 2004),...
The paper reports a pilot study on the detection of lexical semantic variation in modern Swedish. The starting point of the study is the meaning descriptions of around 65,000 headwords in "The Contemporary Dictionary of the Swedish Academy" (SO, 2021) covering approximately 100,000 different senses. In our work, we aim to explore the potential of the latest computational methods to discover...
This presentation outlines the development process of DICIENS, a bilingual school science dictionary (English-Spanish/Spanish-English) designed for primary education students in Spain. DICIENS marks a pioneering initiative, filling a significant gap in educational resources and pedagogical lexicography. Rooted in the theoretical framework of Frame-based Terminology (Faber 2009, 2012), this...
This communication aims at discussing how syntagmatic constraints in the lexicon can be provided in lexicographic resources more effectively than has been done to date, covering a wide range of multi-word expressions: from compounds to collocations and phrasemes. Examples are taken from the ongoing implementation of a multilingual specialised resource called ALMA – Multimedia Linguistic Atlas...
Diretes is a Spanish monolingual e-dictionary based on Lexical-Semantic Relations which are formalized by Lexical Functions, a formal tool explored within the Meaning-Text Theory. This dictionary consists of a relational database which aims to reflect the cognitive links of the lexicon through a network of semantic and lexical associations. Currently it contains more than 100,000 collocations...
We present a study of Danish multiword constructions containing one or more hyphens, such as gas- og vandmester ("gas- and water.repairman"; "plumber"), ilt- og brintatomer ("oxygen- and hydrogen atoms") and haveborde og -stole ("garden tables and -chairs"). Although materially analogous, such constructions exhibit different semantics, falling, as we shall argue, into two distinct groups...
Terminology within the domain of environmental economics, a rapidly growing and changing sub-discipline of economics concerned with environmental issues, has been understudied in the literature on domain-specific languages. While, on the one hand, it presents the common features of specialized vocabulary, i.e., monoreferentiality, precision, economy and objectivity (Gotti, 2008; Scarpa, 2020),...
A common issue in Corpus Linguistics is assessing representativeness and balance of a corpus (McEnery & Hardie, 2011). Biber (1993, p. 244) defines representativeness as "the extent to which a sample includes the full range of variability in a population." Assessment has been traditionally tackled quantitatively and qualitatively both in monolingual and bilingual settings (Stefanowitsch,...
Since 2019, the Institute of the Estonian Language (EKI) has been compiling the EKI Combined Dictionary (CombiDic). Our presentation concentrates on incorporating synonyms into the CombiDic using the dictionary writing system Ekilex (Tavast et al., 2018; Tavast et al., 2020), where we have two types of synonyms: full and partial. We acknowledge that full synonymy is a rare phenomenon within a...
Working with historical documents presents many challenges, not only because some sources are not well preserved, but also because grammar and spelling rules from older times were not always consistent. Still, these texts remain a rich source of information from our history, and we could greatly benefit from the information that can be extracted from them. At the same time, the lack of...
Existing research on ChatGPT in lexicography is undoubtedly valuable. However, it has tended to focus on metalexicographic concerns rather than effectiveness in resolving user queries directly. Moreover, it has mostly dealt with general-purpose English lexicography, often ignoring other languages and specific purposes. Focussing on 33 L1 Spanish users completing an introductory training course...
Medicine is one of the specialized domains that is of particular interest to different communities of speakers, most of whom cannot be considered experts or semi-experts. Their interest in the domain lies in the fact that a certain level of medical knowledge is needed in everyday life, much like a basic understanding of legal concepts. As a prominent characteristic of the domain,...
In our poster presentation, we will present the results of an experiment that tests the potential of large language models (LLMs) in semantic analysis of Estonian. We will focus on LLMs' ability to analyse polysemy and create definitions. In 2024, the Institute of the Estonian Language started a new project in which we are exploring how LLMs, such as GPT, can help with the presentation of...
Corpus Pattern Analysis (CPA) is a technique for identifying local semantic and syntactic information about a word and mapping it to its meanings. For verbs, it consists basically of the argument structure labelled with semantic types for each argument. CPA is used in several dictionary projects and allows systematic corpus analysis; however, it is extremely time-consuming. In this paper, we present a...
A medication package insert is a legal healthcare document with important information about medications. In Brazil, the National Health Surveillance Agency (ANVISA) requires two versions of the package insert: one for patients and another one for healthcare professionals. In this study, we manually evaluated the performance of an automatic frame annotator on a corpus consisting of 100...
The paper details the current state of an ongoing collaboration between Hungarian lexicographers and computational linguists. Our goal is to provide a comprehensive and consistent description of Hungarian adjectives, benefiting lexical semantics, lexicography and NLP. This thread of research focuses on identifying systematic semantic patterns of Hungarian adjectives and their typical...
This research proposes a step forward in the automatic identification and analysis of verbal idioms in Croatian. The use of the NooJ automated text processing tool, along with the MaCoCu corpus and the Online Dictionary of Croatian Idioms (ODCI), provides a robust framework for recognizing and categorizing these multi-word expressions (MWEs). The research comprises two parts: (a) creation of a...
Among historical and ancient languages, usually under-resourced due to the limited size of corpora and the scarce availability of digital lexical resources, Latin is relatively well documented, thanks to its high relevance to the history of Europe and to the study of Romance languages. As far as lexicography is concerned, several lexical resources are available in digital format, although this...
In the revision process of dictionaries, adding new headwords or new senses to already existing headwords is what typically receives the most attention. In this article, we bring into focus the intriguing dilemma of exclusion of headwords from the Swedish Academy Glossary (SAOL), which is still published in print versions. In the e-dictionary era, removing headwords may seem questionable, SAOL...
In the introductory part of the presentation, the authors will present the Croatian Web Dictionary – Mrežnik project (Hudeček & Mihaljević, 2020; Hudeček, Mihaljević & Jozić, 2024). Mrežnik consists of three modules: the module for adult native speakers of Croatian, the module for students and the module for non-native speakers learning Croatian. These modules have different approaches to...
While there have been a number of projects focusing on early medieval Irish lexicography (Griffith et al., 2018), few have aspired to work towards comprehensive interlinking of textual and lexical resources. This is at least in part due to the morphological complexity and variation in Early Irish (c. 600–1200 CE), compounded by the absence of an orthographic standard (Stifter, 2009). The...
CHAMUÇA (Cultural HeritAge and Multilingual Understanding through lexiCal Archives) is a pioneering initiative aimed at exploring the impact of the Portuguese language on Asian languages, rooted in the historical exchanges between Portuguese traders, colonists, and diverse Asian cultures. The impact of these interactions extends beyond historical remnants to the modern-day lexicon of Asian...
The semantics of body part nouns is particularly fascinating from the point of view of the evolution of word meanings, their metaphorical and metonymic derivation and, from a cross-linguistic standpoint, for the large amount of overlap between different languages. The status of BODY as a semantic prime, especially in the Natural Semantic Metalanguage (NSM) (cf. Wierzbicka, 2014; 2007) has never...
This article presents semantic information about contemporary standard Slovenian on the Franček educational language portal, which is aimed at primary and secondary-school students. The portal's primary role is to enhance students' dictionary skills as part of the national language education program and to introduce users to other linguistic resources, such as school grammars. The portal...
Historical language data can give us an insight into the conceptual and everyday world of past times. However, this insight very often relates only to a small group of society with strong political and social influence. What the linguistic and social situation looked like for the majority of the population can usually only be guessed through the interpretation of others, as only a small...
What strategies are currently being applied in electronic dictionaries and terminology databases to gender representation, with a particular focus on feminine agentives? Starting with an overview of the state of the art as to gender studies in lexicography and terminology, in this paper we reflect upon the analyzed approaches in electronic dictionaries and terminology databases, collecting...
This paper accounts for a system of semantic fields that was developed in Iceland around the turn of the century. The purpose of the system was to help describe the semantic properties of the Icelandic vocabulary and to be a practical tool in lexicographic work. The system categorizes words into semantic fields, enabling nuanced organization and practical applications in monolingual and...
This article discusses the project Dictionary of the Dubrovnik Idiom, conducted at the Institute for the Croatian Language. The project aims to develop a born-digital diachronic dictionary of the Dubrovnik idiom, covering the period from the 16th century to the end of the 20th century. The dictionary will be based on a historical corpus compiled within the project's scope, featuring texts from...
The paper presents a count-based semantic vector space model for Ukrainian, which has been applied for the semantic change detection task. The approach assumes creation of multidimensional vector representations of occurrences for a particular lexeme or a group of related lexemes with further visual and quantitative analysis of the obtained semantic vector space. The multidimensional space has...
Cross-lingual embedding models act as facilitators of lexical knowledge transfer and offer many advantages, notably their applicability to low-resource and nonstandard language pairs, making them a valuable tool for retrieving translation equivalents in lexicography. Despite their potential, these models have primarily been developed with a focus on Natural Language Processing (NLP), leading to...
Dictionaries have traditionally served as more than mere repositories of words; they have aimed to sketch some of the relationships between words, including semantic, collocational, or hierarchical connections. However, the physical constraints of print media often limited their scope, restricting the depiction of these relationships to cross-references, exemplifications, and, in specialized...
The LBC-Platform (https://www.lessicobeniculturali.net) is a comprehensive lexical information system that aims to integrate various types of corpora and resources: dictionaries, concordances, monolingual Language for Special Purposes (LSP) corpora in different languages and LSP parallel corpora. Designed for users interested in cultural heritage, the platform provides free access to resources...
Despite the decreasing use of regional and local varieties of the Dutch language, there is a growing public interest in dialects in the Netherlands and Flanders. Several dialect associations strive to preserve the local dialect by creating lexicons, establishing spelling conventions, writing texts in their local dialect, teaching the dialect, and sharing knowledge about their local dialect...
We present the COR.SEM lexicon, an open-source semantic lexicon for general AI purposes funded by the Danish Agency for Digitisation as part of an AI initiative embarked upon by the Danish Government in 2020. COR.SEM describes the core senses of 34,000 Danish lemmas with formal semantic information, e.g., ontological type, hypernym, semantic frame, regular polysemy pattern, and polarity value;...
This study presents an innovative approach to crafting and enhancing Japanese lexical networks by incorporating large language models (LLMs), especially GPT-4o, utilizing data from Vocabulary Database for Reading Japanese to accommodate various proficiency levels. Through this process, we extracted a total of 137,870 synonym relations and 54,324 antonym relations, forming a network comprising...
NomVallex is a manually annotated valency lexicon of Czech nouns and adjectives that enables research into various language phenomena related to valency, including the comparison of valency properties of affirmative and negative forms of words. This paper presents new developments in the way the lexicon facilitates research into word-level negation, explaining the reasoning behind the proposed...
In this paper, we explore the possibilities and challenges of lexicographic treatment of pragmatic markers, specifically epistemic and evidential markers in Czech. Our starting point is a detailed comparison of how these expressions are treated in contemporary monolingual Czech dictionaries. Following this, we present the development of the SEEMLex lexicon of Czech epistemic and evidential...