Description
Since 2019, the Institute of the Estonian Language (EKI) has been compiling the EKI Combined Dictionary (CombiDic). Our presentation concentrates on incorporating synonyms into the CombiDic using the dictionary writing system Ekilex (Tavast et al., 2018; Tavast et al., 2020), which distinguishes two types of synonyms: full and partial. We acknowledge that full synonymy is a rare phenomenon within a language (see, e.g., Cruse, 1986), but in our data model, words considered full synonyms are connected to the same meaning entity (they share exactly the same definition) and are mostly interchangeable. Partial synonyms are represented by meaning relations, which indicate that their meanings are similar and that they can be interchangeable in certain contexts. Users can see both types of synonyms in the language portal Sõnaveeb (Koppel et al., 2019), where full synonyms are displayed in bold and partial synonyms in standard type.
While full synonyms are added mainly by the team working on sense division, our team focuses on adding partial synonyms in a specially developed task-oriented interface of Ekilex, where a lexicographer is presented with the sense division of a given headword and an automatically generated list of synonym candidates. When two headwords share similar meanings, compilers can simply drag and drop the corresponding senses, thereby establishing a meaning relation that is displayed as a partial synonym within the interface.
In Ekilex, all synonyms are bidirectional (if A = B, then B = A). Whenever full synonyms are created by connecting words to the same meaning, the system displays them as a cluster (if A = B and B = C, then A = B = C). If, for example, a lexicographer adds A as a partial synonym of D, then B and C in the cluster automatically become partial synonyms of D as well. By contrast, creating a meaning relation between partial synonyms only creates a bidirectional relation between the two senses and does not connect other partial synonyms to the cluster.
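The linking rules described above can be illustrated with a small sketch. It is a toy model under stated assumptions, not the actual Ekilex schema or API: the class `SynonymModel`, its methods, and the meaning identifiers `m1`, `m2`, `m3` are all hypothetical names introduced here. Words attached to the same meaning are treated as full synonyms, and a meaning relation is stored in both directions between exactly two meanings.

```python
from collections import defaultdict


class SynonymModel:
    """Toy model (hypothetical, not the Ekilex schema)."""

    def __init__(self):
        # meaning id -> words attached to it (these are full synonyms)
        self.words_of = defaultdict(set)
        # meaning id -> directly related meaning ids (partial synonymy)
        self.related = defaultdict(set)

    def add_word(self, meaning, word):
        self.words_of[meaning].add(word)

    def relate(self, m1, m2):
        # A meaning relation is bidirectional: if A = B, then B = A.
        self.related[m1].add(m2)
        self.related[m2].add(m1)

    def full_synonyms(self, meaning, word):
        # All other words sharing the same meaning entity.
        return self.words_of[meaning] - {word}

    def partial_synonyms(self, meaning):
        # Every word of every directly related meaning.
        # Relations are deliberately NOT followed transitively.
        result = set()
        for other in self.related[meaning]:
            result |= self.words_of[other]
        return result


model = SynonymModel()
for word in ("A", "B", "C"):
    model.add_word("m1", word)   # A = B = C form a full-synonym cluster
model.add_word("m2", "D")
model.add_word("m3", "E")

model.relate("m1", "m2")         # A added as a partial synonym of D
model.relate("m2", "m3")         # E added as a partial synonym of D

print(model.partial_synonyms("m2"))  # A, B, C and E
print(model.partial_synonyms("m1"))  # only D
print(model.partial_synonyms("m3"))  # only D
```

In this sketch, relating the meaning of A to the meaning of D makes B and C partial synonyms of D as well, because they share A's meaning entity. E, which is related only to D, is not connected to the A = B = C cluster.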
Occasionally, bidirectional compilation may present both semantic and technical difficulties for lexicographers, including the following.
• Inconsistencies in sense division between synonymous words. Adding partial synonyms also brings to light other issues in the data, such as missing senses and inconsistencies in the sense division of similar words.
• Broader and narrower senses. This question particularly concerns partial synonyms, which are by nature synonymous in one particular aspect but not across the whole scope of a given dictionary sense. It may be useful for lexicographers to group different usage possibilities into a single sense, but this can present a problem when adding synonyms. For example, combining different types of movement under one sense for the verb tantsima ‘dance’ may seem economical, but adding synonyms in a single row for the “dancing” of birds, animals and inanimate objects (e.g., a plastic bag in the wind) will create a long, disconnected list.
• Fuzzy lines between synonymy and hyperonymy. For example, a hyponym can be partially synonymous with its hyperonym, e.g., T-särk ‘T-shirt’ = särk ‘shirt’, triiksärk ‘dress shirt’ = särk, etc. Because relations are bidirectional, adding the hyperonym as a synonym to each hyponym causes the whole list of hyponyms to be shown as synonyms of the hyperonym.
• Part-of-speech issues arise when a word has shifted out of its paradigm in a particular meaning and has acquired the characteristics of another word category. For example, null ‘zero’ (numeral/noun) in the phrase tuju on nullis ‘the mood is at zero’ can semantically be viewed as a partial synonym of adjectives like olematu ‘non-existent’ and halb ‘bad’. It might confuse the users of Sõnaveeb when a numeral/noun suddenly appears in a list of adjective synonyms.
It is difficult to present grammatical and syntactic synonymy because not all potential synonyms are traditionally headwords in a dictionary: short stems (e.g., digitaalne (adjective) = digi- (used only in compounds) ‘digital’), inflected forms (e.g., täppidega ‘with spots’ as a synonym for täpiline ‘spotted’), etc.
In our presentation, we give further examples and discuss potential solutions to the semantic and technical challenges we face with the bidirectional compilation of synonyms.