8–12 Oct 2024
Hotel Croatia
Europe/Warsaw timezone

COR.SEM, a New Formal Semantic Lexicon for Danish

12 Oct 2024, 10:00
30m
Ragusa Hall (Hotel Croatia)

Speakers

Sanni Nimb, Ida Flörke, Sussi Olsen, Bolette S. Pedersen, Nathalie C. H. Sørensen

Description

We present the COR.SEM lexicon, an open-source semantic lexicon for general AI purposes funded by the Danish Agency for Digitisation as part of an AI initiative embarked upon by the Danish Government in 2020. COR.SEM describes the core senses of 34,000 Danish lemmas with formal semantic information, e.g., ontological type, hypernym, semantic frame, regular polysemy pattern, and polarity value. These features are in essence drawn and simplified from other existing resources. Lexical information from The Danish Dictionary (DDO) and the Danish Thesaurus (DDB) is also integrated, e.g., user examples, domain labels, synonyms, and near synonyms. The lexicon provides direct links to synsets in the Danish WordNet, DanNet, as well as to the morphological lemma information in COR, the Central WordRegister, which is based on the Danish Orthographical Dictionary and DDO. The register's common numerical index at both lemma and sense level makes it more straightforward to merge monolingual as well as bilingual dictionaries with COR.SEM and thereby inherit the formal semantic information. At the website corsem.dsl.dk it is possible to browse the lexical entries and to download tailored extracts of the data. We give examples of the use of COR.SEM in linguistic studies, in NLP tasks, and in lexicographic projects.
