8–12 Oct 2024
Hotel Croatia
Europe/Warsaw timezone

Making Danish Thesaurus Data Available to Researchers: The WebDDB Project

9 Oct 2024, 17:00
1h 30m
Tihi salon (Hotel Croatia)

Speakers

Sanni Nimb, Nathalie C. H. Sørensen, Jonas Jensen

Description

This study presents a project that aims to make thesaurus data available under an academic licence. The project is based on the printed thesaurus Den Danske Begrebsordbog (DDB), which covers approx. 80% of the Danish dictionary DDO (ordnet.dk/ddo). The thesaurus presents more than 100,000 different words and expressions, categorised and ordered semantically in 22 thematic chapters and 888 named sections. The data can now be downloaded from a webpage, where it can be supplemented with different types of lexical information from other resources of the user's choice, e.g. information on valency, etymology, or ontological type. This supplementation is possible because the lemmas in the digital thesaurus manuscript share sense ID numbers with the Danish online dictionary DDO, the semantic lexicon COR.SEM, and a wordnet (DanNet). The webpage enables new types of studies of the Danish vocabulary that take semantic similarity as their starting point. As part of the project, more lemmas from the DDO were added to the digital manuscript, which today covers 95% of the dictionary. The vocabulary, as well as certain sections and lemmas denoting nationality, sexual orientation, gender identity, etc., has been thoroughly revised in response to the change in attitudes towards this vocabulary over the last decade.
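The supplementation through shared sense ID numbers can be pictured as a simple join. The following is a minimal sketch only, not the project's actual implementation or download format: the field names (`section`, `lemma`, `sense_id`, `ontological_type`), the function `supplement`, and all data values are hypothetical placeholders invented for illustration.

```python
# Hypothetical thesaurus rows: section name, lemma, and shared sense ID.
# All values are invented placeholders, not real DDB data.
thesaurus_rows = [
    {"section": "Example section A", "lemma": "lemma1", "sense_id": 1001},
    {"section": "Example section A", "lemma": "lemma2", "sense_id": 1002},
    {"section": "Example section B", "lemma": "lemma3", "sense_id": 1003},
]

# Hypothetical supplementary resource keyed by the same sense IDs
# (standing in for information such as ontological type).
ontological_types = {1001: "Animal", 1003: "Artifact"}


def supplement(rows, extra, field):
    """Left join on sense ID: keep every thesaurus row and add `field`
    from `extra` where the sense ID is present, otherwise None."""
    return [{**row, field: extra.get(row["sense_id"])} for row in rows]


enriched = supplement(thesaurus_rows, ontological_types, "ontological_type")
for row in enriched:
    print(row["section"], row["lemma"], row["ontological_type"])
```

A left join is used here on the assumption that not every sense in the thesaurus has a counterpart in every other resource, so thesaurus rows without a match are kept and the added field is left empty.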
