Nov 17 – 20, 2025
Bled, Slovenia
Europe/Ljubljana timezone

DMLEX on Wikibase: Legacy dictionaries as collaboratively editable dataset

Nov 20, 2025, 11:00 AM
30m
Arnold hall

Speakers

Simon Krek, Primož Ponikvar, Andraž Repar, Iztok Kosem, David Lindemann

Description

This paper presents an experimental workflow for converting legacy digitized dictionaries into the DMLex standard and subsequently importing them into a Wikibase instance. DMLex, a serialization-independent model developed by the OASIS LEXIDMA Technical Committee, aims to provide a universal and modular representation of lexicographic data. The study tested whether dictionaries from heterogeneous sources—originally encoded in internal XML formats—could be reliably transformed into DMLex-compliant representations and repurposed for collaborative editing and enrichment on a structured linked data platform. The transformation was achieved through a combination of rule-based scripts, manual refinement, and large language model assistance. While DMLex proved adaptable to a wide range of lexical phenomena, several limitations became apparent during the Wikibase integration phase. These findings suggest that practical deployment of DMLex benefits from clearer conventions and validation strategies when applied beyond theoretical modeling. The results confirm DMLex’s potential for future-proof dictionary modeling, while also highlighting areas where further specification and community consensus are needed to support its application in digital infrastructures and collaborative environments.
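As a rough illustration of the rule-based step in such a workflow, the sketch below maps one legacy XML entry to a DMLex-style JSON object. It is not the authors' pipeline: the legacy element names (`entry`, `hw`, `pos`, `sense`, `def`) and the helper `legacy_entry_to_dmlex` are hypothetical, and the output property names (`headword`, `partsOfSpeech`, `senses`, `definitions`) are assumed to follow the DMLex JSON serialization and should be checked against the specification.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical legacy entry; the element names are invented for illustration
# and do not reflect any specific dictionary's internal XML format.
LEGACY_XML = """
<entry id="e1">
  <hw>bank</hw>
  <pos>noun</pos>
  <sense><def>an institution that keeps and lends money</def></sense>
  <sense><def>the land along the side of a river</def></sense>
</entry>
"""


def legacy_entry_to_dmlex(xml_text):
    """Map one hypothetical legacy XML entry to a DMLex-style dict.

    The output keys are an assumption about the DMLex JSON serialization.
    """
    root = ET.fromstring(xml_text)
    entry = {
        "id": root.get("id"),
        "headword": root.findtext("hw", default="").strip(),
        # One object per part-of-speech label found in the legacy entry.
        "partsOfSpeech": [
            {"tag": pos.text.strip()} for pos in root.findall("pos") if pos.text
        ],
        "senses": [],
    }
    # Senses keep the order in which they appear in the legacy entry.
    for sense in root.findall("sense"):
        definitions = [
            {"text": d.text.strip()} for d in sense.findall("def") if d.text
        ]
        entry["senses"].append({"definitions": definitions})
    return entry


dmlex_entry = legacy_entry_to_dmlex(LEGACY_XML)
print(json.dumps(dmlex_entry, indent=2, ensure_ascii=False))
```

A real conversion would need per-dictionary rules and a validation pass before any Wikibase import; this sketch covers only the simplest headword, part-of-speech and definition mapping.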
