Description
Despite the decreasing use of regional and local varieties of the Dutch language, there is a growing public interest in dialects in the Netherlands and Flanders. Several dialect associations strive to preserve their local dialect by creating lexicons, establishing spelling conventions, writing texts in the dialect, teaching it, and sharing knowledge about it with the general public. This work is done by volunteer dialect enthusiasts with limited resources and very limited technical support.

In 2021, the Dutch Language Union decided to find ways to support this work and asked the Dutch Language Institute to explore the possibility of developing and maintaining a lexical infrastructure that would not only help to create lexical resources but also make the data accessible to dialect users and learners. It was decided to carry out this exploration as a pilot project for Bildts. There is a very active and lively language community in Bildt, as evidenced by a report on Frisian-Dutch contact varieties in Friesland, which highlights the wishes and requirements of smaller language varieties.

The pilot should result in an infrastructure that not only meets the requirements of the Bildts language community but also lays a foundation for future infrastructure development for other dialects. Ultimately, the intended dialect infrastructure should serve as the primary resource for users of language varieties that are often small. Such an infrastructure will not only streamline the inventory and description of language varieties but also make it easier for users to find information on words, spelling, and grammar. The requirements for the Bildts infrastructure were formulated by a steering group. It was decided to focus on written dialect data, and the consensus was that a lexical database was the first priority. The aim is to enable people to learn Bildts.
Lexical Data Editing Environment
The underlying concept is as follows: each word in a Dutch word list is paired with its corresponding word in Bildts. The Dutch word list is based on sources such as Hazenberg et al. (1992), which are aimed at language learners at level B1. For Bildts, an existing dictionary (Buwalda et al., 2013) is used, and its content is treated as immutable. When a word is absent from Buwalda, the platform permits the addition of a new word and automatically assigns it to a new lexicon, the Woordeboek Bildts Aigene (WBA). This approach makes the lexical resource more suitable for dialectal language production and makes lexical resources for different dialects within the infrastructure easier to compare. It is an initial step towards a comprehensive onomasiological resource. The WBA includes an editing environment in which all editing operations are logged for quality control purposes. During the presentation, we will demonstrate the various steps involved in the editing and linking process.

Publication platform
Following the editing phase, the data will be made available online on the websites of the collaborating organizations. The data will be accessed through a search facility based on the Woordwaark platform, which INT will adapt to fit the data and requirements for Bildts. Woordwaark uses a PostgreSQL database to store all relevant data. The website is built with JavaScript and Node.js to provide an interactive experience. In addition to the website, it will also be possible to search the dictionaries through a JSON API, which is useful if other websites or services want to integrate with Woordwaark. The entire system is packaged into one or more Docker images, making it relatively easy to deploy in different environments.
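To make the linking and logging steps described above more concrete, here is a minimal Python sketch. It is not the actual WBA or Woordwaark implementation: the class names, field names, the JSON shape returned by `search`, and the placeholder word strings are all assumptions made for illustration. It shows a Dutch word being paired with Bildts words, a word not found in the read-only Buwalda list being assigned to the WBA lexicon, and every change being recorded in an edit log.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

BUWALDA = "Buwalda"  # existing dictionary, never modified in this sketch
WBA = "WBA"          # Woordeboek Bildts Aigene, receives newly added words


@dataclass(frozen=True)
class BildtsEntry:
    lemma: str
    lexicon: str


@dataclass
class LinkStore:
    """Hypothetical in-memory store pairing Dutch words with Bildts entries."""
    buwalda: frozenset[str]  # lemmas already present in the existing dictionary
    links: dict[str, list[BildtsEntry]] = field(default_factory=dict)
    log: list[dict] = field(default_factory=list)

    def link(self, dutch: str, bildts: str, editor: str) -> BildtsEntry:
        # Words missing from Buwalda are automatically assigned to the WBA.
        lexicon = BUWALDA if bildts in self.buwalda else WBA
        entry = BildtsEntry(bildts, lexicon)
        entries = self.links.setdefault(dutch, [])
        if entry not in entries:
            entries.append(entry)
            # Log each actual change for later quality control.
            self.log.append({
                "time": datetime.now(timezone.utc).isoformat(),
                "editor": editor,
                "action": "link",
                "dutch": dutch,
                "bildts": bildts,
                "lexicon": lexicon,
            })
        return entry

    def search(self, dutch: str) -> str:
        # Assumed JSON shape; the real API response may look different.
        hits = [asdict(e) for e in self.links.get(dutch, [])]
        return json.dumps({"dutch": dutch, "bildts": hits}, ensure_ascii=False)


# Usage with placeholder strings (not real Dutch or Bildts words).
store = LinkStore(buwalda=frozenset({"bildts-word-a"}))
store.link("dutch-word-1", "bildts-word-a", editor="editor-1")  # found in Buwalda
store.link("dutch-word-1", "bildts-word-b", editor="editor-1")  # new, goes to WBA
result = store.search("dutch-word-1")
```

Keeping the lexicon label on each entry, rather than in separate tables, is only one possible design; it is used here because it keeps the origin of every Bildts word visible in the search result.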
Documentation
At the end of the pilot phase, extensive documentation will be made available for other regional language communities interested in adopting the digital infrastructure. This documentation will include a workflow description, a user manual, and a report on the experiences gained during the pilot. We will identify best practices, the partners needed, and their respective roles and tasks, so that other language varieties seeking to use the facility can be assisted more effectively.