8–12 Oct 2024
Hotel Croatia
Europe/Warsaw timezone

Virtual Lexicographic Laboratory as a Tool for Extracting Linguistic Knowledge from the Dictionary Text

9 Oct 2024, 17:00
1h 30m
Tihi salon (Hotel Croatia)



Iryna Ostapova, Yevhen Kupriianov, Mykyta Yablochkov


This paper presents the research potential of the virtual lexicographic laboratory VLL DLE 23, which is based on the text of the Spanish Explanatory Dictionary (DLE 23). Virtual Lexicographic Laboratories (VLLs) are effective tools for dictionary-based linguistic research. The lexicographic text is considered not only as a basis for creating and updating dictionaries but also as a means of professional communication and transfer of linguistic knowledge. This applies primarily to explanatory dictionaries, which are characterized by a detailed, multi-aspect description of language units. This raises the problem of providing such dictionaries with appropriate tools that make it possible to extract any linguistic information from the dictionary text in the course of linguistic research. This paper describes the experience of creating such tools during the implementation of the VLL DLE 23 project, a Virtual Lexicographic Laboratory based on the Spanish dictionary “Diccionario de la lengua española. 23ª edición” (https://dle.rae.es/), published by the Royal Spanish Academy. The current version of the VLL DLE 23 can be accessed at https://svc2.ulif.org.ua/Dics/ResIntSpanish. The VLL DLE 23 project was implemented in three stages. At the first stage, the text, along with its HTML markup, was (partially) extracted from the available online version of the dictionary. At the second stage, the dictionary text was analyzed in order to identify the informational elements of the entries. At the third stage, a model of the L-system was built, which formally represents the DLE 23 informational elements and serves as the basis for creating the database and the interface.
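As a rough illustration of what a formal representation of informational elements can look like, the following Python sketch models a dictionary entry as a small record and counts entries by headword type. It is not the L-system model used in the project: the class names, field names, and the three headword kinds are assumptions made for this example, and the sample values are placeholders rather than text quoted from DLE 23.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sense:
    """One numbered meaning of a headword (hypothetical structure)."""
    number: int
    marks: List[str]      # grammatical or usage labels attached to the sense
    definition: str       # text of the explanation


@dataclass
class Entry:
    """One dictionary entry reduced to a few informational elements."""
    headword: str
    kind: str                         # assumed values: "word", "morpheme", "combination"
    etymology: Optional[str] = None   # None when the entry has no etymology element
    senses: List[Sense] = field(default_factory=list)


def kind_statistics(entries: List[Entry]) -> Counter:
    """Count entries by headword kind."""
    return Counter(entry.kind for entry in entries)


def with_etymology(entries: List[Entry]) -> List[str]:
    """Return the headwords of entries that carry an etymology element."""
    return [entry.headword for entry in entries if entry.etymology is not None]


# Placeholder data for illustration only, not taken from the dictionary.
SAMPLE = [
    Entry("casa", "word", etymology="placeholder etymology",
          senses=[Sense(1, ["f."], "placeholder definition")]),
    Entry("anti-", "morpheme", etymology="placeholder etymology"),
    Entry("a casa", "combination"),
]

if __name__ == "__main__":
    print(kind_statistics(SAMPLE))
    print(with_etymology(SAMPLE))
```

Once each informational element has its own field, statistics and selections over the dictionary reduce to simple queries over the records.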
The current interface makes it possible to generate statistics for the entire dictionary or for a selection of dictionary entries; to study the lexical meanings, etymology, grammar, and usage of Spanish language units; and to create derivative dictionaries based on DLE 23, for example a dictionary of morphemes, a dictionary of homonyms, or a dictionary of word combinations. The VLL DLE 23 interface provides the following modes of work with the dictionary: a) dictionary register; b) dictionary entry profile; c) full-text search.

Dictionary register. This mode allows the user to select a headword either by clicking on it in the list or by typing a sequence of characters that exactly matches the word or word combination they are looking for. The register can be narrowed with the filters “starts with”, “ends with”, “exactly”, and “contains”. This mode resembles working with the online version of the dictionary.

Dictionary entry profile. In this mode the user creates samples of dictionary entries by activating elements of the dictionary entry and selecting the values of these elements. The user can select a specific type and structure of headwords by checking the appropriate box. There is also an additional option to include homonymous words in the sample. The current version makes it possible to study: 1) the types and number of units in the dictionary register: morphemes (prefixes and other affixes), words, and word combinations; 2) word-formation characteristics, i.e., masculine and feminine forms and headword doublets; 3) the phenomena of monosemy, polysemy, and homonymy of headwords; 4) Spanish vocabulary by origin (native and borrowed vocabulary); 5) the presence or absence of word combinations formed with headwords. Figure 1 shows the options that must be selected to create a sample of dictionary entries whose headwords are of foreign origin.

Full-text search. This mode is used to select dictionary entries by certain elements of the DLE 23 metalanguage or by a given text fragment. It is also possible to select dictionary entries whose explanation contains a given fragment. Full-text search can be combined with the “Dictionary entry profile” mode.

Work on expanding the research potential of the VLL DLE 23 is ongoing. In the near future, it is planned to index the remaining elements of the dictionary entries identified at the previous stage of the project. To work with a dictionary text in a digital environment, all of its informational elements that may be of interest to a linguist must be accessible.
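The register filters and the full-text search described above can be sketched in a few lines of Python. This is only an illustration of the matching logic, not the code behind the VLL DLE 23 interface: the function names are hypothetical, case-insensitive comparison is an assumption, and the headwords and entry texts are placeholders.

```python
from typing import Dict, List

# The four register filters named in the abstract.
MODES = ("starts with", "ends with", "exactly", "contains")


def matches(headword: str, pattern: str, mode: str) -> bool:
    """Apply one register filter to a single headword (case-insensitive)."""
    h, p = headword.casefold(), pattern.casefold()
    if mode == "starts with":
        return h.startswith(p)
    if mode == "ends with":
        return h.endswith(p)
    if mode == "exactly":
        return h == p
    if mode == "contains":
        return p in h
    raise ValueError(f"unknown filter mode: {mode!r}")


def filter_register(register: List[str], pattern: str, mode: str) -> List[str]:
    """Return the headwords that pass the chosen filter, in register order."""
    return [h for h in register if matches(h, pattern, mode)]


def full_text_search(entry_texts: Dict[str, str], fragment: str) -> List[str]:
    """Return headwords whose entry text contains the given fragment."""
    needle = fragment.casefold()
    return [h for h, text in entry_texts.items() if needle in text.casefold()]


# Placeholder register and entry texts, for illustration only.
REGISTER = ["casa", "casar", "a casa", "fracaso", "des-"]
ENTRY_TEXTS = {
    "casa": "placeholder text about a building",
    "casar": "placeholder text about a ceremony",
    "fracaso": "placeholder text about a building that collapsed",
}

if __name__ == "__main__":
    for mode in MODES:
        print(mode, filter_register(REGISTER, "cas", mode))
    print(full_text_search(ENTRY_TEXTS, "building"))
```

Combining full-text search with an entry profile would then amount to intersecting the result of `full_text_search` with a selection made on the entry elements.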

