8–12 Oct 2024
Hotel Croatia
Europe/Warsaw timezone

Towards the Automatic Generation of a Pattern-Based Dictionary of Spanish Verbs

11 Oct 2024, 10:00
30m
Bobara Hall (Hotel Croatia)

Speakers

Irene Renau, Rogelio Nazar, Daniel Mora Melanchthon

Description

Corpus Pattern Analysis (CPA) is a technique for identifying the local semantic and syntactic information of a word and mapping it to the word's meanings. For verbs, a pattern consists essentially of the argument structure, with each argument labelled with a semantic type. CPA is used in several dictionary projects and allows systematic corpus analysis; however, it is extremely time-consuming. In this paper, we present a method for the automatic identification of patterns of Spanish verbs in corpora. We used a syntactic parser (Stanza) for dependency analysis and an NER tagger from the Flair NLP framework for named entity recognition; for common nouns, we implemented a semantic tagger and a word sense disambiguation method, both created for the task. All resources were combined to extract CPA verb patterns. The method performs better than previous attempts and can contribute to more efficient pattern-based lexicography.
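
As an illustration only, the combination step described above might be sketched as follows. This is not the authors' implementation: it assumes the sentence has already been parsed into Universal Dependencies style tokens (as a parser such as Stanza would produce) and that named entities already carry labels such as PER or LOC (as an NER tagger would supply). The Token class, the toy SEMANTIC_TYPES lexicon, the NER_TO_TYPE mapping, and the extract_pattern function are all hypothetical names introduced for this sketch; the lexicon stands in for the semantic tagger and word sense disambiguation, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    idx: int                    # 1-based position in the sentence
    lemma: str
    upos: str                   # universal part-of-speech tag
    head: int                   # index of the syntactic head (0 = root)
    deprel: str                 # dependency relation to the head
    ner: Optional[str] = None   # named-entity label, if any


# Hypothetical toy lexicon standing in for a semantic tagger (assumption).
SEMANTIC_TYPES = {"paciente": "Human", "hospital": "Location"}
# Hypothetical mapping from NER labels to semantic types (assumption).
NER_TO_TYPE = {"PER": "Human", "LOC": "Location", "ORG": "Institution"}


def semantic_type(tok: Token) -> str:
    """Prefer the NER label; otherwise look the lemma up in the toy lexicon."""
    if tok.ner in NER_TO_TYPE:
        return NER_TO_TYPE[tok.ner]
    return SEMANTIC_TYPES.get(tok.lemma, "Entity")


def extract_pattern(tokens: List[Token], verb_lemma: str) -> Optional[str]:
    """Build a CPA-like pattern string from the verb's direct dependents."""
    verb = next(
        (t for t in tokens if t.upos == "VERB" and t.lemma == verb_lemma), None
    )
    if verb is None:
        return None
    subj = obj = None
    obliques = []
    for tok in tokens:
        if tok.head != verb.idx:
            continue
        slot = f"[[{semantic_type(tok)}]]"
        if tok.deprel == "nsubj":
            subj = slot
        elif tok.deprel == "obj":
            obj = slot
        elif tok.deprel == "obl":
            # Keep the preposition that introduces the oblique argument.
            prep = next(
                (c.lemma for c in tokens
                 if c.head == tok.idx and c.deprel == "case"),
                None,
            )
            obliques.append(f"{prep} {slot}" if prep else slot)
    parts = [p for p in (subj, verb_lemma, obj) if p] + obliques
    return " ".join(parts)


# Hand-annotated parse of "María trató al paciente en el hospital."
SENTENCE = [
    Token(1, "María", "PROPN", 2, "nsubj", ner="PER"),
    Token(2, "tratar", "VERB", 0, "root"),
    Token(3, "a", "ADP", 5, "case"),
    Token(4, "el", "DET", 5, "det"),
    Token(5, "paciente", "NOUN", 2, "obj"),
    Token(6, "en", "ADP", 8, "case"),
    Token(7, "el", "DET", 8, "det"),
    Token(8, "hospital", "NOUN", 2, "obl"),
    Token(9, ".", "PUNCT", 2, "punct"),
]

print(extract_pattern(SENTENCE, "tratar"))
# [[Human]] tratar [[Human]] en [[Location]]
```

In this sketch the subject and object slots are filled from the nsubj and obj dependents of the verb, and each oblique keeps its preposition, so a single sentence yields one candidate pattern. A real system would need to aggregate such candidates over many corpus sentences and resolve the sense of each noun before assigning a semantic type.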
