Speakers
Description
Corpus Pattern Analysis (CPA) is a technique for identifying the local semantic and syntactic context of a word and mapping it to the word's meanings. For verbs, a pattern consists essentially of the argument structure, with each argument labelled with a semantic type. CPA is used in several dictionary projects and allows systematic corpus analysis; however, it is extremely time-consuming. In this paper, we present a method for the automatic identification of patterns of Spanish verbs in corpora. We used a syntactic parser (Stanza) for dependency analysis and a NER tagger from the Flair NLP framework for named entity recognition; for common nouns, we implemented a semantic tagger and a word sense disambiguation method, both created for this task. All resources were combined to extract CPA verb patterns. The method performs better than previous attempts and can contribute to more efficient pattern-based lexicography.
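As a rough illustration of how such resources might be combined, the following minimal Python sketch assembles a CPA-style pattern from a sentence that has already been dependency-parsed. It is not the method described in the paper: the `Token` class, the `SEMANTIC_TYPES` and `NER_TO_TYPE` tables, and the functions `semantic_type` and `extract_pattern` are all hypothetical names introduced here, and the toy lexicon merely stands in for the semantic tagger and word sense disambiguation step. The token fields mirror Universal Dependencies annotations of the kind a parser such as Stanza produces, and the double square brackets follow the usual CPA notation for semantic types.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int     # 1-based position in the sentence
    lemma: str   # lemma of the word
    upos: str    # Universal POS tag
    head: int    # index of the syntactic head (0 = root)
    deprel: str  # Universal Dependencies relation to the head


# Toy lexicon (assumption): stands in for a semantic tagger plus WSD.
SEMANTIC_TYPES = {"niño": "Human", "manzana": "Food", "libro": "Document"}

# Hypothetical mapping from NER labels to semantic types.
NER_TO_TYPE = {"PER": "Human", "LOC": "Location", "ORG": "Institution"}

# Core argument relations considered, in the order they are printed.
ARGUMENT_RELATIONS = ("nsubj", "obj", "iobj")


def semantic_type(token, ner_labels):
    """Return a semantic type, preferring a NER label over the lexicon."""
    if token.idx in ner_labels:
        return NER_TO_TYPE.get(ner_labels[token.idx], "Entity")
    return SEMANTIC_TYPES.get(token.lemma, "Anything")


def extract_pattern(tokens, verb_idx, ner_labels=None):
    """Build a pattern string from the verb's direct core arguments."""
    ner_labels = ner_labels or {}
    verb = next(t for t in tokens if t.idx == verb_idx)
    slots = {}
    for t in tokens:
        if t.head == verb_idx and t.deprel in ARGUMENT_RELATIONS:
            slots[t.deprel] = semantic_type(t, ner_labels)
    parts = []
    if "nsubj" in slots:
        parts.append(f"[[{slots['nsubj']}]]")
    parts.append(verb.lemma)
    for rel in ("obj", "iobj"):
        if rel in slots:
            parts.append(f"[[{slots[rel]}]]")
    return " ".join(parts)


# "El niño come una manzana" with hand-written dependency annotations.
sentence = [
    Token(1, "el", "DET", 2, "det"),
    Token(2, "niño", "NOUN", 3, "nsubj"),
    Token(3, "comer", "VERB", 0, "root"),
    Token(4, "uno", "DET", 5, "det"),
    Token(5, "manzana", "NOUN", 3, "obj"),
]
print(extract_pattern(sentence, verb_idx=3))  # [[Human]] comer [[Food]]
```

In a full pipeline, the hand-written tokens would come from the parser's output and `ner_labels` from the NER tagger; the sketch only shows the final step of turning labelled arguments into a pattern string.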