8–12 Oct 2024
Hotel Croatia
Europe/Warsaw timezone

The Case of Processing Multi-Word Expressions in English, Croatian and Slovene

9 Oct 2024, 17:00
1h 30m
Tihi salon (Hotel Croatia)


Speakers

Mojca Kompara Lukančić, Frane Malenica, Emilija Mustapić, Jelena Gugić, Jakov Proroković

Description

Multi-word expressions are a heterogeneous linguistic category that makes up a significant part of everyday communication. They include linguistic constructions consisting of more than one word, such as idioms (e.g., kick the bucket), binomial expressions (e.g., bread and butter), phrasal verbs (e.g., turn on/off), fixed/conventionalized expressions (e.g., have a nice day) and collocations (e.g., social media) (Wray, 2002; Schlücker, 2019). In this paper, we present part of the data collected within the project Procesiranje višerječnih izraza u engleskom kao stranom jeziku (Eng. ‘Processing multiword expressions in English as a foreign language’), which aims to examine and compare processing strategies in English as L2 and in Croatian and Slovene as L1. We analyse data on native and non-native processing of nominal compounds in these language pairs by investigating the effects of morphological relatedness, frequency and size of morphological family/pattern (Schreuder & Baayen, 1997; Mattiello & Dressler, 2022) and L2 proficiency (Shantz, 2017). In order to compare the factors that affect native and non-native processing of compounds, we will conduct two masked priming lexical decision experiments (Forster & Davis, 1984) with native speakers of Croatian and Slovene who have moderate or high proficiency in English as L2 (for similar research, see Clahsen et al., 2013; De Cat et al., 2014; 2015; González Alonso et al., 2016). In the first experiment, we examine the potential effects of morphological relatedness in nominal compounds; in the second, we examine the potential effects of schematicity/size of pattern, i.e., whether compounds whose right constituents combine with a higher number of left constituents are processed faster in L1 and L2. The corpus data on which the research relies were collected using the tools available in the Sketch Engine family of language corpora.
The initial phase of the research involved identifying and extracting the compounds used in the experiments: around 50 000 word forms, identified in the corpora on the basis of morphologically conditioned extraction criteria, had to be filtered. The selection criteria for the corpora encompassed factors such as the target language, corpus size, relevance, recency, and the nature of the texts included. In the end, approximately 2000 Croatian and Slovene compounds were retained after extraction from the CLASSLA web corpora (Ljubešić et al., 2024a; 2024b) and subsequent filtering. English compounds were sourced from the ukWaC corpus, and only the 10 000 most frequent ones were retained. Based on the corpus data, we determined the frequency of individual compound constituents and the size of pattern/series, which serve as relevant variables in the experiments. The data on L2 proficiency will be obtained using a proficiency test developed for the purpose of this research. The research presented in this paper aims to answer the following research questions:
1. Does morphological decomposition occur with nominal compounds in Croatian and Slovene as L1 and English as L2?
2. Do frequency measures (frequency of individual constituents, schematicity) affect processing in L1 and L2, and to what extent?
3. Is the effect of L2 proficiency modulated by frequency measures, i.e., are speakers with different levels of L2 proficiency affected by frequency measures in different ways?
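
To make the corpus-derived variables mentioned above more concrete, the following is a minimal Python sketch of how the size of pattern (the number of distinct left constituents attested with a given right constituent) and constituent frequencies might be computed from a list of already segmented compounds. The toy data, the function names and the (left, right, frequency) input format are assumptions made purely for illustration; they do not reproduce the project's actual extraction pipeline or its Sketch Engine queries. Constituent frequency is taken here as the summed frequency of the compounds containing the constituent, which is only one possible operationalization.

```python
from collections import defaultdict

# Hypothetical toy data: (left constituent, right constituent, corpus frequency).
# Real input would come from the filtered corpus extraction.
compounds = [
    ("tooth", "brush", 120),
    ("hair", "brush", 80),
    ("paint", "brush", 45),
    ("tooth", "paste", 95),
    ("tooth", "brush", 30),  # the same compound attested in a second source
]


def pattern_sizes(rows):
    """Map each right constituent to the number of distinct left constituents."""
    lefts = defaultdict(set)
    for left, right, _freq in rows:
        lefts[right].add(left)
    return {right: len(members) for right, members in lefts.items()}


def constituent_frequencies(rows):
    """Sum compound frequencies per constituent, keyed by (position, form)."""
    totals = defaultdict(int)
    for left, right, freq in rows:
        totals[("left", left)] += freq
        totals[("right", right)] += freq
    return dict(totals)


sizes = pattern_sizes(compounds)
freqs = constituent_frequencies(compounds)
```

Collecting left constituents in sets means that repeated attestations of the same compound do not inflate the pattern size, while their frequencies still add up in the constituent totals.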
