
Modelling context for the analysis of language variation and change
Universität des Saarlandes (Germany)
Objectives: Context has a major impact on how we process language. However, the notion of context is very broad. It ranges from broadly conceived pragmatic conditions, i.e. the extra-linguistic context (e.g. socio-cultural factors, genres, time), through the relationship among linguistic elements that can substitute for each other in a given context, i.e. the paradigmatic context, to the local linguistic context of linguistic elements, i.e. the syntagmatic context. While studies on language variation and change do take context into account, their coverage of contextual factors is often relatively limited. This project will apply and further develop computational modelling techniques that integrate contextual factors across the three types of context described above, in order to arrive at more comprehensive accounts of the effects of contextual factors on language variation and change. Sofia will 1) assess the suitability of selected probabilistic methods (word/neural embeddings, information-theoretic models, etc.) for diachronic linguistic analysis given contextual factors; 2) undertake a systematic study and modelling evaluation of different datasets (provided by the USAAR Clarin-D repository) according to different types of contextual factors.
Expected Results:
1) Computational methods for diachronic text analysis.
2) Thorough assessment of computational methods for specific types of data and contextual factors. Sofia will present this research at D7.1 and D7.4.
Planned secondments: VARIENG (Year 3, 4 months; approaches to language variation and change). Also visiting: Netherlands eScience Center (training on deep learning); KULEUVEN (text analytics for lexical and syntactic variation).
