Research Objectives

Four research domains

The individual research projects will target different aspects of longitudinal semantic analysis and, when brought together, will offer a holistic body of knowledge and methodologies. In particular, they will enable us to address semantic change across four research domains allied with the network’s objectives:

1. Design

Theoretical and Practical Innovation in Research Design

The first objective is to assess the impact of research and tool design in longitudinal semantic text analytics, i.e. the environment of design, in order to establish optimal workflows and data visualisations. Under this umbrella, research projects address questions such as how design supports and/or determines the outcomes of language analysis: How data-driven can a data-driven approach be? At what points can human knowledge about historical language data be inserted into data-driven processes or algorithms in order to improve or enhance those processes, for specific research goals? What is the ideal balance between ‘naturally occurring historical language data’ and ‘human intervention/annotation’, for different research goals?

2. Language

Longitudinal Meaning in Lexis and Grammar

The second objective is to optimise the identification and analysis of semantic change with attention to linguistic variables, in the environment of language across time. Research projects will explore changes in how meaning is constructed and communicated in relation to lexical and grammatical contexts, asking: How can we adopt and adapt the lexical and grammatical approaches to meaning that Natural Language Processing has established for contemporary linguistic data, so that they can be applied to historical texts? What additional linguistic variables, including grammar, usage, lexis, and spelling, must be accommodated to analyse meaning efficiently in historical texts? How does the variable of time enhance our understanding of other variables in language data, such as meaning change, or the conditional probabilities of lexical, grammatical, and phrasal co-occurrence?

3. Literature and Text

Longitudinal Meaning in Different Text Types

The third objective is to apply different methods of semantic analysis to questions of text type, genre, and style in the environment of literature across time. Research projects will explore the construction of meaning in relation to domains including the work of a single author, reused text in large historical text collections, and the evolution of different genres, asking questions such as: How do the meanings expressed in texts depend upon text type, genre, and style? How can text type, genre, and style be identified automatically in historical texts, in order to inform the automatic identification of meaning? How can knowledge about text types enhance or improve existing methods of distributional semantics, word embedding, or Machine Learning?

4. History

Longitudinal Meaning in Cultural, Social, Political and Economic Contexts

The fourth objective is to test the performance of state-of-the-art tools for semantic analysis in the domain of history. Research projects will apply multiple tools to one or more longitudinal datasets, pursuing a given research question, offering a detailed account of the tools’ efficacy for historian-led research, and suggesting adaptations to refine the automatic detection of meaning in historical language data. They will consider questions about how meaning is constructed and communicated in relation to real-world cultural, social, political, economic, and material contexts: What elements of meaning, relating to interactional norms, concepts, and contexts (as construed by historians), can be effectively mapped onto linguistic data for Natural Language Processing (NLP) analysis? Conversely, what elements of NLP analysis can be meaningfully mapped onto humanities research questions about meaning in history? Can we apply knowledge of meaning in use, and knowledge of language in social, cultural, and historical contexts, to improve existing algorithms and quantitative approaches to meaning in computational linguistics?