LaTeCH-CLfL 2017 will bring together two established events with a similar research focus: the SIGHUM Workshops on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH) and the ACL Workshops on Computational Linguistics for Literature (CLfL). The LaTeCH workshop series has become a forum for researchers who develop new technologies for improved information access to data from the broadly understood humanities and social sciences. Since the formation of SIGHUM (ACL Special Interest Group on Language Technologies for the Socio-Economic Sciences and Humanities), the LaTeCH workshop has also been the venue for the SIGHUM annual research and business meeting. The CLfL workshops have focussed on applications of NLP to a wide variety of literary data. This joint event will unite researchers from both communities. We hope to broaden the scope, stimulate more collaboration, and open new research perspectives.
Scope and Topics
In the Humanities, Social Sciences, and Cultural Heritage communities, there is increasing interest in and demand for NLP methods for semantic annotation, intelligent linking, discovery, querying, cleaning and visualization of both primary and secondary data; this is even true of primarily non-textual collections, given that text is also the pervasive medium for metadata. Such applications pose new challenges for NLP research, such as noisy, non-standard textual or multi-modal input, historical languages, multilingual parts within one document, lack of digital resources, or resource-intensive approaches that call for (semi-)automatic processing tools and domain adaptation, or, as a last resort, intense manual effort (e.g., annotation). Literary texts bring their own problems, because navigating this form of creative expression requires more than the typical information-seeking tools. Examples of advanced tasks include the study of literature of a certain period or sub-genre, recognition of certain literary devices, or quantitative analysis of poetry. More generally, there is a growing interest in computational models whose results can be interpreted in meaningful ways.
A common forum is mutually beneficial for NLP experts, data specialists, digital humanities researchers, and those who study literature. The first edition of the joint workshop has something for everyone in all these communities. We invite contributions on the following and closely related topics:
- adapting NLP tools to Cultural Heritage, Social Sciences, and the Humanities, including literature;
- fully or semi-automatic creation of semantic resources;
- automatic error detection and cleaning of textual data;
- building and analyzing social networks of literary characters;
- complex annotation schemas, tools and interfaces;
- dealing with linguistic variation and non-standard or historical use of language;
- discourse and narrative analysis/modelling, notably in literature;
- emotion analysis for the humanities and for literature;
- generation of literary narrative, dialogue or poetry;
- identification and analysis of literary genres;
- linking and retrieving information from different sources, media, and domains;
- modelling dialogue and literary style for generation;
- modelling of information and knowledge in the Humanities, Social Sciences, and Cultural Heritage;
- profiling and authorship attribution;
- research infrastructure and standardisation efforts in the Humanities, Social Sciences, and Cultural Heritage;
- searching for scientific and/or scholarly literature.