In the Humanities, Social Sciences, and Cultural Heritage communities, there is increasing interest in, and demand for, NLP methods for semantic annotation, intelligent linking, discovery, querying, cleaning and visualization of both primary and secondary data. This is true even of primarily non-textual collections, given that text is also the pervasive medium for metadata. Such applications pose new challenges for NLP research: noisy, non-standard textual or multi-modal input, historical languages, vague research concepts, multilingual parts within one document, and so on. Digital resources are also often insufficient; resource-intensive approaches call for (semi-)automatic processing tools and domain adaptation, or, as a last resort, intense manual effort (e.g., annotation).
Literary texts bring their own problems, because navigating this form of creative expression requires more than the typical information-seeking tools. Examples of advanced tasks include the study of literature of a certain period or sub-genre, recognition of certain literary devices, or quantitative analysis of poetry.
More generally, there is a growing interest in computational models whose results can be interpreted in meaningful ways. It is, therefore, of mutual benefit that NLP experts, data specialists and Digital Humanities researchers who work in and across their domains get involved in the Computational Linguistics community and present their fundamental or applied research results. Cross-disciplinary exchange has already been shown not only to support work in the Humanities, Social Sciences, and Cultural Heritage communities but also to encourage the Computational Linguistics community to build richer and more effective tools and models.
Topics of interest include, but are not limited to, the following:
- adaptation of NLP tools to Cultural Heritage, Social Sciences, and the Humanities, including literature;
- automatic error detection and cleaning of textual data;
- complex annotation schemas, tools and interfaces;
- creation (fully or semi-automatic) of semantic resources;
- creation and analysis of social networks of literary characters;
- discourse and narrative analysis/modelling, notably in literature;
- emotion analysis for the humanities and for literature;
- generation of literary narrative, dialogue or poetry;
- identification and analysis of literary genres;
- linking and retrieving information from different sources, media, and domains;
- modelling the literary style of dialogue for generation;
- modelling of information and knowledge in the Humanities, Social Sciences, and Cultural Heritage;
- profiling and authorship attribution;
- search for scientific and/or scholarly literature;
- work with linguistic variation and non-standard or historical use of language.
Authors will be invited to submit papers on original, unpublished work in any topic area of the workshop. Moreover, we will solicit demos that will present specific solutions, user scenarios and use cases. The workshop will also be the venue of the annual SIGHUM business meeting.