The news media landscape tends to focus on long-running narratives. Correctly processing new information, therefore, requires considering multiple lenses when analyzing media content. Traditionally, it would have been considered sufficient to extract the topics or entities contained in a text in order to classify it, but today it is important to also look at more sophisticated annotations related to fine-grained geolocation, events, stories and the relations between them. In order to leverage such lenses, we propose a new corpus that offers a diverse set of annotations over texts collected from multiple media sources. We also showcase the framework used for creating the corpus, as well as how the information from the various lenses can be used in order to support different use cases in the EU project InVID for verifying the veracity of online video.
Keywords: Corpus, Named Entity Linking, Geosemantics, Event Detection, Information Extraction, Natural Language Processing, Fake News