2023). “Orbis Annotator: An Open Source Toolkit for the Efficient Annotation and Refinement of Text Corpora”. Proceedings of the 4th Conference on Language, Data and Knowledge (LDK 2023), Vienna, Austria.
(Annotated language data plays an important role in training, fine-tuning and evaluating natural language processing components. Nevertheless, manually annotating language data is still a cumbersome task. This paper presents the Orbis Annotator framework, a user-friendly, easy to install, web-based software that supports users in efficiently annotating language data. Orbis Annotator supports standard and collaborative workflows, reuse of language resources through corpus versioning, and provides built-in tools for assessing corpus quality. In addition, it offers an API which enables the use of different clients (e.g., web-based, command line, etc.) and the use of third-party tools that accelerate the annotation process by pre-annotating corpora. The paper concludes with an evaluation that compares its features to other open-source annotation frameworks and the description of two use cases that outline its use in more sophisticated settings.)