Generic High-Throughput Methods for Multilingual Sentiment Detection

Citation

Gindl, Stefan; Scharl, Arno; and Weichselbraun, Albert (2010). Generic High-Throughput Methods for Multilingual Sentiment Detection. 4th IEEE International Conference on Digital Ecosystems and Technologies, Dubai, United Arab Emirates.

Abstract

Digital ecosystems typically involve a large number of participants from different sectors who generate rapidly growing archives of unstructured text. Measuring the frequency of certain terms to determine the popularity of a topic is comparatively straightforward. Detecting sentiment expressed in user-generated electronic content is more challenging, especially in digital ecosystems comprising heterogeneous sets of multilingual documents. This paper describes the use of language-specific grammar patterns and multilingual tagged dictionaries to detect sentiment in German and English document repositories. Digital ecosystems may contain millions of frequently updated documents, requiring sentiment detection methods that maximize throughput. The ideal combination of high-throughput techniques and more accurate (but slower) approaches depends on the specific requirements of an application. To accommodate a wide range of possible applications, this paper presents (i) an adaptive method that balances accuracy and scalability when processing multilingual textual sources, (ii) a generic approach for generating language-specific grammar patterns and multilingual tagged dictionaries, and (iii) an extensive evaluation verifying the method's performance based on Amazon product reviews and user evaluations from Sentiment Quiz, a game-with-a-purpose that invites users of the Facebook social networking platform to assess the sentiment of individual sentences.
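
To make the general idea concrete, the following is a minimal sketch of dictionary-based sentiment scoring with an optional pattern step. It is not the authors' implementation: the tiny English and German word lists, the negation triggers, and the names `LEXICON`, `NEGATORS`, and `score` are all hypothetical and chosen only for illustration. The `use_patterns` flag stands in for the trade-off the abstract mentions, where a plain dictionary lookup is faster and a pattern-aware pass is more accurate.

```python
import re

# Hypothetical tagged dictionaries: term -> sentiment value (+1 or -1).
LEXICON = {
    "en": {"good": 1, "great": 1, "bad": -1, "poor": -1},
    "de": {"gut": 1, "toll": 1, "schlecht": -1, "schwach": -1},
}

# Hypothetical language-specific negation triggers (a very simple "pattern").
NEGATORS = {
    "en": {"not", "never", "no"},
    "de": {"nicht", "kein", "keine", "nie"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def score(text, lang, use_patterns=True):
    """Sum the sentiment values of all dictionary terms found in the text.

    With use_patterns=True, a sentiment term directly preceded by a
    negation trigger has its value inverted. With use_patterns=False,
    only the plain dictionary lookup is performed.
    """
    lexicon = LEXICON[lang]
    negators = NEGATORS[lang]
    tokens = tokenize(text)
    total = 0
    for i, token in enumerate(tokens):
        value = lexicon.get(token, 0)
        if value and use_patterns and i > 0 and tokens[i - 1] in negators:
            value = -value
        total += value
    return total


if __name__ == "__main__":
    print(score("The camera is not good", "en"))
    print(score("The camera is not good", "en", use_patterns=False))
    print(score("Das Buch ist nicht schlecht", "de"))
```

In this sketch, adding a language only requires supplying a new dictionary and trigger list, which loosely mirrors the generic, language-specific resources the abstract describes. Real grammar patterns would need to handle cases this one-token lookback misses, such as negation that follows the verb in German.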

Downloads and Resources

  1. Reference (BibTeX)
  2. Full Article