Engineering Crowdsourced Stream Processing Systems

Carlos Castillo; Ioanna Lykourentzou; Muhammad Imran; Yannick Naudet

arxiv: 1310.5463 · v3 · pith:PMXUWERCnew · submitted 2013-10-21 · 💻 cs.DB · cs.AI · cs.SE

Muhammad Imran, Ioanna Lykourentzou, Yannick Naudet, Carlos Castillo

classification 💻 cs.DB · cs.AI · cs.SE

keywords processing · system · stream · systems · data · design · crowdsourced · human

Abstract

A crowdsourced stream processing system (CSP) is a system that incorporates crowdsourced tasks in the processing of a data stream. This can be seen as enabling crowdsourcing work to be applied on a sample of large-scale data at high speed, or equivalently, enabling stream processing to employ human intelligence. It also leads to a substantial expansion of the capabilities of data processing systems. Engineering a CSP system requires the combination of human and machine computation elements. From a general systems theory perspective, this means taking into account inherited as well as emerging properties from both these elements. In this paper, we position CSP systems within a broader taxonomy, outline a series of design principles and evaluation metrics, present an extensible framework for their design, and describe several design patterns. We showcase the capabilities of CSP systems by performing a case study that applies our proposed framework to the design and analysis of a real system (AIDR) that classifies social media messages during time-critical crisis events. Results show that compared to a pure stream processing system, AIDR can achieve a higher data classification accuracy, while compared to a pure crowdsourcing solution, the system makes better use of human workers by requiring much less manual work effort.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Modeling Human Annotation Errors to Design Bias-Aware Systems for Social Stream Processing
cs.SI · 2019-07 · unverdicted · novelty 5.0

Annotation quality for social media crisis posts depends on the order in which posts are presented to human annotators, and an active learning algorithm can mitigate some of the resulting errors to improve classifier accuracy.