The shift of communication to online platforms brings social and economic benefits, such as the opportunity to share opinions, discuss current topics, receive immediate feedback, and create new business opportunities. However, it also opens a new space for malicious behavior, such as spreading hateful comments, bullying, or using rude and obscene language. Detecting such behavior is therefore crucial. For example, the German Network Enforcement Act (NetzDG) requires social media providers like Facebook and Google, among others, to remove posts with obviously illegal content within 24 hours and to report on their progress every six months. The sheer mass of postings shared on social media calls for an algorithmic approach to screening and flagging potentially harmful content. Corresponding screening technologies help social media providers comply with legislation, raise the efficiency of content moderation, and contribute to social welfare by preventing the spread of malicious user-generated content.
In this paper, Elizaveta Zinovyeva, Wolfgang Karl Härdle, and BCCP Senior Fellow Stefan Lessmann investigate the potential of deep learning for detecting antisocial online behavior (AOB), which they propose as an umbrella term for rude, hateful, sexist, and/or racist textual content in communication. Using datasets from different social media platforms, the authors show that deep learning models, especially deep pre-trained transformers, almost always outperform traditional machine learning methods, regardless of dataset structure and retrieval process. To derive policy recommendations, the paper also investigates the determinants of deep learning success and finds that the marginal utility of computationally heavy deep learning algorithms decreases as the prevalence of AOB in a training dataset increases. Given that AOB is, fortunately, a relatively rare event in the plethora of text-based social media communication, this result implies that platform providers will benefit from, and should employ, advanced deep learning techniques when deploying content screening systems.
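To make the comparison concrete, the sketch below shows what a typical traditional machine learning baseline of the kind the transformers are benchmarked against might look like, assuming scikit-learn is available. The toy texts and labels are illustrative assumptions, not data from the study.

```python
# Illustrative traditional ML baseline for AOB detection:
# bag-of-words features (TF-IDF) plus a linear classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for a labeled social media dataset (not from the paper).
texts = [
    "great discussion, thanks for sharing your view",
    "you are a worthless idiot",
    "interesting point, I agree with you",
    "shut up, nobody wants you here",
    "looking forward to the next post",
    "go away, you disgusting troll",
]
labels = [0, 1, 0, 1, 0, 1]  # 1 = antisocial, 0 = benign

# class_weight="balanced" up-weights the minority class, reflecting that
# AOB is typically a rare event in real social media streams.
baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(class_weight="balanced"),
)
baseline.fit(texts, labels)
train_acc = baseline.score(texts, labels)
print(f"training accuracy: {train_acc:.2f}")
```

Baselines of this kind need little compute, which is relevant to the paper's finding that heavier deep learning pays off most when AOB examples are scarce.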
Moreover, the authors address the question of detection model explainability. Machine learning models, especially deep learning methods, are often considered "black boxes." Legal frameworks, such as the EU General Data Protection Regulation (GDPR), and research into technology acceptance call for the use of interpretable models in decision support. Against this background, the authors demonstrate local interpretation methods that help clarify the decision logic of an AOB detection model. The authors suggest that the model interpretation component of their framework raises acceptance among users and regulators and contributes to identifying unintended bias in algorithmic recommendations.
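As a rough illustration of what a local interpretation method does, the sketch below implements a simple occlusion-based explanation: each token of one input is removed in turn, and the drop in the predicted AOB probability is recorded. This is not necessarily the method used in the paper, and the toy classifier is an illustrative stand-in for the authors' deep model.

```python
# Occlusion-based local explanation for a single prediction.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for a trained AOB detector (not the paper's model or data).
texts = [
    "great discussion, thanks for sharing your view",
    "you are a worthless idiot",
    "interesting point, I agree with you",
    "shut up, nobody wants you here",
    "looking forward to the next post",
    "go away, you disgusting troll",
]
labels = [0, 1, 0, 1, 0, 1]  # 1 = antisocial, 0 = benign

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

def explain_locally(text, model):
    """Rank the tokens of one input by how much their removal
    lowers the predicted probability of the antisocial class."""
    base = model.predict_proba([text])[0][1]  # P(antisocial) for full text
    tokens = text.split()
    scores = []
    for i in range(len(tokens)):
        occluded = " ".join(tokens[:i] + tokens[i + 1:])
        drop = base - model.predict_proba([occluded])[0][1]
        scores.append((tokens[i], drop))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranking = explain_locally("you are a worthless idiot", model)
for token, importance in ranking:
    print(f"{token:>10s}  {importance:+.3f}")
```

Token-level attributions of this kind are what allow moderators and regulators to see which words drove a flagging decision, and to spot unintended bias, e.g. when identity terms rather than abusive terms dominate the explanation.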
The full paper "Antisocial Online Behavior Detection Using Deep Learning" is published in Decision Support Systems. A pre-print, together with the code, is available in the discussion paper series of the IRTG1792.