Subtopic Deep Dive

Hate Speech Detection on Social Media
Research Guide

What is Hate Speech Detection on Social Media?

Hate speech detection on social media is the development of machine learning models and annotation frameworks that identify and classify hate speech on platforms like Twitter.

Researchers address multilingual detection and annotator bias using datasets from SemEval tasks. Basile et al. (2019) organized SemEval-2019 Task 5 for hate speech against immigrants and women in English and Spanish Twitter data (839 citations). Zampieri et al. (2020) presented OffensEval 2020 for multilingual offensive language identification (378 citations).

Why It Matters

Models from this subtopic help platforms moderate toxic content and can inform legal regulation of harmful speech. Malmasi and Zampieri (2017) established lexical baselines for distinguishing hate speech from profanity using supervised classification (281 citations). Kiritchenko et al. (2021) surveyed ethical and human rights issues in abusive language detection, noting that abusive content can cause severe psychological and physical harm (57 citations). In deployment, such classifiers support content moderation policies on Twitter and similar platforms.

Key Research Challenges

Multilingual Detection Variability

Hate speech detection struggles across languages due to linguistic nuances and scarce non-English data. Basile et al. (2019) showed varying performance in SemEval-2019 Task 5 for Spanish and English Twitter hate speech. Zampieri et al. (2020) noted hierarchical taxonomy challenges in OffensEval 2020 across multiple languages.
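Per-language variability is usually made visible by scoring each language separately. The sketch below computes macro-averaged F1, a metric commonly reported for shared tasks of this kind, for each language in a set of predictions. The function names, the HS/NOT label names, and the toy rows are assumptions made for illustration; the numbers are not SemEval results.

```python
from collections import defaultdict


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over the classes seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def macro_f1_by_language(rows):
    """rows: iterable of (language, gold_label, predicted_label) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for lang, gold, pred in rows:
        grouped[lang][0].append(gold)
        grouped[lang][1].append(pred)
    return {lang: macro_f1(g, p) for lang, (g, p) in grouped.items()}


# Invented predictions, for illustration only (not SemEval data).
rows = [
    ("en", "HS", "HS"), ("en", "NOT", "NOT"), ("en", "HS", "NOT"), ("en", "NOT", "NOT"),
    ("es", "HS", "HS"), ("es", "NOT", "HS"), ("es", "HS", "NOT"), ("es", "NOT", "NOT"),
]
scores = macro_f1_by_language(rows)
for lang in sorted(scores):
    print(lang, round(scores[lang], 3))
```

Macro averaging weights every class equally, so a model that ignores the minority hate class is penalized even when overall accuracy looks high.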

Annotator Bias in Labeling

Subjective interpretations lead to inconsistent annotations in hate speech datasets. Kiritchenko et al. (2021) discussed ethical concerns in annotation from human rights perspectives. Chiril et al. (2020) demonstrated how reported sexist acts may not be classified as sexist due to context.
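One common way to quantify annotation inconsistency is a chance-corrected agreement statistic such as Cohen's kappa. The sketch below is a minimal implementation for two annotators; the label names and the ten example labels are invented for illustration and do not come from any of the cited datasets.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Share of items on which the two annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each annotator labeled at random with
    # their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two annotators on ten posts (invented data).
annotator_1 = ["HS", "HS", "NOT", "NOT", "HS", "NOT", "NOT", "HS", "NOT", "NOT"]
annotator_2 = ["HS", "NOT", "NOT", "NOT", "HS", "NOT", "HS", "HS", "NOT", "NOT"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Here the annotators agree on 8 of 10 posts, but because roughly half of that agreement is expected by chance, kappa comes out noticeably lower than the raw 80% figure.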

Distinguishing Profanity from Hate

Models often confuse general profanity with targeted hate speech. Malmasi and Zampieri (2017) applied supervised classification to establish lexical baselines for separating the two on social media data. Manolescu and Çöltekin (2021) annotated a Romanian Twitter dataset of offensive language with guidelines designed to support comparable multilingual research.
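To make the idea of a lexical baseline concrete, the sketch below trains a three-way classifier on character n-grams. It is not the method of Malmasi and Zampieri (2017): it swaps in a small multinomial Naive Bayes written in plain Python, and the class names, the GROUPX placeholder token, and the six training sentences are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Lowercased character n-grams, with one padding space at each end."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayesBaseline:
    """Multinomial Naive Bayes with add-one smoothing over character n-grams."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for gram in char_ngrams(text):
                if gram in self.vocab:  # ignore n-grams never seen in training
                    score += math.log((counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data; GROUPX stands in for a targeted group.
texts = [
    "GROUPX people should be banned from here",
    "send all GROUPX people away",
    "this damn traffic is the worst",
    "what a damn awful game",
    "lovely weather for a walk today",
    "the match starts at noon today",
]
labels = ["HATE", "HATE", "OFFENSIVE", "OFFENSIVE", "OK", "OK"]

model = NaiveBayesBaseline().fit(texts, labels)
for post in ["damn this game", "GROUPX people go away", "lovely walk today"]:
    print(post, "->", model.predict(post))
```

A purely lexical model like this keys on surface strings, which is exactly why profanity without a target and hate speech without profanity are hard for it to separate on real data.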

Essential Papers

SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter

Valerio Basile, Cristina Bosco, Elisabetta Fersini et al. · 2019 · 839 citations

The paper describes the organization of the SemEval 2019 Task 5 about the detection of hate speech against immigrants and women in Spanish and English messages extracted from Twitter. The task is o...

SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)

Marcos Zampieri, Preslav Nakov, Sara Rosenthal et al. · 2020 · 378 citations

We present the results and main findings of SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2020). The task involves three subtasks corresponding ...

Detecting Hate Speech in Social Media

Shervin Malmasi, Marcos Zampieri · 2017 · 281 citations

In this paper we examine methods to detect hate speech in social media, while distinguishing this from general profanity. We aim to establish lexical baselines for this task by applying supervised c...

Confronting Abusive Language Online: A Survey from the Ethical and Human Rights Perspective

Svetlana Kiritchenko, Isar Nejadgholi, Kathleen C. Fraser · 2021 · Journal of Artificial Intelligence Research · 57 citations

The pervasiveness of abusive content on the internet can lead to severe psychological and physical harm. Significant effort in Natural Language Processing (NLP) research has been devoted to address...

He said “who’s gonna take care of your children when you are at ACL?”: Reported Sexist Acts are Not Sexist

Patricia Chiril, Véronique Moriceau, Farah Benamara et al. · 2020 · 19 citations

ROFF - A Romanian Twitter Dataset for Offensive Language

Mihai Manolescu, Çağrı Çöltekin · 2021 · 5 citations

This paper describes the annotation process of an offensive language data set for Romanian on social media. To facilitate comparable multi-lingual research on offensive language, the annotation guid...

Profiling Hate Speech Spreaders on Twitter

Francisco Rangel, Berta Chulvi, Gretel Liz De la Peña et al. · 2021 · Zenodo (CERN European Organization for Nuclear Research) · 4 citations

Hate speech (HS) is commonly defined as any communication that disparages a person or a group on the basis of some characteristic such as race, colour, ethnicity, gender, sexual orientation, n...

Reading Guide

Foundational Papers

No pre-2015 foundational papers available; start with Malmasi and Zampieri (2017) for lexical baselines on hate vs. profanity detection.

Recent Advances

Usman et al. (2025) on LLM-based multilingual detection; Rangel et al. (2021) on profiling hate spreaders; Doğan et al. (2023) on deceiving detection models for privacy.

Core Methods

SemEval annotation tasks (Basile et al., 2019; Zampieri et al., 2020), supervised classification, hierarchical OLID schema, and LLM fine-tuning (Usman et al., 2025).
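The hierarchical OLID schema can be pictured as a three-level label in which each lower level applies only when the level above allows it. The sketch below encodes that constraint, assuming the standard OLID label names (OFF/NOT, TIN/UNT, IND/GRP/OTH); the OlidLabel class and its path helper are hypothetical names introduced here, not part of any released toolkit.

```python
from dataclasses import dataclass
from typing import Optional

LEVEL_A = {"OFF", "NOT"}         # offensive / not offensive
LEVEL_B = {"TIN", "UNT"}         # targeted insult or threat / untargeted
LEVEL_C = {"IND", "GRP", "OTH"}  # individual / group / other


@dataclass(frozen=True)
class OlidLabel:
    """One post's labels; lower levels exist only when the level above allows."""
    level_a: str
    level_b: Optional[str] = None
    level_c: Optional[str] = None

    def __post_init__(self):
        if self.level_a not in LEVEL_A:
            raise ValueError(f"unknown level A label: {self.level_a!r}")
        if self.level_a == "NOT":
            if self.level_b is not None or self.level_c is not None:
                raise ValueError("NOT posts carry no level B or C label")
            return
        if self.level_b not in LEVEL_B:
            raise ValueError("OFF posts need a level B label (TIN or UNT)")
        if self.level_b == "UNT":
            if self.level_c is not None:
                raise ValueError("UNT posts carry no level C label")
            return
        if self.level_c not in LEVEL_C:
            raise ValueError("TIN posts need a level C label (IND, GRP or OTH)")

    def path(self):
        """Labels from the top level down, skipping levels that do not apply."""
        return tuple(x for x in (self.level_a, self.level_b, self.level_c) if x)


print(OlidLabel("NOT").path())
print(OlidLabel("OFF", "UNT").path())
print(OlidLabel("OFF", "TIN", "GRP").path())
```

Validating at construction time keeps impossible combinations, such as a non-offensive post with a target, out of a dataset before any model is trained on it.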

How PapersFlow Helps You Research Hate Speech Detection on Social Media

Discover & Search

Research Agent uses searchPapers and exaSearch to find SemEval datasets like Basile et al. (2019), then citationGraph reveals 839 citing works and findSimilarPapers uncovers related multilingual tasks such as Zampieri et al. (2020).

Analyze & Verify

Analysis Agent applies readPaperContent to extract annotation guidelines from Basile et al. (2019), verifies model claims with verifyResponse (CoVe), and uses runPythonAnalysis to check reported SemEval F1-scores statistically. GRADE grading assesses evidence strength in offensive language taxonomies.

Synthesize & Write

Synthesis Agent detects gaps in multilingual coverage from papers like Usman et al. (2025) and flags contradictions in bias handling. Writing Agent uses latexEditText, latexSyncCitations for Basile et al. (2019), and latexCompile to produce a review paper with exportMermaid diagrams of evaluation pipelines.

Use Cases

"Reproduce SemEval-2019 hate speech F1-scores on Twitter data"

Research Agent → searchPapers (SemEval-2019) → Analysis Agent → readPaperContent (Basile et al.) → runPythonAnalysis (NumPy/pandas on metrics) → researcher gets plotted confusion matrices and verified baselines.

"Draft LaTeX survey on multilingual hate detection challenges"

Synthesis Agent → gap detection (Zampieri et al. 2020 gaps) → Writing Agent → latexEditText (add sections) → latexSyncCitations (10 papers) → latexCompile → researcher gets compiled PDF with cited SemEval tasks.

"Find GitHub repos for OffensEval 2020 models"

Research Agent → searchPapers (OffensEval) → Code Discovery → paperExtractUrls (Zampieri et al.) → paperFindGithubRepo → githubRepoInspect → researcher gets inspected code, datasets, and training scripts.

Automated Workflows

The Deep Research workflow conducts a systematic review of 50+ SemEval-related papers, chaining searchPapers → citationGraph → structured report on the evolution of hate speech detection. DeepScan applies a 7-step analysis with CoVe checkpoints to compare the results of Basile et al. (2019) with LLM-based approaches such as Usman et al. (2025). Theorizer generates theory on annotator bias propagation from Kiritchenko et al. (2021) and Chiril et al. (2020).

Frequently Asked Questions

What is Hate Speech Detection on Social Media?

It uses ML models to classify hate speech on platforms like Twitter, addressing multilingual and bias issues (Basile et al., 2019).

What are key methods in this subtopic?

Supervised classification distinguishes hate speech from profanity (Malmasi and Zampieri, 2017), and hierarchical taxonomies structure offensive language labels in SemEval tasks (Zampieri et al., 2020).

What are major papers?

Basile et al. (2019, 839 citations) on SemEval-2019; Zampieri et al. (2020, 378 citations) on OffensEval; Malmasi and Zampieri (2017, 281 citations) on baselines.

What are open problems?

Ethical annotation biases, low-resource languages, and deployment privacy issues persist (Kiritchenko et al., 2021; Doğan et al., 2023).


Part of the Freedom of Expression and Defamation Research Guide