Targeted Hate Speech Classification
Research Guide

What is Targeted Hate Speech Classification?

Targeted Hate Speech Classification categorizes hate speech by the group it targets, where groups are defined by attributes such as race, gender, religion, or ethnicity. It relies on fine-grained taxonomies, entity recognition, and hierarchical classifiers.

This subtopic covers two related problems: distinguishing hate speech from merely offensive language, and identifying the target, such as immigrants, women, or racial groups. Key datasets include the SemEval-2019 Task 5 data on hate speech against immigrants and women (Basile et al., 2019, 839 citations) and the Offensive Language Identification Dataset (OLID) from OffensEval (Zampieri et al., 2019, 673 citations). Over 20 papers from 2012-2019 focus on datasets, bias, and classification challenges.

15 curated papers · 3 key challenges

Why It Matters

Targeted classification lets platforms apply moderation policies that depend on who is being targeted. Burnap and Williams (2015, 581 citations), for example, model cyber hate on Twitter to support policy and decision making. It also supports work on racial bias in detectors: Sap et al. (2019, 737 citations) show that annotators' insensitivity to dialect differences can produce models that harm minority populations. Applications include tailored interventions on social media, an area that connects to the wider literature on political polarization and disinformation (Tucker et al., 2018, 1129 citations).

Key Research Challenges

Distinguishing Hate from Offensive Language

Lexical detection methods have low precision because they tend to label every message containing particular terms as hate speech, whether it is hateful or merely offensive (Davidson et al., 2017, 2347 citations). Subtle variations in language also challenge binary classifiers (MacAvaney et al., 2019, 541 citations). Surveys highlight that definitions of hate speech are inconsistent across datasets (Schmidt and Wiegand, 2017, 1340 citations).
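The low-precision problem can be made concrete with a toy example. The sketch below is an illustration only, not code from Davidson et al.: the lexicon entries, the four example texts, and their labels are hypothetical placeholders. It treats every lexicon match as hate and then measures precision against three-way labels (hate, offensive, neither).

```python
# Toy illustration: a lexicon match cannot separate hate from merely offensive text.
# LEXICON, EXAMPLES, and all labels are hypothetical placeholders.
LEXICON = {"slur_a", "slur_b"}

EXAMPLES = [
    ("those slur_a should all be deported", "hate"),
    ("lol my slur_a friends are the best", "offensive"),
    ("this ref is a slur_b joke", "offensive"),
    ("nice weather today", "neither"),
]

def lexicon_flags(text, lexicon=LEXICON):
    """Return True if any whitespace-separated token is in the lexicon."""
    return any(tok in lexicon for tok in text.lower().split())

def precision_for_hate(examples):
    """Precision of the detector when every lexicon match is treated as hate."""
    flagged = [label for text, label in examples if lexicon_flags(text)]
    if not flagged:
        return 0.0
    return sum(label == "hate" for label in flagged) / len(flagged)

print(round(precision_for_hate(EXAMPLES), 2))  # 0.33: one of three flagged texts is hate
```

In this toy data the detector flags three texts but only one is labeled hate, which is the failure pattern a three-way label scheme is meant to expose.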

Racial and Dialect Bias in Models

When annotators are insensitive to dialect differences, models learn to associate surface features of minority dialects with hate, which can amplify harm against minority populations (Sap et al., 2019, 737 citations). Crowdsourced datasets of abusive behavior show skewed labeling (Founta et al., 2018, 560 citations). Embeddings learned from user comment data (Djuric et al., 2015, 714 citations) can inherit such biases.
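One common way to audit a detector for this kind of bias is to compare false positive rates across groups. The sketch below is a generic audit under stated assumptions, not the exact procedure of Sap et al.: the group names and the six records are hypothetical, and each record is assumed to carry a dialect group, a gold label, and a model prediction.

```python
from collections import defaultdict

def false_positive_rate_by_group(records):
    """Compute the false positive rate per group.

    records: iterable of (group, gold_is_toxic, predicted_is_toxic).
    FPR = FP / (FP + TN), computed over items whose gold label is non-toxic.
    Groups with no non-toxic gold items are omitted from the result.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, gold, pred in records:
        if not gold:
            negatives[group] += 1
            if pred:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}

# Hypothetical predictions; "dialect_a" and "dialect_b" are placeholder groups.
records = [
    ("dialect_a", False, True),
    ("dialect_a", False, False),
    ("dialect_a", True, True),
    ("dialect_b", False, False),
    ("dialect_b", False, False),
    ("dialect_b", True, True),
]
rates = false_positive_rate_by_group(records)
print(rates)  # {'dialect_a': 0.5, 'dialect_b': 0.0}
```

A large gap between groups on non-toxic text suggests the model is reacting to dialect markers rather than to hateful content.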

Fine-Grained Target Identification

Shared tasks require systems to detect and sub-classify hate against immigrants or women in multilingual Twitter data (Basile et al., 2019, 839 citations). Hierarchical categorization of offensive language struggles with implicit targeting (Zampieri et al., 2019, 673 citations). Early work focused on a single target, such as tweets against Black people, and lacked broad taxonomies (Kwok and Wang, 2013, 452 citations).
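The hierarchical scheme used in OLID can be sketched as a small data structure. The three levels and their label codes (OFF/NOT, TIN/UNT, IND/GRP/OTH) follow the OffensEval annotation scheme; the `OlidLabel` type, the `is_valid` and `classify` helpers, and the lambda classifiers are hypothetical names introduced here for illustration.

```python
from typing import Callable, NamedTuple, Optional

class OlidLabel(NamedTuple):
    level_a: str                    # "OFF" (offensive) or "NOT"
    level_b: Optional[str] = None   # "TIN" (targeted insult) or "UNT" (untargeted)
    level_c: Optional[str] = None   # "IND" (individual), "GRP" (group), "OTH" (other)

def is_valid(label: OlidLabel) -> bool:
    """Check that lower levels are set only when the level above allows it."""
    if label.level_a not in {"OFF", "NOT"}:
        return False
    if label.level_a == "NOT":
        return label.level_b is None and label.level_c is None
    if label.level_b not in {"TIN", "UNT"}:
        return False
    if label.level_b == "UNT":
        return label.level_c is None
    return label.level_c in {"IND", "GRP", "OTH"}

def classify(text: str,
             is_offensive: Callable[[str], bool],
             is_targeted: Callable[[str], bool],
             target_type: Callable[[str], str]) -> OlidLabel:
    """Cascade three classifiers, stopping as soon as a level does not apply."""
    if not is_offensive(text):
        return OlidLabel("NOT")
    if not is_targeted(text):
        return OlidLabel("OFF", "UNT")
    return OlidLabel("OFF", "TIN", target_type(text))

# Stand-in classifiers; real systems would plug in trained models here.
label = classify(
    "example text",
    is_offensive=lambda t: True,
    is_targeted=lambda t: True,
    target_type=lambda t: "GRP",
)
print(label, is_valid(label))
```

A cascade like this makes errors compound: a mistake at the first level removes any chance of identifying the target, which is one reason implicit targeting is hard.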

Essential Papers

1. Automated Hate Speech Detection and the Problem of Offensive Language

Thomas Davidson, Dana Warmsley, Michael W. Macy et al. · 2017 · Proceedings of the International AAAI Conference on Web and Social Media · 2.3K citations

A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision...

2. A Survey on Hate Speech Detection using Natural Language Processing

Anna Schmidt, Michael Wiegand · 2017 · 1.3K citations

This paper presents a survey on hate speech detection. Given the steadily growing body of social media content, the amount of online hate speech is also increasing. Due to the massive scale of the ...

3. Social Media, Political Polarization, and Political Disinformation: A Review of the Scientific Literature

Joshua A. Tucker, Andrew M. Guess, Pablo Barberá et al. · 2018 · SSRN Electronic Journal · 1.1K citations

4. SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter

Valerio Basile, Cristina Bosco, Elisabetta Fersini et al. · 2019 · 839 citations

The paper describes the organization of the SemEval 2019 Task 5 about the detection of hate speech against immigrants and women in Spanish and English messages extracted from Twitter. The task is o...

5. The Risk of Racial Bias in Hate Speech Detection

Maarten Sap, Dallas Card, Saadia Gabriel et al. · 2019 · 737 citations

We investigate how annotators’ insensitivity to differences in dialect can lead to racial bias in automatic hate speech detection models, potentially amplifying harm against minority populations. W...

6. Hate Speech Detection with Comment Embeddings

Nemanja Djuric, Jing Zhou, Robin K. Morris et al. · 2015 · 714 citations

We address the problem of hate speech detection in online user comments. Hate speech, defined as an "abusive speech targeting specific group characteristics, such as ethnicity, religion, or gender"...

7. SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)

Marcos Zampieri, Shervin Malmasi, Preslav Nakov et al. · 2019 · 673 citations

We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval). The task was based on a new dataset, the Offensi...

Reading Guide

Foundational Papers

Start with Warner and Hirschberg (2012, 513 citations) for a core definition of hate speech as targeting groups, then read Kwok and Wang (2013, 452 citations) for race-specific detection on Twitter.

Recent Advances

Study Basile et al. (2019, 839 citations) for the shared task on hate against immigrants and women; Sap et al. (2019, 737 citations) for bias risks; Zampieri et al. (2019, 673 citations) for hierarchical offensive language categories.

Core Methods

Core techniques: lexical n-gram features with linear classifiers (Davidson et al., 2017); comment embeddings (Djuric et al., 2015); multilingual shared-task benchmarks in English and Spanish (Basile et al., 2019); hierarchical classification (Zampieri et al., 2019).
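As one illustration of the lexical-feature approach, the sketch below computes TF-IDF weights over word unigrams, bigrams, and trigrams using only the standard library. It is a minimal sketch, not the feature pipeline of any cited paper: the `ngrams` and `tfidf` helpers are hypothetical names, tokenization is plain whitespace splitting, and the weighting (raw term frequency times log(N / df)) is one of several common variants.

```python
import math
from collections import Counter

def ngrams(tokens, n_max=3):
    """All word n-grams of length 1..n_max, joined with spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf(docs, n_max=3):
    """Return one {ngram: weight} dict per document.

    weight = raw term frequency * log(N / document frequency).
    An n-gram that occurs in every document therefore gets weight 0.
    """
    counts_per_doc = [Counter(ngrams(d.lower().split(), n_max)) for d in docs]
    # Iterating a Counter yields each distinct n-gram once per document.
    doc_freq = Counter(g for counts in counts_per_doc for g in counts)
    n_docs = len(docs)
    return [{g: tf * math.log(n_docs / doc_freq[g]) for g, tf in counts.items()}
            for counts in counts_per_doc]

docs = ["you are great", "you are awful"]
vectors = tfidf(docs)
print(vectors[0]["you"], vectors[0]["great"])
```

The resulting sparse vectors can be fed to any linear classifier. In practice a library vectorizer, such as scikit-learn's `TfidfVectorizer` with `ngram_range=(1, 3)`, would replace these helpers.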

How PapersFlow Helps You Research Targeted Hate Speech Classification

Discover & Search

Research Agent uses searchPapers and exaSearch to find SemEval datasets (Basile et al., 2019), then citationGraph reveals 839 citing works on immigrant-targeted hate, while findSimilarPapers uncovers related bias studies like Sap et al. (2019).

Analyze & Verify

Analysis Agent applies readPaperContent to extract taxonomies from Zampieri et al. (2019), verifies classifier F1-scores via verifyResponse (CoVe), and runs Python analysis with scikit-learn to reimplement Davidson et al. (2017) lexical features, graded by GRADE for evidence strength.
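When checking reported scores by hand, it helps to recompute the metric from predictions. OffensEval ranked systems by macro-averaged F1, which weights each class equally regardless of its frequency. The sketch below is a plain-Python version for illustration; the `macro_f1` helper and the four-item example are assumptions introduced here, not part of any PapersFlow tool.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical gold labels and predictions using OLID level A codes.
gold = ["OFF", "OFF", "NOT", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT"]
print(round(macro_f1(gold, pred), 4))  # 0.7333
```

Here the OFF class scores 2/3 and the NOT class scores 4/5, so the macro average is 11/15. Accuracy on the same data would be 0.75, which shows how the two metrics can diverge on imbalanced classes.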

Synthesize & Write

Synthesis Agent detects gaps in target taxonomies across papers, flags contradictions in bias claims, and uses latexEditText with latexSyncCitations to draft hierarchical classifier reviews; Writing Agent compiles via latexCompile and exportMermaid for taxonomy diagrams.

Use Cases

"Reproduce hate speech classifier from Davidson 2017 with modern embeddings"

Research Agent → searchPapers('Davidson 2017') → Analysis Agent → readPaperContent → runPythonAnalysis (BERT embeddings vs. lexical, pandas metrics) → researcher gets F1 comparison plot and code.

"Draft SemEval-2019 hate target survey in LaTeX"

Research Agent → citationGraph('Basile 2019') → Synthesis → gap detection → Writing Agent → latexEditText + latexSyncCitations + latexCompile → researcher gets formatted PDF with 50+ citations.

"Find GitHub code for OLID OffensEval classifiers"

Research Agent → searchPapers('Zampieri 2019') → Code Discovery (paperExtractUrls → paperFindGithubRepo → githubRepoInspect) → researcher gets repo code, models, and training scripts.

Automated Workflows

Deep Research workflow scans 50+ papers via searchPapers on 'targeted hate speech', structures reports with bias challenges from Sap et al. (2019). DeepScan applies 7-step CoVe to verify multilingual classifiers (Basile et al., 2019) with GRADE checkpoints. Theorizer generates hypotheses on dialect bias mitigation from Kwok and Wang (2013) to recent works.

Frequently Asked Questions

What defines targeted hate speech classification?

It categorizes hate speech by the group it targets, defined by attributes such as race or gender, using taxonomies and entity recognition, and it distinguishes hate speech from general offensive language (Davidson et al., 2017).

What are key methods in this subtopic?

Methods include comment embeddings (Djuric et al., 2015) and hierarchical offensive language classification from OffensEval (Zampieri et al., 2019). The SemEval-2019 multilingual task (Basile et al., 2019) provides a benchmark for comparing them.

What are the most cited papers?

Davidson et al. (2017, 2347 citations) on separating hate speech from offensive language; the Schmidt and Wiegand (2017, 1340 citations) survey; Sap et al. (2019, 737 citations) on racial bias.

What open problems remain?

Challenges include bias in dialect handling (Sap et al., 2019), subtle language detection (MacAvaney et al., 2019), and consistent taxonomies across languages.
