RECAST: Interactive Auditing of Automatic Toxicity Detection Models

Austin P. Wright

Omar Shaikh

Haekyu Park

Will Epperson

Muhammed Ahmed

Stephane Pinel

Diyi Yang

Duen Horng (Polo) Chau

The Eighth International Workshop of Chinese CHI (ChiCHI), 2020

Abstract

As toxic language becomes nearly pervasive online, there has been increasing interest in leveraging advances in natural language processing (NLP), such as very large transformer models, to automatically detect and remove toxic comments. Despite fairness concerns, a lack of adversarial robustness, and the limited prediction explainability of deep learning systems, there is currently little work on auditing these systems and helping both developers and users understand how they work. We present our ongoing work, RECAST, an interactive tool for examining toxicity detection models by visualizing explanations for predictions and providing alternative wordings for detected toxic speech.

Materials

Project

PDF

BibTeX

					
@article{wright2020recast,
  title={RECAST: Interactive Auditing of Automatic Toxicity Detection Models},
  author={Austin P. Wright and Omar Shaikh and Haekyu Park and Will Epperson and Muhammed Ahmed and Stephane Pinel and Diyi Yang and Duen Horng (Polo) Chau},
  year={2020},
  eprint={2001.01819},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}