Nature Human Behaviour

Multimodal large language models can make context-sensitive hate speech evaluations aligned with human judgement.

Thomas Davidson

Published: 2025

DOI: 10.1038/s41562-025-02360-w

Abstract

Multimodal large language models (MLLMs) could enhance the accuracy of automated content moderation by integrating contextual information. This study examines how MLLMs evaluate hate speech through a series of conjoint experiments. Models are provide…
