Nature Human Behaviour
Multimodal large language models can make context-sensitive hate speech evaluations aligned with human judgement.
Thomas Davidson
Published: 2025. DOI: 10.1038/s41562-025-02360-w
Abstract
Multimodal large language models (MLLMs) could enhance the accuracy of automated content moderation by integrating contextual information. This study examines how MLLMs evaluate hate speech through a series of conjoint experiments. Models are provide…