New RL4HS Framework Outperforms Existing Models in Detecting Hallucinated Spans in AI-Generated Text
Summary
RL4HS is a new reinforcement learning framework for detecting hallucinated spans in large language model outputs. By combining Group Relative Policy Optimization with a novel Class-Aware Policy Optimization technique, it outperforms existing models across summarization, question answering, and data-to-text tasks.
Key Points
- Researchers introduce RL4HS, a reinforcement learning framework designed to detect hallucinated spans in large language model outputs, going beyond simple binary hallucination detection.
- RL4HS leverages Group Relative Policy Optimization and a new Class-Aware Policy Optimization technique to address reward imbalance, incentivizing step-by-step reasoning at the span level.
- Testing on the RAGTruth benchmark across summarization, question answering, and data-to-text tasks confirms that RL4HS outperforms both pretrained reasoning models and supervised fine-tuning approaches.
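To make the reward-imbalance idea concrete, here is a minimal sketch of how class-aware scaling could sit on top of GRPO-style group-relative advantages. All function names and the weight values are hypothetical illustrations, not the authors' actual implementation: the idea is that rollouts predicting the rarer positive class (a hallucinated span) get their advantages up-weighted so the policy is not dominated by the majority "no hallucination" signal.

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantage: normalize each rollout's reward against
    the mean and std of its group of sampled responses."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # avoid division by zero
    return [(r - mean) / std for r in rewards]

def class_aware_advantages(rewards, classes, class_weights):
    """Hypothetical class-aware scaling: up-weight advantages for the
    rarer positive (hallucinated-span) class to counter reward imbalance."""
    base = group_relative_advantages(rewards)
    return [a * class_weights[c] for a, c in zip(base, classes)]

# Four rollouts for one prompt; "pos" = rollout predicted a hallucinated span.
rewards = [1.0, 0.0, 1.0, 0.0]
classes = ["pos", "neg", "pos", "neg"]
weights = {"pos": 2.0, "neg": 1.0}  # illustrative values only
advs = class_aware_advantages(rewards, classes, weights)
```

In this toy group the positive rollouts end up with twice the advantage magnitude of the negative ones, which is the kind of rebalancing the paper's Class-Aware Policy Optimization is described as providing.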