Human Supervision as an Information Bottleneck: A Unified Theory of Error Floors in Human-Guided Learning

Description

Large language models are trained primarily on human-generated data and feedback, yet they exhibit persistent errors arising from annotation noise, subjective preferences, and the limited expressive bandwidth of natural language. We argue that these limitations reflect structural properties of the supervision channel rather than model scale or optimization. We develop a unified theory showing that whenever the human supervision channel is not sufficient for a latent evaluation target, it acts as

Source

http://arxiv.org/abs/2602.23446v1