Abstract Perceptual decision-making depends strongly on the momentary arousal state of the brain, which fluctuates on timescales of hours, minutes, and even seconds. The textbook relationship between momentary arousal and task performance is an inverted U-shape, as put forward in the Yerkes-Dodson law (Yerkes and Dodson, 1908). This law predicts optimal performance at moderate levels of arousal and impaired performance at low or high levels. Despite its popularity, however, the evidence for this relationship in humans is mixed at best. Here, we use pupil-indexed arousal and performance data from several perceptual decision-making tasks to provide converging evidence for the inverted U-shaped relationship between arousal and performance across task types (discrimination, detection) and modalities (visual, auditory). To further understand this relationship, we built a neurobiologically plausible mechanistic model and show that our findings can be reproduced by incorporating two types of interneurons that are both modulated by an arousal signal. Under the influence of arousal, the model architecture produces two dynamical regimes: one in which performance increases with arousal, and another in which performance decreases with arousal. Together, these regimes form an inverted U-shaped arousal-performance relationship. We conclude that the inverted U-shaped arousal-performance relationship is a general and robust property of sensory processing. It might be brought about by the influence of arousal on two types of interneurons that together act as a disinhibitory pathway for the neural populations encoding the sensory evidence used for the decision.

Significance statement When people repeatedly perform the same task, their performance varies over time. This behavioral variability is partly caused by spontaneous fluctuations in arousal.
According to the Yerkes-Dodson law, task performance is optimal at moderate levels of arousal and impaired when arousal is too low or too high. Until now, however, the evidence supporting this law as a general mechanism in human decision-making has been mixed, and a neural mechanism that could explain the inverted U-shaped arousal-performance relationship has been lacking. We show that the Yerkes-Dodson law is a general law that holds for human observers across decision-making tasks and settings. Furthermore, we present a simple and neurobiologically plausible mechanistic model that can explain its existence.
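One common way to probe an inverted U-shaped relationship is to fit a second-order polynomial to performance as a function of binned arousal and check that the quadratic coefficient is negative. The sketch below illustrates that idea only; it is not necessarily the analysis used in this work. The function name `fit_quadratic`, the variable names, and the data values are all hypothetical, and the data are synthetic numbers constructed to follow an exact parabola.

```python
# Hypothetical illustration (not the authors' analysis): fit
#   performance = b0 + b1 * arousal + b2 * arousal**2
# and read off the sign of b2 and the location of the peak.

def fit_quadratic(x, y):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    # Power sums S[k] = sum(x**k) and moments T[k] = sum(y * x**k).
    S = [sum(xi ** k for xi in x) for k in range(5)]
    T = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    # Augmented 3x4 system [A | T], where A[i][j] = S[i + j].
    A = [[S[i + j] for j in range(3)] + [T[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, 3):
            factor = A[r][col] / A[col][col]
            for c in range(col, 4):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    b = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(A[i][j] * b[j] for j in range(i + 1, 3))
        b[i] = (A[i][3] - tail) / A[i][i]
    return b

# Synthetic example: ten arousal bins scaled to [0, 1] (e.g. pupil-size bins),
# with made-up performance values that follow an exact inverted U.
arousal = [i / 9 for i in range(10)]
performance = [1.0 + 2.0 * a - 2.0 * a * a for a in arousal]

b0, b1, b2 = fit_quadratic(arousal, performance)
is_inverted_u = b2 < 0            # negative curvature indicates an inverted U
peak_arousal = -b1 / (2.0 * b2)   # arousal level at which the fitted curve peaks
```

With real data, a negative `b2` alone is weak evidence: one would also want the peak to fall inside the observed arousal range and the quadratic term to be tested statistically across participants.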