Abstract
Neural responses to visual stimuli exhibit complex temporal dynamics, including sub-additive temporal summation, response reduction with repeated or sustained stimuli (adaptation), and slower dynamics at low contrast. These phenomena are often studied independently. Here, we demonstrate these phenomena within the same experiment and model the underlying neural computations with a single computational model. We extracted time-varying responses from electrocorticographic (ECoG) recordings from patients presented with stimuli that varied in contrast, duration, and inter-stimulus interval (ISI). Aggregating data across patients yielded 98 electrodes with robust visual responses, covering both earlier (V1-V3) and higher-order (V3a/b, LO, TO, IPS) retinotopic maps. In all regions, the temporal dynamics of neural responses exhibit several non-linear features: peak response amplitude saturates with high contrast and longer stimulus durations; the response to a second stimulus is suppressed for short ISIs and recovers for longer ISIs; and response latency decreases with increasing contrast. These features are accurately captured by a computational model composed of a small set of canonical neuronal operations: linear filtering, rectification, exponentiation, and a delayed divisive normalization. We find that an increased normalization term captures both contrast- and adaptation-related response reductions, suggesting potentially shared underlying mechanisms. We additionally demonstrate both changes and invariance in temporal response dynamics between earlier and higher-order visual areas. Together, our results reveal the presence of a wide range of temporal and contrast-dependent neuronal dynamics in the human visual cortex, and demonstrate that a simple model captures these dynamics at millisecond resolution.

Significance Statement
Sensory inputs and neural responses change continuously over time.
It is especially challenging to understand a system in which both the inputs and the outputs are dynamic. Here we use a computational modeling approach that specifies the computations that convert a time-varying input stimulus into a neural response time course, and we use this model to predict neural activity measured in the human visual cortex. We show that this computational model predicts a wide variety of complex neural response shapes that we induced experimentally by manipulating the duration, repetition, and contrast of visual stimuli. By comparing data and model predictions, we uncover systematic properties of the temporal dynamics of neural signals, allowing us to better understand how the brain processes dynamic sensory information.
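
To make the chain of operations named in the abstract concrete (linear filtering, rectification, exponentiation, and a delayed divisive normalization), the following is a minimal Python sketch of one way such a model could be written. It is an illustration under stated assumptions, not the fitted model from this study: the function names (`lowpass`, `dn_response`, `pulse`), the choice of cascaded first-order filters, and all parameter values (`tau1`, `tau2`, `n`, `sigma`) are hypothetical and chosen only so that the qualitative behavior is visible.

```python
def lowpass(signal, tau, dt):
    """First-order exponential low-pass filter with unit steady-state gain."""
    alpha = dt / tau
    out, state = [], 0.0
    for x in signal:
        state += alpha * (x - state)
        out.append(state)
    return out


def dn_response(stimulus, dt=0.001, tau1=0.05, tau2=0.1, n=2.0, sigma=0.1):
    """Illustrative delayed-normalization response to a contrast time course.

    All parameter values are assumptions for illustration, not fitted values.
    """
    # Linear filtering: two cascaded low-pass stages act as a smooth,
    # gamma-like impulse response (an assumed filter shape).
    linear = lowpass(lowpass(stimulus, tau1, dt), tau1, dt)
    # Rectification.
    drive = [abs(x) for x in linear]
    # Delayed normalization pool: a slower low-pass copy of the rectified drive.
    pool = lowpass(drive, tau2, dt)
    # Exponentiation and divisive normalization.
    return [d ** n / (sigma ** n + p ** n) for d, p in zip(drive, pool)]


def pulse(onset, duration, contrast=1.0, total=2.5, dt=0.001):
    """Rectangular contrast pulse sampled every dt seconds (times in seconds)."""
    n_samples = round(total / dt)
    return [contrast if onset <= i * dt < onset + duration else 0.0
            for i in range(n_samples)]


# Usage: compare time-to-peak (in samples, i.e. ms at dt=0.001) across contrasts.
high = dn_response(pulse(0.1, 0.5, contrast=1.0))
low = dn_response(pulse(0.1, 0.5, contrast=0.05))
print("time-to-peak, high contrast:", high.index(max(high)))
print("time-to-peak, low contrast:", low.index(max(low)))
```

In this sketch the normalization pool lags behind the driving signal, so a sustained stimulus produces an initial transient followed by a lower sustained level, and a second pulse arriving while the pool is still elevated is suppressed. At low contrast the pool stays small relative to `sigma`, so the response builds more slowly, which is one simple way the contrast-dependent latency described above could arise.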