ResearchHub | Open Science Community

Zeb Kurth‐Nelson

Author with expertise in Neuronal Oscillations in Cortical Networks

Achievements

Cited Author

Open Access Advocate

Key Stats

Upvotes received:

Publications:

(75% Open Access)

Cited by: 2,124

h-index:

i10-index:

Reputation

Biology: < 1%

Chemistry: < 1%

Economics: < 1%

Publications

Prefrontal cortex as a meta-reinforcement learning system

Jane Wang et al. May 11, 2018

Over the past 20 years, neuroscience research on reward-based learning has converged on a canonical model, under which the neurotransmitter dopamine 'stamps in' associations between situations, actions and rewards by modulating the strength of synaptic connections between neurons. However, a growing number of recent findings have placed this standard model under strain. We now draw on recent advances in artificial intelligence to introduce a new theory of reward-based learning. Here, the dopamine system trains another part of the brain, the prefrontal cortex, to operate as its own free-standing learning system. This new perspective accommodates the findings that motivated the standard model, but also deals gracefully with a wider range of observations, providing a fresh foundation for future research.

Artificial Intelligence

Molecular Biology

Paper

487

Human Replay Spontaneously Reorganizes Experience

Yunzhe Liu et al. Jul 1, 2019

Knowledge abstracted from previous experiences can be transferred to aid new learning. Here, we asked whether such abstract knowledge immediately guides the replay of new experiences. We first trained participants on a rule defining an ordering of objects and then presented a novel set of objects in a scrambled order. Across two studies, we observed that representations of these novel objects were reactivated during a subsequent rest. As in rodents, human "replay" events occurred in sequences accelerated in time, compared to actual experience, and reversed their direction after a reward. Notably, replay did not simply recapitulate visual experience, but followed instead a sequence implied by learned abstract knowledge. Furthermore, each replay contained more than sensory representations of the relevant objects. A sensory code of object representations was preceded 50 ms by a code factorized into sequence position and sequence identity. We argue that this factorized representation facilitates the generalization of a previously learned structure to new objects.

Cell Biology

Cognitive Neuroscience

Paper

410

Reconciling reinforcement learning models with behavioral extinction and renewal: Implications for addiction, relapse, and problem gambling.

A. Redish et al. Jul 1, 2007

Because learned associations are quickly renewed following extinction, the extinction process must include processes other than unlearning. However, reinforcement learning models, such as the temporal difference reinforcement learning (TDRL) model, treat extinction as an unlearning of associated value and are thus unable to capture renewal. TDRL models are based on the hypothesis that dopamine carries a reward prediction error signal; these models predict reward by driving that reward error to zero. The authors construct a TDRL model that can accommodate extinction and renewal through two simple processes: (a) a TDRL process that learns the value of situation-action pairs and (b) a situation recognition process that categorizes the observed cues into situations. This model has implications for dysfunctional states, including relapse after addiction and problem gambling.
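As a rough illustration of the unlearning account described above, the sketch below applies a one-step prediction-error update to a single cue. It is a simplification assumed here for illustration (one state, no temporal discounting, hypothetical names `td_learning` and `alpha`), not the authors' model, and it omits their situation-recognition process.

```python
def td_learning(rewards_by_trial, alpha=0.1):
    """Track one cue's value with a delta rule.

    Returns the value after each trial and the prediction error on each trial.
    """
    value = 0.0
    values, errors = [], []
    for reward in rewards_by_trial:
        delta = reward - value   # reward prediction error
        value += alpha * delta   # learning moves the value toward the reward
        values.append(value)
        errors.append(delta)
    return values, errors


# Hypothetical schedule: 100 rewarded trials, then 100 unrewarded trials.
acquisition = [1.0] * 100
extinction = [0.0] * 100
values, errors = td_learning(acquisition + extinction)

print("value after acquisition:", round(values[99], 4))
print("value after extinction:", round(values[199], 4))
```

Because the learned value decays back toward zero during extinction, nothing remains to support rapid renewal, which is the limitation the abstract points to.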

Artificial Intelligence

Paleontology

Paper

357

A distributional code for value in dopamine-based reinforcement learning

Will Dabney et al. Jan 15, 2020

Since its introduction, the reward prediction error theory of dopamine has explained a wealth of empirical phenomena, providing a unifying framework for understanding the representation of reward and value in the brain [1–3]. According to the now canonical theory, reward predictions are represented as a single scalar quantity, which supports learning about the expectation, or mean, of stochastic outcomes. Here we propose an account of dopamine-based reinforcement learning inspired by recent artificial intelligence research on distributional reinforcement learning [4–6]. We hypothesized that the brain represents possible future rewards not as a single mean, but instead as a probability distribution, effectively representing multiple future outcomes simultaneously and in parallel. This idea implies a set of empirical predictions, which we tested using single-unit recordings from mouse ventral tegmental area. Our findings provide strong evidence for a neural realization of distributional reinforcement learning. Analyses of single-cell recordings from mouse ventral tegmental area are consistent with a model of reinforcement learning in which the brain represents possible future rewards not as a single mean of stochastic outcomes, as in the canonical model, but instead as a probability distribution.
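One way to picture a distributional code is a population of value predictors that weight positive and negative prediction errors differently, so that each settles on a different point of the reward distribution. The sketch below assumes that asymmetric-scaling scheme for illustration; the abstract itself does not specify the mechanism, and the names `learn_asymmetric_values`, `taus`, and `alpha` are hypothetical.

```python
import random


def learn_asymmetric_values(rewards, taus, alpha=0.02):
    """Update one value estimate per asymmetry level tau in (0, 1).

    Positive prediction errors are scaled by tau and negative ones by
    (1 - tau), so a high tau gives an optimistic predictor and a low tau
    a pessimistic one.
    """
    values = [0.0 for _ in taus]
    for reward in rewards:
        for i, tau in enumerate(taus):
            delta = reward - values[i]                 # prediction error
            scale = tau if delta > 0 else (1.0 - tau)  # asymmetric weighting
            values[i] += alpha * scale * delta
    return values


# Hypothetical reward distribution: 0, 1, or 10 with equal probability.
rng = random.Random(0)
rewards = [rng.choice([0.0, 1.0, 10.0]) for _ in range(20000)]
taus = [0.1, 0.5, 0.9]
values = learn_asymmetric_values(rewards, taus)

for tau, value in zip(taus, values):
    print(f"tau={tau}: learned value {value:.2f}")
```

A single symmetric predictor (tau = 0.5) tracks only the mean, whereas the spread across predictors carries information about the shape of the reward distribution.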

Artificial Intelligence

Law

Paper

345

Harm to others outweighs harm to self in moral decision making

Molly Crockett et al. Nov 17, 2014

Significance: Concern for the welfare of others is a key component of moral decision making and is disturbed in antisocial and criminal behavior. However, little is known about how people evaluate the costs of others’ suffering. Past studies have examined people’s judgments in hypothetical scenarios, but there is evidence that hypothetical judgments cannot accurately predict actual behavior. Here we addressed this issue by measuring how much money people will sacrifice to reduce the number of painful electric shocks delivered to either themselves or an anonymous stranger. Surprisingly, most people sacrifice more money to reduce a stranger’s pain than their own pain. This finding may help us better understand how people resolve moral dilemmas that commonly arise in medical, legal, and political decision making.

Paper

Fast Sequences of Non-spatial State Representations in Humans

Zeb Kurth‐Nelson et al. Jun 27, 2016

Fast internally generated sequences of neural representations are suggested to support learning and online planning. However, these sequences have only been studied in the context of spatial tasks and never in humans. Here, we recorded magnetoencephalography (MEG) while human subjects performed a novel non-spatial reasoning task. The task required selecting paths through a set of six visual objects. We trained pattern classifiers on the MEG activity elicited by direct presentation of the visual objects alone and tested these classifiers on activity recorded during periods when no object was presented. During these object-free periods, the brain spontaneously visited representations of approximately four objects in fast sequences lasting on the order of 120 ms. These sequences followed backward trajectories along the permissible paths in the task. Thus, spontaneous fast sequential representation of states can be measured non-invasively in humans, and these sequences may be a fundamental feature of neural computation across tasks.

Philosophy

Artificial Intelligence

Paper

201

Generative replay for compositional visual understanding in the prefrontal-hippocampal circuit

Philipp Schwartenbeck et al. Jun 6, 2021

Understanding the visual world is a constructive process. Whilst a frontal-hippocampal circuit is known to be essential for this task, little is known about the associated neuronal computations. Visual understanding appears superficially distinct from other known functions of this circuit, such as spatial reasoning and model-based planning, but recent models suggest deeper computational similarities. Here, using fMRI, we show that representations of a simple visual scene in these brain regions are relational and compositional – key computational properties theorised to support rapid construction of hippocampal maps. Using MEG, we show that rapid sequences of representations, akin to replay in spatial navigation and planning problems, are also engaged in visual construction. Whilst these sequences have previously been proposed as mechanisms to plan possible futures or learn from the past, here they are used to understand the present. Replay sequences form constructive hypotheses about possible scene configurations. These hypotheses play out in an optimal order for relational inference, progressing from predictable to uncertain scene elements, gradually constraining possible configurations, and converging on the correct scene configuration. Together, these results suggest a computational bridge between apparently distinct functions of hippocampal-prefrontal circuitry, and a role for generative replay in constructive inference and hypothesis testing.

Artificial Intelligence

Psychology

Paper

Replay bursts coincide with activation of the default mode and parietal alpha network

Cameron Higgins et al. Jun 24, 2020

Our brains at rest spontaneously replay recently acquired information, but how this process is orchestrated to avoid interference with ongoing cognition is an open question. We investigated whether replay coincided with spontaneous patterns of whole brain activity. We found, in two separate datasets, that replay sequences were packaged into transient bursts occurring selectively during activation of the default mode network (DMN) and parietal alpha network. These networks were characterized by widespread synchronized oscillations coupled to increases in ripple band power, mechanisms that coordinate information flow between disparate cortical areas. Our data show a tight correspondence between two widely studied phenomena of neural physiology and suggest the DMN may coordinate replay bursts in a manner that minimizes interference with ongoing cognition.

Philosophy

Clinical Psychology

Paper

Measuring Sequences of Representations with Temporally Delayed Linear Modelling

Yunzhe Liu et al. May 2, 2020

There are rich structures in off-task neural activity. For example, task-related neural codes are thought to be reactivated in a systematic way during rest. This reactivation is hypothesised to reflect a fundamental computation that supports a variety of cognitive functions. Here, we introduce an analysis toolkit (TDLM) for analysing this activity. TDLM combines nonlinear classification and linear temporal modelling to test for statistical regularities in sequences of neural representations. It is developed using non-invasive neuroimaging data and is designed to take care of confounds and maximize sequence detection ability. The method can be extended to rodent electrophysiological recordings. We outline how TDLM can successfully reveal human replay during rest, based upon non-invasive magnetoencephalography (MEG) measurements, with strong parallels to rodent hippocampal replay. TDLM can therefore advance our understanding of sequential computation and promote a richer convergence between animal and human neuroscience research.

Artificial Intelligence

Cognitive Neuroscience

Paper

Distributional reinforcement learning in prefrontal cortex

Timothy Muller et al. Jun 15, 2021

Prefrontal cortex is crucial for learning and decision-making. Classic reinforcement learning (RL) theories centre on learning the expectation of potential rewarding outcomes and explain a wealth of neural data in prefrontal cortex. Distributional RL, on the other hand, learns the full distribution of rewarding outcomes and better explains dopamine responses. Here we show distributional RL also better explains prefrontal cortical responses, suggesting it is a ubiquitous mechanism for reward-guided learning.

Artificial Intelligence

Molecular Biology

Paper
