ResearchHub | Open Science Community

A deep learning framework for neuroscience

Blake Richards et al.Oct 28, 2019

Systems neuroscience seeks explanations for how the brain implements a wide variety of perceptual, cognitive and motor tasks. Conversely, artificial intelligence attempts to design computational systems based on the tasks they will have to solve. In artificial neural networks, the three components specified by design are the objective functions, the learning rules and the architectures. With the growing success of deep learning, which utilizes brain-inspired architectures, these three designed components have increasingly become central to how we model, engineer and optimize complex artificial learning systems. Here we argue that a greater focus on these components would also benefit systems neuroscience. We give examples of how this optimization-based framework can drive theoretical and experimental progress in neuroscience. We contend that this principled perspective on systems neuroscience will help to generate more rapid progress. A deep network is best understood in terms of components used to design it—objective functions, architecture and learning rules—rather than unit-by-unit computation. Richards et al. argue that this inspires fruitful approaches to systems neuroscience.

Artificial Intelligence

Cognitive Neuroscience

6

Paper

Artificial Intelligence

692

0

Save

99

Learning from unexpected events in the neocortical microcircuit

Colleen Gillon et al.Jan 16, 2021

Abstract Scientists have long conjectured that the neocortex learns the structure of the environment in a predictive, hierarchical manner. According to this conjecture, expected, predictable features are differentiated from unexpected ones by comparing bottom-up and top-down streams of information. It is theorized that the neocortex then changes the representation of incoming stimuli, guided by differences in the responses to expected and unexpected events. In line with this conjecture, different responses to expected and unexpected sensory features have been observed in spiking and somatic calcium events. However, it remains unknown whether these unexpected event signals occur in the distal apical dendrites where many top-down signals are received, and whether these signals govern subsequent changes in the brain’s stimulus representations. Here, we show that both somata and distal apical dendrites of cortical pyramidal neurons exhibit distinct unexpected event signals that systematically change over days. These findings were obtained by tracking the responses of individual somata and dendritic branches of layer 2/3 and layer 5 pyramidal neurons over multiple days in primary visual cortex of awake, behaving mice using two-photon calcium imaging. Many neurons in both layers 2/3 and 5 showed large differences between their responses to expected and unexpected events. Interestingly, these responses evolved in opposite directions in the somata and distal apical dendrites. These differences between the somata and distal apical dendrites may be important for hierarchical computation, given that these two compartments tend to receive bottom-up and top-down information, respectively.

Cognitive Neuroscience

Organic Chemistry

99

Paper

Cognitive Neuroscience

29

0

Save

141

The functional specialization of visual cortex emerges from training parallel pathways with self-supervised predictive learning

Shahab Bakhtiari et al.Jun 18, 2021

Abstract The visual system of mammals is comprised of parallel, hierarchical specialized pathways. Different pathways are specialized in so far as they use representations that are more suitable for supporting specific downstream behaviours. In particular, the clearest example is the specialization of the ventral (“what”) and dorsal (“where”) pathways of the visual cortex. These two pathways support behaviours related to visual recognition and movement, respectively. To-date, deep neural networks have mostly been used as models of the ventral, recognition pathway. However, it is unknown whether both pathways can be modelled with a single deep ANN. Here, we ask whether a single model with a single loss function can capture the properties of both the ventral and the dorsal pathways. We explore this question using data from mice, who like other mammals, have specialized pathways that appear to support recognition and movement behaviours. We show that when we train a deep neural network architecture with two parallel pathways using a self-supervised predictive loss function, we can outperform other models in fitting mouse visual cortex. Moreover, we can model both the dorsal and ventral pathways. These results demonstrate that a self-supervised predictive learning approach applied to parallel pathway architectures can account for some of the functional specialization seen in mammalian visual systems.

Artificial Intelligence

Cognitive Neuroscience

141

Paper

Artificial Intelligence

25

0

Save

0

Deep learning for brains?: Different linear and nonlinear scaling in UK Biobank brain images vs. machine-learning datasets

Marc‐Andre Schulz et al.Sep 6, 2019

Abstract In recent years, deep learning has unlocked unprecedented success in various domains, especially in image, text, and speech processing. These breakthroughs may hold promise for neuroscience and especially for brain-imaging investigators who start to analyze thousands of participants. However, deep learning is only beneficial if the data have nonlinear relationships and if they are exploitable at currently available sample sizes. We systematically profiled the performance of deep models, kernel models, and linear models as a function of sample size on UK Biobank brain images against established machine learning references. On MNIST and Zalando Fashion, prediction accuracy consistently improved when escalating from linear models to shallow-nonlinear models, and further improved when switching to deep-nonlinear models. The more observations were available for model training, the greater the performance gain we saw. In contrast, using structural or functional brain scans, simple linear models performed on par with more complex, highly parameterized models in age/sex prediction across increasing sample sizes. In fact, linear models kept improving as the sample size approached ∼10,000 participants. Our results indicate that the increase in performance of linear models with additional data does not saturate at the limit of current feasibility. Yet, nonlinearities of common brain scans remain largely inaccessible to both kernel and deep learning methods at any examined scale.

Genetics

Artificial Intelligence

0

Paper

Save

Learning better with Dale’s Law: A Spectral Perspective

Pingsheng Li et al.Jun 30, 2023

Abstract Most recurrent neural networks (RNNs) do not include a fundamental constraint of real neural circuits: Dale’s Law, which implies that neurons must be excitatory (E) or inhibitory (I). Dale’s Law is generally absent from RNNs because simply partitioning a standard network’s units into E and I populations impairs learning. However, here we extend a recent feedforward bio-inspired EI network architecture, named Dale’s ANNs, to recurrent networks, and demonstrate that good performance is possible while respecting Dale’s Law. This begs the question: What makes some forms of EI network learn poorly and others learn well? And, why does the simple approach of incorporating Dale’s Law impair learning? Historically the answer was thought to be the sign constraints on EI network parameters, and this was a motivation behind Dale’s ANNs. However, here we show the spectral properties of the recurrent weight matrix at initialisation are more impactful on network performance than sign constraints. We find that simple EI partitioning results in a singular value distribution that is multimodal and dispersed, whereas standard RNNs have an unimodal, more clustered singular value distribution, as do recurrent Dale’s ANNs. We also show that the spectral properties and performance of partitioned EI networks are worse for small networks with fewer I units, and we present normalised SVD entropy as a measure of spectrum pathology that correlates with performance. Overall, this work sheds light on a long-standing mystery in neuroscience-inspired AI and computational neuroscience, paving the way for greater alignment between neural networks and biology.

Artificial Intelligence

Law

5

Paper

Artificial Intelligence

2

0

Save

5

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

David Raposo et al.Apr 2, 2024

Transformer-based language models spread FLOPs uniformly across input sequences. In this work we demonstrate that transformers can instead learn to dynamically allocate FLOPs (or compute) to specific positions in a sequence, optimising the allocation along the sequence for different layers across the model depth. Our method enforces a total compute budget by capping the number of tokens ($k$) that can participate in the self-attention and MLP computations at a given layer. The tokens to be processed are determined by the network using a top-$k$ routing mechanism. Since $k$ is defined a priori, this simple procedure uses a static computation graph with known tensor sizes, unlike other conditional computation techniques. Nevertheless, since the identities of the $k$ tokens are fluid, this method can expend FLOPs non-uniformly across the time and model depth dimensions. Thus, compute expenditure is entirely predictable in sum total, but dynamic and context-sensitive at the token-level. Not only do models trained in this way learn to dynamically allocate compute, they do so efficiently. These models match baseline performance for equivalent FLOPS and wall-clock times to train, but require a fraction of the FLOPs per forward pass, and can be upwards of 50\% faster to step during post-training sampling.

Artificial Intelligence

Machine Learning

5

Paper

Artificial Intelligence

5.0

2

Save

0

The feature landscape of visual cortex

Rudi Tong et al.Jan 1, 2023

Understanding computations in the visual system requires a characterization of the distinct feature preferences of neurons in different visual cortical areas. However, we know little about how feature preferences of neurons within a given area relate to that area9s role within the global organization of visual cortex. To address this, we recorded from thousands of neurons across six visual cortical areas in mouse and leveraged generative AI methods combined with closed-loop neuronal recordings to identify each neuron9s visual feature preference. First, we discovered that the mouse9s visual system is globally organized to encode features in a manner invariant to the types of image transformations induced by self-motion. Second, we found differences in the visual feature preferences of each area and that these differences generalized across animals. Finally, we observed that a given area9s collection of preferred stimuli (9own-stimuli9) drive neurons from the same area more effectively through their dynamic range compared to preferred stimuli from other areas (9other-stimuli9). As a result, feature preferences of neurons within an area are organized to maximally encode differences among own-stimuli while remaining insensitive to differences among other-stimuli. These results reveal how visual areas work together to efficiently encode information about the external world.

Philosophy

Artificial Intelligence

0

Paper

Philosophy

Artificial Intelligence

0

Save

0

Relevance learning via inhibitory plasticity and its implications for schizophrenia

Nathan Insel et al.Jul 10, 2017

Symptoms of schizophrenia may arise from a failure of cortical circuits to filter-out irrelevant inputs. Schizophrenia has also been linked to disruptions to cortical inhibitory interneurons, consistent with the possibility that in the normally functioning brain, these cells are in some part responsible for determining which inputs are relevant and which irrelevant. Here, we develop an abstract but biologically plausible neural network model that demonstrates how the cortex may learn to ignore irrelevant inputs through plasticity processes affecting inhibition. The model is based on the proposal that the amount of excitatory output from a cortical circuit encodes expected magnitude of reward or punishment ("relevance"), which can be trained using a temporal difference learning mechanism acting on feed-forward inputs to inhibitory interneurons. The model exhibits learned irrelevance and blocking, which become impaired following disruptions to inhibitory units. When excitatory units are connected to a competitive-learning output layer, the relevance code is capable of modulating learning and activity. Accordingly, the combined network is capable of recapitulating published experimental data linking inhibition in frontal cortex with fear learning and expression. Finally, the model demonstrates how relevance learning can take place in parallel with other types of learning, through plasticity rules involving inhibitory and excitatory components respectively. Altogether, this work offers a theory of how the cortex learns to selectively inhibit inputs, providing insight into how relevance-assignment problems may emerge in schizophrenia.

Biochemistry

Law

0

Paper

Save

Distinct roles of parvalbumin and somatostatin interneurons in the synchronization of spike-times in the neocortex

Hyun Jang et al.Jun 15, 2019

Synchronization of precise spike-times across multiple neurons carries information about sensory stimuli. Inhibitory interneurons are suggested to promote this synchronization, but it is unclear whether distinct interneuron subtypes provide different contributions. To test this, we examined single-unit recordings from barrel cortex in vivo and used optogenetics to determine the contribution of two classes of inhibitory interneurons: parvalbumin (PV)- and somatostatin (SST)-positive interneurons to spike-timing synchronization across cortical layers. We found that PV interneurons preferentially promote the synchronization of spike-times when instantaneous firing-rates are low (<12 Hz), whereas SST interneurons preferentially promote the synchronization of spike-times when instantaneous firing-rates are high (>12 Hz). Furthermore, using a computational model, we demonstrate that these effects can be explained by PV and SST interneurons having preferential contribution to feedforward and feedback inhibition, respectively. Our findings demonstrate that distinct subtypes of inhibitory interneurons have frequency-selective roles in spatio-temporal synchronization of precise spike-times.

Cognitive Neuroscience

Cellular And Molecular Neuroscience

0

Paper

Cognitive Neuroscience

Cellular And Molecular Neuroscience

0

Save