Abstract

Real-time fMRI neurofeedback is an increasingly popular neuroimaging technique that allows individuals to gain control over their own brain signals, which can lead to behavioral improvements in healthy participants and to improvements of clinical symptoms in patient populations. However, a considerable proportion of participants undergoing neurofeedback training do not learn to control their own brain signals and, consequently, do not benefit from neurofeedback interventions, which limits the clinical efficacy of these interventions. As neurofeedback success varies between studies and participants, it is important to identify factors that might influence it. Here, for the first time, we employed a big data machine learning approach to investigate the influence of 20 different design-specific (e.g. activity vs. connectivity feedback), region of interest-specific (e.g. cortical vs. subcortical), and subject-specific (e.g. age) factors on neurofeedback performance and improvement in 608 participants from 28 independent experiments. With a classification accuracy of 60% (considerably different from chance level), we identified two factors that significantly influenced neurofeedback performance: both the inclusion of a pre-training no-feedback run before neurofeedback training and neurofeedback training of patients as compared to healthy participants were associated with better neurofeedback performance. The positive effect of pre-training no-feedback runs might be due to the familiarization of participants with the neurofeedback setup and the mental imagery task before the neurofeedback training runs. The better performance of patients as compared to healthy participants might be driven by higher motivation of patients, larger ranges for the regulation of dysfunctional brain signals, or more extensive piloting of clinical experimental paradigms.
Due to the large heterogeneity of our dataset, these findings likely generalize across neurofeedback studies, thus providing guidance for designing more efficient neurofeedback studies, specifically for improving clinical neurofeedback-based interventions. To facilitate the development of data-driven recommendations for specific design details and subpopulations, the field would benefit from stronger engagement in Open Science and data sharing.