Abstract

BACKGROUND
Radiologically presumed diffuse lower-grade gliomas (dLGG) are typically non-enhancing or minimally enhancing tumors that appear hyperintense on T2-weighted images. The aim of this study was to test the clinical usefulness of deep learning (DL) for predicting IDH mutation in patients with radiologically presumed dLGG.

METHODS
A total of 314 patients were retrospectively recruited from six neurosurgical departments in Sweden, Norway, France, Austria, and the United States. Collected data included patients’ age, sex, tumor molecular characteristics (IDH and 1p/19q), and routine preoperative radiological images. A clinical model was built using multivariable logistic regression with the variables age and tumor location. DL models were built using MRI data only, with four DL architectures used in glioma research. In the final validation test, the clinical model and the best DL model were scored on an external validation cohort of 155 patients from the Erasmus Glioma Dataset.

RESULTS
The mean age in the recruited and external cohorts was 45.0 years (SD 14.3) and 44.3 years (SD 14.6), respectively. The cohorts were broadly similar, except for sex distribution (53.5% vs 64.5% males, p = 0.03) and IDH status (30.9% vs 12.9% IDH wild-type, p < 0.01). Overall, the area under the curve for the prediction of IDH mutation in the external validation cohort was 0.86, 0.82, and 0.87 for the clinical model, the DL model, and the model combining both models’ probabilities, respectively.

CONCLUSIONS
In their current state, these complex DL models, when applied to our clinical scenario, did not seem to provide a net gain over our baseline clinical model.
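
ILLUSTRATIVE SKETCH
The comparison reported in the results (area under the curve for the clinical model, the DL model, and a combination of their probabilities) can be sketched as follows. This is a minimal illustration, not the study's code. The abstract does not state how the two models' probabilities were combined, so the simple average used here is an assumption, and the function names `roc_auc` and `combine_probabilities` as well as all labels and probabilities are hypothetical toy values.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney statistic.

    Equals the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def combine_probabilities(p_a: Sequence[float], p_b: Sequence[float]) -> List[float]:
    """Assumed combination rule: unweighted mean of the two probabilities."""
    return [(a + b) / 2.0 for a, b in zip(p_a, p_b)]


# Hypothetical toy data: 1 = IDH-mutant, 0 = IDH wild-type.
y_true = [1, 1, 1, 0, 0, 1]
p_clinical = [0.9, 0.7, 0.4, 0.5, 0.2, 0.8]  # e.g. output of a logistic regression
p_dl = [0.8, 0.3, 0.6, 0.4, 0.5, 0.9]        # e.g. output of a DL classifier
p_combined = combine_probabilities(p_clinical, p_dl)

auc_clinical = roc_auc(y_true, p_clinical)
auc_dl = roc_auc(y_true, p_dl)
auc_combined = roc_auc(y_true, p_combined)

print(f"clinical: {auc_clinical:.3f}  DL: {auc_dl:.3f}  combined: {auc_combined:.3f}")
```

The pairwise formulation is quadratic in cohort size, which is acceptable for a few hundred patients; a library routine such as scikit-learn's `roc_auc_score` would normally be used in practice.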