Abstract

Background
The hippocampal-to-ventricle ratio (HVR) is a biomarker of medial temporal atrophy that is particularly useful for assessing neurodegeneration in diseases such as Alzheimer’s disease (AD). To minimize subjectivity and inter-rater variability, an automated, accurate, precise, and reliable segmentation technique is essential for the hippocampus (HC) and the surrounding spaces filled with cerebrospinal fluid (CSF), such as the temporal horns of the lateral ventricles.

Methods
We trained and evaluated three automated methods for segmenting both HC and CSF: Multi-Atlas Label Fusion (MALF), Nonlinear Patch-Based Segmentation (NLPB), and a Convolutional Neural Network (CNN). We then evaluated these methods, together with the widely used FreeSurfer technique, on baseline T1w MRIs of 1,641 participants from the AD Neuroimaging Initiative study. These participants showed varying degrees of atrophy associated with their cognitive status, on a spectrum from cognitively healthy to clinically probable AD. Our gold standard consisted of manual segmentations of HC and CSF from 80 cognitively healthy individuals. We calculated HC volumes and HVR and compared all methods in terms of segmentation reliability, similarity across methods, sensitivity in detecting between-group differences, and associations with age, with scores on the learning subtest of the Rey Auditory Verbal Learning Test (RAVLT), and with scores on the Alzheimer’s Disease Assessment Scale 13 (ADAS13).

Results
Cross-validation demonstrated that the CNN method yielded more accurate HC and CSF segmentations than MALF and NLPB, with higher volumetric overlap (Dice Kappa = 0.94) and correlation (rho = 0.99) with the manual labels. It was also the most reliable method when applied to clinical data, with minimal failures. Our comparisons yielded high correlations between FreeSurfer, CNN, and NLPB volumetric values.
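For readers unfamiliar with the overlap metric, the Dice coefficient between an automated and a manual label is twice the number of shared voxels divided by the total number of labeled voxels in both masks. The following is a minimal sketch only, not the study's evaluation code: the function name `dice_overlap`, the flat 0/1 mask representation, and the toy voxel values are all assumptions made for illustration.

```python
def dice_overlap(auto_mask, manual_mask):
    """Dice coefficient between two binary masks.

    Both masks are flat sequences of 0/1 voxel labels of equal length
    (a hypothetical representation chosen for this sketch).
    """
    if len(auto_mask) != len(manual_mask):
        raise ValueError("masks must have the same number of voxels")

    auto_count = sum(1 for v in auto_mask if v)      # voxels labeled by the method
    manual_count = sum(1 for v in manual_mask if v)  # voxels labeled by the rater
    shared = sum(1 for a, m in zip(auto_mask, manual_mask) if a and m)

    if auto_count + manual_count == 0:
        return 1.0  # two empty masks agree trivially

    return 2.0 * shared / (auto_count + manual_count)


# Toy masks (made-up values, not study data): 2 shared voxels, 3 labeled in each.
manual = [0, 1, 1, 1, 0, 0]
auto = [0, 1, 1, 0, 1, 0]
score = dice_overlap(auto, manual)  # 2 * 2 / (3 + 3)
```

A value of 1 means the two labels coincide exactly and 0 means they share no voxels, so values near the reported 0.94 indicate close agreement with the manual segmentation.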
HVR yielded higher control:AD effect sizes than HC volumes across all segmentation methods, reinforcing the value of HVR for clinical distinction. The positive association with age was significantly stronger for HVR than for HC volumes with all methods except FreeSurfer. Memory associations with HC volumes or HVR were significant only for individuals with mild cognitive impairment. Finally, HC volumes and HVR showed comparable negative associations with ADAS13, particularly in the mild cognitive impairment cohort.

Conclusion
This study provides an evaluation of automated segmentation methods for estimating HVR and highlights the superior performance of a CNN-based algorithm. The findings underscore the importance of accurate segmentation in HVR calculations for precise clinical applications, and they contribute insights into medial temporal lobe atrophy in neurodegenerative disorders, especially AD.

Authorship
Sofia Fernandez-Lozano: Conceptualization, Methodology, Software, Investigation, Writing – Original Draft, Visualization.
Vladimir Fonov: Software, Data Curation.
Dorothee Schoemaker: Resources, Writing – Review & Editing.
Jens Pruessner: Resources, Writing – Review & Editing.
Olivier Potvin: Resources, Writing – Review & Editing.
Simon Duchesne: Resources, Writing – Review & Editing.
D. Louis Collins: Conceptualization, Writing – Review & Editing, Supervision.