ABSTRACT So-called “brain age” has emerged as a biomarker in aging research. Data suggest that discrepancies between chronological age and the predicted age of the brain may be predictive of mortality and morbidity (for review, see Cole, Marioni, Harris, & Deary, 2019). However, these promising results come with technical complexities in how brain age is calculated. Various groups have deployed methods leveraging different statistical approaches, often crafting novel algorithms for assessing this biomarker. Many open questions remain about the reliability, collinearity, and predictive power of different algorithms. Here, we conduct a rigorous, systematic comparison of three commonly used, previously published brain age algorithms (XGBoost, brainageR, and DeepBrainNet) to serve as a foundation for future applied research. First, using multiple datasets with repeated MRI scans, we calculated two metrics of reliability (intraclass correlations and Bland–Altman bias). We then considered correlations between brain age variables, chronological age, biological sex, and image quality. We also calculated the magnitude of collinearity between approaches. Finally, we used canonical regression and machine learning approaches to identify significant predictors across brain age algorithms related to clinical diagnoses of mild cognitive impairment or Alzheimer’s Disease. Using a large sample (N = 2557), we find that all three commonly used brain age algorithms demonstrate excellent reliability (r > .9). We also note that brainageR and DeepBrainNet are reasonably correlated with one another, and that the XGBoost brain age is strongly related to image quality. Finally, and notably, we find that XGBoost brain age calculations were more sensitive in detecting clinical diagnoses of mild cognitive impairment or Alzheimer’s Disease. We close with recommendations for future research studies focused on brain age.
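As a rough illustration of the two reliability metrics named above, the following sketch computes an intraclass correlation and the Bland–Altman bias for repeated brain age estimates. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is assumed here. The function names and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1) for a list of rows: one row per participant,
    one column per repeated scan (assumed ICC form, see lead-in)."""
    n = len(scores)          # participants
    k = len(scores[0])       # repeated scans per participant
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares (no replication within cells)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def bland_altman(first, second):
    """Return the bias (mean difference, second minus first) and the
    95% limits of agreement (bias +/- 1.96 * SD of the differences)."""
    diffs = [b - a for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical predicted brain ages (years) from two scans of the
# same five participants; these numbers are made up for illustration.
scan1 = [60.2, 65.1, 70.4, 74.8, 80.3]
scan2 = [60.9, 64.7, 71.0, 75.5, 79.8]

print("ICC(2,1):", icc_2_1(list(zip(scan1, scan2))))
print("Bland-Altman bias and limits:", bland_altman(scan1, scan2))
```

An ICC near 1 indicates that differences between participants dominate scan-to-scan noise, while a Bland–Altman bias near 0 indicates no systematic shift between the first and second scans.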