Abstract

Results from neuroimaging datasets aggregated from multiple sites may be biased by site-specific profiles in participants' demographic and clinical characteristics, as well as by MRI acquisition protocols and scanning platforms. We compared the impact of four harmonization methods on results obtained from analyses of cortical thickness data: (1) a linear mixed-effects model (LME) with site-specific random intercepts ($\mathrm{LME}_{\mathrm{INT}}$), (2) an LME with both site-specific random intercepts and age-related random slopes ($\mathrm{LME}_{\mathrm{INT+SLP}}$), (3) ComBat, and (4) ComBat with a generalized additive model (ComBat-GAM). Our test case was cortical thickness data aggregated from 29 sites, comprising 1,343 cases with posttraumatic stress disorder (PTSD) (6.2 to 81.8 years old) and 2,067 trauma-exposed controls without PTSD (6.3 to 85.2 years old). Compared to the other harmonization methods, data processed with ComBat-GAM were more sensitive in detecting significant case-control differences in regional cortical thickness ($\chi^2(3) = 34.339$, $p < 0.001$) and case-control differences in age-related cortical thinning ($\chi^2(3) = 15.128$, $p = 0.002$). Specifically, relative to the other harmonization methods, ComBat-GAM led to larger effect size estimates of cortical thickness reductions (corrected p-values < 0.001), smaller age-appropriate declines (corrected p-values < 0.001), and lower female-to-male contrast (corrected p-values < 0.001) in cases compared to controls. Harmonization with ComBat-GAM also led to greater estimates of age-related declines in cortical thickness (corrected p-values < 0.001) in both cases and controls. Our results support the use of ComBat-GAM for harmonizing cortical thickness data aggregated from multiple sites and scanners to minimize confounds and increase statistical power.
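To make the idea of site harmonization concrete, the following is a minimal sketch of a location-scale adjustment in the spirit of ComBat. It is not the procedure used in this study: it omits ComBat's empirical Bayes shrinkage of site parameters and replaces the nonlinear age term of ComBat-GAM with a single linear age slope. The function name `harmonize_location_scale`, the two sites "A" and "B", and all simulated values are hypothetical and serve only as illustration.

```python
import random
from collections import defaultdict
from statistics import fmean, pstdev


def harmonize_location_scale(values, sites, ages):
    """Remove additive and multiplicative site effects, keeping a linear age effect.

    Simplified illustration: no empirical Bayes shrinkage (unlike ComBat)
    and no nonlinear age term (unlike ComBat-GAM). Assumes every site has
    at least two subjects and that ages vary within at least one site.
    """
    idx_by_site = defaultdict(list)
    for i, site in enumerate(sites):
        idx_by_site[site].append(i)

    # Within-site estimate of the age slope, so that site offsets
    # do not contaminate the estimated age effect.
    num = den = 0.0
    for idx in idx_by_site.values():
        age_mean = fmean(ages[i] for i in idx)
        val_mean = fmean(values[i] for i in idx)
        for i in idx:
            num += (ages[i] - age_mean) * (values[i] - val_mean)
            den += (ages[i] - age_mean) ** 2
    slope = num / den
    intercept = fmean(values) - slope * fmean(ages)

    # Residuals after removing the age trend and each site's additive offset.
    resid = [0.0] * len(values)
    for idx in idx_by_site.values():
        offset = fmean(values[i] - intercept - slope * ages[i] for i in idx)
        for i in idx:
            resid[i] = values[i] - intercept - slope * ages[i] - offset

    # Rescale each site's residuals to the pooled residual spread,
    # then add the shared age trend back.
    pooled_sd = pstdev(resid)
    harmonized = [0.0] * len(values)
    for idx in idx_by_site.values():
        site_sd = pstdev([resid[i] for i in idx])
        for i in idx:
            harmonized[i] = intercept + slope * ages[i] + resid[i] * pooled_sd / site_sd
    return harmonized


# Simulated example (hypothetical numbers): two sites with the same age
# distribution, but site "B" has a higher offset and noisier measurements.
rng = random.Random(0)
site_ages = [10 + 70 * k / 199 for k in range(200)]
ages, sites, values = [], [], []
for site, (offset, noise_sd) in {"A": (0.0, 0.05), "B": (0.2, 0.15)}.items():
    for age in site_ages:
        ages.append(age)
        sites.append(site)
        values.append(2.8 - 0.004 * age + offset + rng.gauss(0, noise_sd))

harmonized = harmonize_location_scale(values, sites, ages)
```

Because both simulated sites share the same ages, the mean harmonized thickness is the same at each site, whereas the raw site means differ by roughly the simulated offset. The age slope is preserved, which is the property that distinguishes harmonization from simply standardizing each site.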