Abstract

Background: The robustness of conventional amyloid PET harmonization across tracers has been questioned.

Purpose: To evaluate deep learning-based harmonization of amyloid PET in predicting conversion from cognitively unimpaired (CU) to mild cognitive impairment (MCI) and from MCI to Alzheimer’s disease (AD).

Methods: We developed an amyloid PET-based deep-learning model to classify participants with a clinical diagnosis of AD dementia vs. CU across different tracers, using data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), Japanese ADNI, and Australian Imaging, Biomarker, and Lifestyle cohorts (n = 1,050). The model output (DL-ADprob), together with other prognostic factors, was evaluated for predicting cognitive decline in ADNI-MCI (n = 451) and Harvard Aging Brain Study (HABS)-CU (n = 271) participants using Cox regression and the area under the time-dependent receiver operating characteristic curve (tdAUC) at 4-year follow-up. Subgroup analyses were performed in the ADNI-MCI group for conversion from amyloid-positive to AD and from amyloid-negative to positive. The intraclass correlation coefficient (ICC) of DL-ADprob between tracers was calculated in the Global Alzheimer’s Association Interactive Network dataset (n = 155).

Results: DL-ADprob was independently prognostic in both the ADNI-MCI (P < 0.001) and HABS-CU (P = 0.048) sets. Adding DL-ADprob to the other factors increased prognostic performance in both ADNI-MCI (tdAUC 0.758 [0.721–0.792] vs. 0.782 [0.742–0.818], tdAUC difference 0.023 [0.007–0.038]) and HABS-CU (tdAUC 0.846 [0.755–0.925] vs. 0.870 [0.773–0.943], tdAUC difference 0.022 [-0.004–0.053]). DL-ADprob was independently prognostic in the amyloid-positive (P < 0.001) and amyloid-negative (P = 0.007) subgroups. DL-ADprob showed incremental prognostic value in the amyloid-positive subgroup (tdAUC 0.666 [0.623–0.713] vs. 0.706 [0.657–0.755], tdAUC difference 0.039 [0.016–0.064]), but not in the amyloid-negative subgroup (tdAUC 0.818 [0.757–0.882] vs. 0.816 [0.751–0.880], tdAUC difference -0.002 [-0.031–0.029]). Pairwise ICCs of DL-ADprob between Pittsburgh compound B and each of florbetapir, florbetaben, and flutemetamol ranged from 0.913 to 0.935.

Conclusion: Deep learning-based harmonization of amyloid PET improves prediction of cognitive decline in non-demented elderly individuals, suggesting it could complement conventional amyloid PET measures.
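For readers unfamiliar with the between-tracer agreement measure reported above, the following is a minimal sketch of how a pairwise ICC could be computed for two sets of model outputs obtained from the same participants with different tracers. The abstract does not state which ICC form was used; this sketch assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1). The function name `icc_2_1` and the example values are hypothetical and are not study data.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is an (n_subjects, k_measurements) array, e.g. one column per
    tracer. The ICC form is an assumption; the abstract does not specify it.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * np.sum((x.mean(axis=1) - grand_mean) ** 2)  # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand_mean) ** 2)  # between tracers
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical paired outputs for five participants (illustrative only).
pib_scores = np.array([0.05, 0.20, 0.45, 0.70, 0.95])
other_tracer_scores = np.array([0.08, 0.18, 0.50, 0.66, 0.93])

pairwise_icc = icc_2_1(np.column_stack([pib_scores, other_tracer_scores]))
print(f"Pairwise ICC: {pairwise_icc:.3f}")
```

Because this form measures absolute agreement, a constant offset between tracers lowers the ICC even when the two sets of scores are perfectly correlated, which is the property that matters when assessing harmonization.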