To operationalize frailty using eight scales and to compare their content validity, feasibility, prevalence estimates of frailty, and ability to predict all-cause mortality. Secondary analysis of the Survey of Health, Ageing and Retirement in Europe (SHARE). Eleven European countries. Individuals aged 50 to 104 (mean age 65.3 ± 10.5, 54.8% female, N = 27,527). Frailty was operationalized using SHARE data based on the Groningen Frailty Indicator, the Tilburg Frailty Indicator, a 70-item Frailty Index (FI), a 44-item FI based on a Comprehensive Geriatric Assessment (FI-CGA), the Clinical Frailty Scale, frailty phenotype (weighted and unweighted versions), the Edmonton Frail Scale, and the FRAIL scale. All scales had fewer than 6% of cases with at least one missing item, except the SHARE-frailty phenotype (11.1%) and the SHARE-Tilburg (12.2%). In the SHARE-Groningen, SHARE-Tilburg, SHARE-frailty phenotype, and SHARE-FRAIL scales, death rates were 3 to 5 times as high in excluded cases as in included ones. Frailty prevalence estimates ranged from 6% (SHARE-FRAIL) to 44% (SHARE-Groningen). Only 2.4% of participants were categorized as frail by all scales. Of unweighted scales, the SHARE-FI and SHARE-Edmonton scales most accurately predicted mortality at 2 years (SHARE-FI area under the receiver operating characteristic curve (AUC) = 0.77, 95% confidence interval (CI) = 0.75-0.79; SHARE-Edmonton AUC = 0.76, 95% CI = 0.74-0.79) and 5 years (both AUC = 0.75, 95% CI = 0.74-0.77). The continuous score of the weighted SHARE-frailty phenotype (AUC = 0.77, 95% CI = 0.75-0.78) predicted 5-year mortality better than the unweighted SHARE-frailty phenotype (AUC = 0.70, 95% CI = 0.68-0.71), but the categorical score of the weighted SHARE-frailty phenotype did not (AUC = 0.70, 95% CI = 0.68-0.72). Substantive differences exist between scales in their content validity, feasibility, and ability to predict all-cause mortality. These frailty scales capture related but distinct groups.
Weighting items in frailty scales can improve their predictive ability, but the trade-off between specificity, predictive power, and generalizability requires additional evaluation.
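
The comparison of weighted and unweighted scores by AUC can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the six-person cohort, the five binary items, and the item weights are invented and are not SHARE data or the weights used in the study. The AUC is computed here as the probability that a randomly chosen decedent has a higher score than a randomly chosen survivor, which is equivalent to the area under the receiver operating characteristic curve.

```python
def frailty_score(items, weights=None):
    """Sum of item values, optionally weighted (unweighted = all weights 1)."""
    if weights is None:
        weights = [1.0] * len(items)
    return sum(w * x for w, x in zip(weights, items))


def auc(scores, died):
    """AUC as the share of (decedent, survivor) pairs in which the decedent
    has the higher score; tied pairs count one half."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy cohort: five binary frailty items per person, plus
# whether the person died during follow-up. Not SHARE data.
cohort = [
    ([1, 0, 0, 0, 0], 1),
    ([0, 1, 1, 0, 0], 0),
    ([1, 1, 0, 0, 0], 1),
    ([0, 0, 0, 0, 0], 0),
    ([0, 1, 0, 0, 0], 0),
    ([0, 0, 1, 1, 0], 0),
]
died = [d for _, d in cohort]

# Hypothetical weights that emphasize the first item; in practice weights
# would be estimated from data, ideally in a separate sample.
toy_weights = [3.0, 1.0, 1.0, 1.0, 1.0]

unweighted = [frailty_score(items) for items, _ in cohort]
weighted = [frailty_score(items, toy_weights) for items, _ in cohort]

auc_unweighted = auc(unweighted, died)
auc_weighted = auc(weighted, died)
print(f"unweighted AUC = {auc_unweighted:.4f}")
print(f"weighted AUC   = {auc_weighted:.4f}")
```

In this toy cohort the weighted score separates decedents from survivors better only because the weights were chosen with the outcomes in view. That is the generalizability concern raised above: weights tuned to one sample may not transfer to another.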