Abstract

Multidrug-resistant (MDR) and extensively drug-resistant (XDR) Mycobacterium tuberculosis complex (MTBC) strains are a major challenge for tuberculosis (TB) control in India. However, the factors driving the MDR/XDR epidemic in India are not well defined. To address this, whole genome sequencing (WGS) data from 1 852 MTBC strains obtained from patients at a tertiary care hospital laboratory in Mumbai were used for phylogenetic strain classification, resistance prediction, and cluster analysis (12-allele distance threshold). Factors associated with pre-XDR/XDR-TB were identified using odds ratios and a multivariate logistic regression model. Overall, 1 017 MTBC strains were MDR, of which 57.8 % (n=591) were pre-XDR and 17.9 % (n=183) were XDR. Lineage 2 (L2) strains represented 41.7 % of the MDR, 77.2 % of the pre-XDR, and 86.3 % of the XDR strains, and were significantly associated with pre-XDR/XDR-TB (P < 0.001). Cluster rates were high among MDR (78 %) and pre-XDR/XDR (85 %) strains, with three dominant L2 strain clusters (Cl 1-3) representing half of the pre-XDR and two thirds of the XDR-TB cases. Cl 1 strains accounted for 52.5 % of the XDR MTBC strains. Transmission could be confirmed by identical mutation patterns among particular pre-XDR/XDR strains. In conclusion, the high rates of pre-XDR/XDR strains among MDR-TB patients require rapid changes in treatment and control strategies. Transmission of particular pre-XDR/XDR L2 strains is the main driver of the pre-XDR/XDR-TB epidemic. Accordingly, control of the epidemic in the region requires measures aimed at stopping transmission, especially of pre-XDR/XDR L2 strains.
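
To illustrate how a cluster rate can be derived from a pairwise allele distance threshold, the following is a minimal sketch, not the pipeline used in this study. It assumes single-linkage grouping (two strains fall into the same cluster if they are connected by a chain of pairs each differing by at most 12 alleles) and defines the cluster rate as the share of strains in clusters of two or more. The linkage rule, the function names `cluster_strains` and `cluster_rate`, and the toy distance matrix are all assumptions made for illustration; only the 12-allele threshold comes from the abstract.

```python
from collections import Counter


def cluster_strains(dist, threshold=12):
    """Group strains by single linkage (an assumed rule).

    dist is a symmetric matrix of pairwise allele distances.
    Returns one cluster label per strain.
    """
    n = len(dist)
    parent = list(range(n))  # union-find forest, one node per strain

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)  # merge the two groups

    return [find(i) for i in range(n)]


def cluster_rate(labels):
    """Fraction of strains belonging to a cluster of size >= 2."""
    sizes = Counter(labels)
    clustered = sum(1 for label in labels if sizes[label] >= 2)
    return clustered / len(labels)


# Hypothetical toy distances for five strains (not study data).
toy_dist = [
    [0, 5, 20, 40, 60],
    [5, 0, 10, 45, 61],
    [20, 10, 0, 50, 62],
    [40, 45, 50, 0, 63],
    [60, 61, 62, 63, 0],
]

labels = cluster_strains(toy_dist)
rate = cluster_rate(labels)
print(f"cluster rate: {rate:.2f}")
```

In the toy matrix, strains 0 and 2 differ by 20 alleles but are still grouped together because both lie within 12 alleles of strain 1. This chaining behaviour is a property of single linkage, so a different linkage rule would produce different clusters from the same distances.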