This study proposes a multimodal feature alignment network for the early prediction of new-onset conduction disturbance after transcatheter aortic valve replacement (TAVR). Guided by medical prior knowledge, baseline clinical information, calcification in the aortic root complex region, radiomics features, and CT slices are used as multimodal data, and feature selection methods are applied to retain the features most strongly correlated with the disease. Multimodal feature alignment and a cross-attention mechanism between modalities are then used to fuse the data from the different modalities, so that their complementary information contributes to the final prediction. Specifically, cosine similarity is used to measure the similarity between image features and composite features in the feature space and to align them. A cross-attention mechanism then strengthens the interdependence between composite features and image features, which improves the predictive performance and interpretability of the model and integrates the multimodal features effectively. Experimental results show that our method achieved 90.48%, 84.21%, 94.12%, and 88.00% in accuracy, precision, recall, and specificity, respectively. To the best of our knowledge, no similar work on this topic has been reported. The proposed method provides a novel approach for predicting complications after TAVR, which is of great significance for clinical practice in cardiac surgery.
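The two fusion steps named above, cosine-similarity alignment and cross-attention, can be illustrated with a minimal NumPy sketch. It is not the network described in this study: the function names, the feature dimension of 16, the token counts, and the loss form (one minus cosine similarity) are assumptions chosen for illustration, and the learned query, key, and value projections of a real cross-attention layer are omitted.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-8):
    """Row-wise cosine similarity between two (batch, dim) arrays."""
    a_unit = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b_unit = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return np.sum(a_unit * b_unit, axis=-1)


def alignment_loss(image_feat, composite_feat):
    """Assumed loss form: mean of (1 - cosine similarity) over the batch.

    The value approaches 0 when paired features point in the same direction
    and 2 when they point in opposite directions.
    """
    return float(np.mean(1.0 - cosine_similarity(image_feat, composite_feat)))


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def cross_attention(query_tokens, context_tokens):
    """Scaled dot-product attention with queries from one modality and
    keys/values from the other. Learned projections are omitted here."""
    dim = query_tokens.shape[-1]
    scores = query_tokens @ context_tokens.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ context_tokens, weights


# Hypothetical inputs: 4 image tokens and 6 composite-feature tokens,
# all already embedded into a shared 16-dimensional space.
rng = np.random.default_rng(0)
image_tokens = rng.normal(size=(4, 16))
composite_tokens = rng.normal(size=(6, 16))

# Composite features attend to image features.
fused, weights = cross_attention(composite_tokens, image_tokens)
```

In this sketch the attention weights form a 6 x 4 matrix showing how strongly each composite-feature token attends to each image token, which is one way such a mechanism can support interpretability.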