Wind turbines and high-voltage transmission towers are critical energy facilities that are difficult to detect reliably under extreme weather conditions. To improve target identification accuracy for these two facility types, this study designed an experiment based on the YOLOv5-CBAM model. A joint model for combined recognition was trained on a mixed dataset of wind turbines and high-voltage transmission towers, and two single models were trained on the individual wind turbine and transmission tower datasets to perform independent identification. The experimental results indicate a clear accuracy advantage for the independent identification method. Specifically, the joint model achieved accuracies of 92.1% for wind turbines and 74.1% for high-voltage transmission towers, whereas the single models achieved 99% and 92.3%, respectively. This improvement is significant for enhancing automatic monitoring of energy facilities in complex environments.
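To make the size of the reported improvement explicit, the accuracy figures above can be compared as percentage-point gains. The short sketch below is illustrative arithmetic only and is not part of the study's pipeline; the dictionary keys and variable names are hypothetical labels introduced here, and only the four accuracy values come from the text.

```python
# Accuracies (in percent) as reported in the text; the key names are
# hypothetical labels chosen for this illustration.
joint_model = {"wind_turbine": 92.1, "transmission_tower": 74.1}
single_models = {"wind_turbine": 99.0, "transmission_tower": 92.3}

# Percentage-point gain of each single model over the joint model,
# rounded to one decimal place to match the reported precision.
gains_pp = {
    target: round(single_models[target] - joint_model[target], 1)
    for target in joint_model
}

for target, gain in gains_pp.items():
    print(f"{target}: +{gain} percentage points")
```

Under these figures, the gain is 6.9 percentage points for wind turbines and 18.2 percentage points for high-voltage transmission towers, so the transmission tower class benefits most from independent identification.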