Thyroid nodules are very common. Their overall prevalence in the general population is estimated at around 65–68%, and a small fraction of them (less than 5%) are malignant. It is therefore important to distinguish benign from malignant thyroid nodules. In this study, equal numbers of participants with benign and malignant thyroid nodules (N = 10/group) were recruited. Saliva samples were collected from each participant and surface-enhanced Raman spectroscopy (SERS) spectra were acquired, followed by validation using a metabolomics approach. A further equal number of patients (N = 40/group) were recruited to construct diagnostic models. The performance of various machine learning (ML) algorithms was assessed using multiple evaluation metrics. Finally, the reliability of the optimal model was tested on blind test data (N = 10/group for benign and malignant thyroid nodules). The SERS metabolic profile showed a trend consistent with the metabolites identified through mass spectrometry (MS) analysis. The Multi-ResNet algorithm performed best, achieving 95% accuracy in sample discrimination, and the blind test data sets yielded an overall accuracy of 83%. In summary, the deep-learning-guided SERS technique holds great potential for accurately discriminating benign from malignant thyroid nodules using human saliva samples, which could facilitate the noninvasive diagnosis of malignant thyroid nodules in clinical settings.
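To make "multiple evaluation metrics" concrete, the following minimal sketch computes common binary-classification metrics from true and predicted labels. It is an illustration only: the abstract does not state which metrics were used, so the choice here (accuracy, sensitivity, specificity, precision, F1) is an assumption, the function name `binary_metrics` is hypothetical, and the labels are made-up placeholders rather than study data.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Compute basic metrics for a two-class problem.

    Assumption: `positive` marks the malignant class; any other
    label is treated as benign.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)

    def ratio(num, den):
        # Return NaN instead of raising when a metric is undefined.
        return num / den if den else math.nan

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # recall on the malignant class
        "specificity": ratio(tn, tn + fp),   # recall on the benign class
        "precision": ratio(tp, tp + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical blind-test-style labels: 10 malignant (1) and 10 benign (0).
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 9 + [0] * 1 + [1] * 2 + [0] * 8

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters in a setting like this one, because a missed malignant nodule and a false alarm on a benign nodule carry different clinical costs that a single accuracy figure does not separate.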