DNA-protein interactions are among the most crucial interactions in biological systems and decide the fate of many processes, such as transcription, regulation of gene expression, and splicing. Although many computational approaches can predict DNA-interacting residues from protein sequences, there is still significant room for improvement in performance and accessibility. In this study, we obtained benchmark datasets from the hybridNAP method and the recently published ProNA2020 method, comprising 864 and 308 proteins, for training and validation, respectively. We used CD-HIT to remove redundancy at 30% sequence identity, which left 646 proteins for training and 46 proteins for validation; no validation protein shares more than 30% sequence identity with the training dataset. We generated amino acid binary profiles, physicochemical-property-based binary profiles, position-specific scoring matrix (PSSM) profiles, and a hybrid feature that combines all three. For each feature set, a one-dimensional convolutional neural network (1D-CNN) model performed best compared with the other models. The model developed using the amino acid binary profile achieved an area under the receiver operating characteristic curve (AUROC) of 0.83 and 0.74 on the training and validation datasets, respectively. The model using the physicochemical-property-based binary profile attained AUROC of 0.86 and 0.73, and the model using the PSSM profile performed better, with AUROC of 0.91 and 0.74. The model developed using the hybrid of all features performed best, with AUROC of 0.91 and 0.79 on the training and validation datasets, respectively. We compared the performance of our method with current approaches and showed improvements. The best-performing models are available in a standalone package and a web server accessible at https://webs.iiitd.edu.in/raghava/dbpred.
DBPred is an effective approach for predicting DNA-interacting residues in a protein from its primary structure.
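To make the amino acid binary profile mentioned above concrete, the following is a minimal sketch of one common way to build such a feature: each residue is represented by the one-hot vectors of the residues in a window centered on it. The window size, the padding symbol `X`, and the function names `one_hot` and `binary_profile` are assumptions made for illustration; they are not taken from the DBPred implementation.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # hypothetical padding symbol for positions beyond the sequence ends


def one_hot(residue):
    """Return a 20-dimensional binary vector; all zeros for PAD or unknown residues."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def binary_profile(sequence, window=17):
    """Encode every residue as the concatenated one-hot vectors of its window.

    The window size is an assumed value, not one stated in the abstract.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    half = window // 2
    padded = PAD * half + sequence.upper() + PAD * half
    profiles = []
    for i in range(len(sequence)):
        segment = padded[i:i + window]  # window centered on residue i
        vector = []
        for residue in segment:
            vector.extend(one_hot(residue))
        profiles.append(vector)
    return profiles


# Toy example: a 5-residue sequence with a window of 5 residues.
features = binary_profile("MKRLA", window=5)
print(len(features), len(features[0]))
```

Each residue yields a vector of length `window * 20`, which could then serve as the per-residue input to a classifier such as a 1D-CNN.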