Boosting Prediction of Protein-Protein Interactions using Word Embedding Techniques

Tuong Tri Nguyen

doi:10.26459/hueunijtt.v132i2B.7084

Vol. 132 No. 2B (2023), Research Articles

https://doi.org/10.26459/hueunijtt.v132i2B.7084

Published 2023-12-30

Tran Hoai Nhan, Bui Duc Hanh, Nguyen Phuc Xuan Quynh, Truong Khanh Duy, Nguyen Tuong Tri*

https://orcid.org/0000-0002-1379-0131

Abstract

Understanding protein-protein interactions (PPIs) helps to identify protein functions and supports important applications such as drug preparation and the identification of protein-disease relationships. Machine learning methods have been developed for PPI prediction to reduce the cost and time required by experimental methods. In this paper, we study a method for determining PPIs using deep learning and protein sequence representation learning. Our method uses a word embedding technique to learn representations of protein sequences, capturing the semantic relationships between amino acids in those sequences. These relationships serve as the input to a neural network, helping it recognize the interaction signature of the input protein pair. Unlike previous studies, we integrate the protein sequence embedding mechanism into the neural network model itself, so that the embedding is better controlled for PPI prediction. We evaluate our method on benchmark datasets including Yeast, Human, and eight independent sets, and conduct an extensive comparison with existing methods. Our results show that the proposed method outperforms existing methods and achieves high efficiency in predicting cross-species PPIs. The dataset and our source code are available at https://github.com/thnhub/BoostPPIP.git.
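To make the idea of an embedding integrated into the network concrete, the following is a minimal sketch of a forward pass, not the authors' implementation. It assumes each amino acid is treated as a token, that residue embeddings are mean-pooled into one vector per protein, and that a single logistic unit scores the concatenated pair. The names `encode`, `PairClassifier`, `embed`, and `predict`, the embedding size, and the pooling choice are all hypothetical; the abstract does not specify the architecture.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# Token id 0 is reserved for any letter outside the standard alphabet.
TOKEN_ID = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence):
    """Map a protein sequence to integer token ids (unknown letters -> 0)."""
    return [TOKEN_ID.get(aa, 0) for aa in sequence.upper()]


class PairClassifier:
    """Hypothetical pair scorer: embedding lookup -> mean pool -> logistic unit."""

    def __init__(self, embed_dim=8, seed=0):
        rng = random.Random(seed)
        vocab_size = len(AMINO_ACIDS) + 1
        # The embedding table is a parameter of the model itself, so in a
        # trained version it would be updated together with the classifier
        # weights (assumption: this is what "integrated" means here).
        self.embedding = [
            [rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
            for _ in range(vocab_size)
        ]
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(2 * embed_dim)]
        self.bias = 0.0

    def embed(self, sequence):
        """Mean-pool residue embeddings into one fixed-length vector."""
        ids = encode(sequence)
        if not ids:
            raise ValueError("sequence must not be empty")
        dim = len(self.embedding[0])
        return [sum(self.embedding[i][d] for i in ids) / len(ids) for d in range(dim)]

    def predict(self, seq_a, seq_b):
        """Return an interaction score in (0, 1) for a protein pair."""
        features = self.embed(seq_a) + self.embed(seq_b)  # list concatenation
        logit = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        return 1.0 / (1.0 + math.exp(-logit))


model = PairClassifier()
score = model.predict("MKTAYIAKQR", "GSHMLEDPVD")  # made-up example sequences
```

The weights here are random and untrained, so `score` carries no biological meaning. The sketch only shows how a sequence pair flows through an embedding table that lives inside the model.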

This work is licensed under a Creative Commons Attribution 4.0 International License.