Abstract
Type 2 diabetes mellitus (T2DM) is a metabolic disorder primarily driven by insulin resistance, involving complex genetic regulation. Understanding the molecular mechanisms underlying insulin resistance is crucial for identifying therapeutic targets. This study compared the performance of two biclustering algorithms, factor analysis for bicluster acquisition (FABIA) and the Cheng and Church algorithm (CCA), in analyzing gene expression data associated with insulin resistance. Simulated missing values were introduced into the GSE19420 dataset to evaluate the robustness of both methods. CCA consistently achieved a lower mean squared error (MSE) than FABIA in reconstructing gene expression patterns, suggesting higher accuracy in capturing co-expression structures. FABIA, in turn, effectively detected sparse, biologically relevant clusters. Notably, key genes such as MYO5B, DLG2, AXIN2, and PTK7 were identified within the biclusters, supporting their involvement in insulin signaling and metabolic regulation. These findings underscore the need to select biclustering methods that align with specific analytical goals, and they offer insights into gene networks involved in insulin resistance.
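The evaluation idea described in the abstract (hide some expression values, reconstruct them, and score the reconstruction by MSE) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' pipeline: the matrix is synthetic rather than GSE19420, the 10% missing rate is arbitrary, and the row-mean fill is a stand-in for whatever reconstruction FABIA or CCA would produce. The function names `mask_entries`, `row_mean_impute`, and `masked_mse` are hypothetical.

```python
import numpy as np

def mask_entries(X, frac, rng):
    """Return a copy of X with roughly `frac` of its entries set to NaN, plus the boolean mask."""
    mask = rng.random(X.shape) < frac
    X_missing = X.copy()
    X_missing[mask] = np.nan
    return X_missing, mask

def row_mean_impute(X_missing):
    """Placeholder reconstruction (assumption): fill each missing entry with its row (gene) mean."""
    row_means = np.nanmean(X_missing, axis=1, keepdims=True)
    return np.where(np.isnan(X_missing), row_means, X_missing)

def masked_mse(X_true, X_hat, mask):
    """Mean squared error computed only over the entries that were hidden."""
    return float(np.mean((X_true[mask] - X_hat[mask]) ** 2))

# Synthetic stand-in for a genes x samples expression matrix (not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))

X_missing, mask = mask_entries(X, 0.10, rng)   # simulate missing values
X_hat = row_mean_impute(X_missing)             # swap in a bicluster-based reconstruction here
mse = masked_mse(X, X_hat, mask)
print(f"MSE on hidden entries: {mse:.4f}")
```

Scoring only the hidden entries keeps the comparison focused on how well a method recovers values it never saw; observed entries would otherwise dilute the error toward zero.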
