A Comparative Study of PCA and KPCA for Groundwater Quality Index Estimation
Groundwater quality assessment is crucial for ensuring human welfare and promoting sustainable economic development. This study evaluates the effectiveness of linear Principal Component Analysis (PCA) and nonlinear Kernel PCA (KPCA) in developing a reliable Groundwater Quality Index (GWQI) for Qena, Egypt. Using ten hydrochemical parameters from seventy-three groundwater samples, we compare the performance of four kernel functions within the KPCA framework. The PCA-based GWQI classified 71.0% of samples as suitable for irrigation, closely aligning with the Wilcox diagram classification (76.7%). In contrast, KPCA with linear, polynomial, sigmoid, and radial basis function kernels yielded suitability rates of 58.9%, 52.1%, 63.0%, and 58.9%, respectively. These rates are closer to the USSL (53.4%) and Na% (53.4%) classifications. Notably, the sigmoid kernel in KPCA showed stronger correlations with key hydrochemical parameters, effectively capturing nonlinear data structures. These findings underscore the importance of accounting for nonlinearity in groundwater quality assessment and demonstrate the potential of KPCA to improve GWQI accuracy. This comparative analysis highlights KPCA’s superiority over PCA for nonlinear datasets, providing enhanced tools for groundwater management and more reliable quality evaluations.
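The comparison described in the abstract can be illustrated with a minimal sketch of PCA and kernel PCA using the four kernel families named there. Everything specific in it is an assumption made for illustration: the data are synthetic (73 rows by 10 columns, matching only the sample and parameter counts), the kernel parameters `gamma`, `degree`, and `coef0` are arbitrary defaults rather than the study's settings, and the helper names `kernel_matrix`, `kpca_scores`, and `pca_scores` are hypothetical. It does not reproduce the GWQI weighting or the irrigation suitability classes.

```python
import numpy as np

def kernel_matrix(X, kind, gamma=None, degree=3, coef0=1.0):
    """Gram matrix for one of four common kernels (defaults are assumptions)."""
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    G = X @ X.T  # pairwise inner products
    if kind == "linear":
        return G
    if kind == "poly":
        return (gamma * G + coef0) ** degree
    if kind == "sigmoid":
        return np.tanh(gamma * G + coef0)
    if kind == "rbf":
        d = np.diag(G)
        sq_dist = d[:, None] + d[None, :] - 2.0 * G
        return np.exp(-gamma * np.maximum(sq_dist, 0.0))
    raise ValueError(f"unknown kernel: {kind}")

def kpca_scores(X, kind, n_components=2, **params):
    """Scores of the samples on the leading kernel principal components."""
    K = kernel_matrix(X, kind, **params)
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = J @ K @ J                           # center the kernel in feature space
    vals, vecs = np.linalg.eigh(Kc)          # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # Clip at zero: the sigmoid kernel is not guaranteed positive semidefinite.
    return vecs * np.sqrt(np.maximum(vals, 0.0))

def pca_scores(X, n_components=2):
    """Ordinary PCA scores via SVD of the mean-centered data."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

# Synthetic stand-in for 73 samples x 10 hydrochemical parameters (not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(73, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)     # z-score each parameter

scores = {k: kpca_scores(X, k) for k in ("linear", "poly", "sigmoid", "rbf")}
linear_matches_pca = np.allclose(
    np.abs(scores["linear"]), np.abs(pca_scores(X)), atol=1e-6
)
```

With a linear kernel, the kernel PCA scores coincide with the ordinary PCA scores up to the sign of each component, which is what `linear_matches_pca` checks. Differences between the two approaches therefore come only from the nonlinear kernels and their parameters.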
- Authors retain all copyrights. Authors are not required to sign any copyright transfer agreement.
- This work (including HTML and PDF files) is licensed under a Creative Commons Attribution 4.0 International License.