STEMM Institute Press
Science, Technology, Engineering, Management and Medicine
A Machine Learning Approach for Fetal Chromosome Abnormality Identification Based on Multi-Feature Fusion
DOI: https://doi.org/10.62517/jbdc.202501412
Author(s)
Yi Wang, Rouyi Fan, Yaning Yin, Tianyuan Liu*
Affiliation(s)
School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, Henan, China
*Corresponding Author
Abstract
Non-invasive prenatal testing (NIPT) is a prenatal screening technique based on high-throughput sequencing of fetal cell-free DNA isolated from maternal peripheral blood. Owing to its high sensitivity, potential for early diagnosis, and non-invasive nature, it has become a crucial method for identifying fetal chromosomal abnormalities. For male fetuses, the Y chromosome concentration serves as a vital reference point for quality assessment and anomaly analysis. Female fetuses, however, lack the Y chromosome as a definitive marker, so identifying their chromosomal abnormalities must rely on the X chromosome and multidimensional feature fusion analysis. This leaves serious gaps and difficulties in feature utilization, model stability, and interpretability for the diagnosis of female fetal anomalies. This study uses regional NIPT data to develop a multi-feature fusion and machine learning-based approach for identifying chromosomal abnormalities in female fetuses. SMOTE was employed to address the class imbalance caused by the scarcity of abnormal samples in the training dataset. A feature set comprising Z-scores, GC content, and read duration metrics was systematically constructed, and a LightGBM model was trained on it to identify chromosomal abnormalities in female fetuses. Experimental results show that LightGBM outperformed Random Forest, XGBoost, CatBoost, and logistic regression, achieving 78.99% accuracy, 82.29% precision, 78.99% recall, and an F1 score of 80.52% on the test set. SHAP analysis, used to improve clinical interpretability, indicates that the most important diagnostic features are the chromosome 21 Z-score, the percentage of duplicated reads, and the GC content. This study closes a gap in current NIPT technology for detecting female fetal anomalies and offers an efficient method for accurate, interpretable screening of female fetal chromosomal abnormalities.
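The class-balancing step mentioned in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote_oversample`, the three feature columns, and all sample values are hypothetical and chosen only for illustration. In practice a library implementation, such as `SMOTE` from imbalanced-learn, would normally be used.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class rows.

    Each synthetic row lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base row.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical abnormal-class rows (made-up values, not data from the study):
# [chromosome 21 Z-score, GC content, fraction of duplicated reads]
abnormal = [
    [3.4, 0.41, 0.030],
    [4.1, 0.40, 0.028],
    [3.8, 0.42, 0.035],
    [5.0, 0.39, 0.031],
]

new_rows = smote_oversample(abnormal, n_new=6)
for row in new_rows:
    print([round(value, 4) for value in row])
```

Consistent with the abstract, oversampling of this kind is applied to the training dataset only, so that test-set metrics are computed on real samples.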
Keywords
Non-Invasive Prenatal Testing; Female Foetal Anomaly Detection; Feature Engineering; LightGBM; SHAP Explainability