Abstract

Motivation
Non-invasive prenatal testing (NIPT) is a powerful screening method for detecting fetal aneuploidy that relies on laboratory and computational analysis of cell-free DNA. Although several computational NIPT analysis tools are available, no comprehensive, direct comparison of their accuracy has been published. Here, we evaluate the precision of five commonly used computational NIPT aneuploidy analysis tools across a range of sequencing depths (coverage) and fetal DNA fractions (FF) on clinically validated NIPT samples.

Methods
We evaluated the computational NIPT aneuploidy analysis tools WisecondorX, NIPTeR, NIPTmer, RAPIDR, and GIPseq on the same set of clinically validated samples, subsampled to sequencing coverages of 1.25–20M reads per sample (RPS). The set comprised 423 samples, including 19 with fetal trisomy 21 (T21, Down syndrome), eight with trisomy 18 (T18, Edwards syndrome), and three with trisomy 13 (T13, Patau syndrome). For each tool and sequencing coverage, we determined the number of false-negative and false-positive trisomy/euploidy calls. To interpret trisomy detection uniformly across tools, we defined a framework based on the percent-point function that sets the cut-off threshold for calling aneuploidy from the sample Z-score and the Z-score distribution of the reference group. We also determined how the uncertainty arising from naturally occurring arbitrary read placement affects T21 detection at very low sequencing coverage, and how the cell-free fetal DNA fraction affects the accuracy of these tools at various sequencing coverages.

Results
This is the first head-to-head comparison of NIPT aneuploidy detection tools for the low-coverage whole-genome sequencing approach. We determined that, with the currently available software tools, the minimum sequencing coverage with no false-negative trisomic cases was 5M RPS.
Secondly, for the compared tools, the number of false-negative trisomic cases could be reduced if the trisomy call cut-off threshold takes into account the Z-score distribution of euploid reference samples. Thirdly, we observed that at low FF, both the aneuploidy Z-score and the FF inference were considerably less accurate, especially in NIPT assays with 5M RPS or lower coverage.

Conclusions
All compared computational NIPT tools were affected by lower sequencing depth, with the proportion of false-negative trisomy results increasing systematically as sequencing depth decreased. Trisomy detection for lower-coverage NIPT samples (e.g. 2.5M RPS) is technically possible but can increase the proportion of false-positive and false-negative trisomic cases, especially at low FF.
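
The abstract does not spell out the percent-point function framework mentioned in the Methods, so the following is only a minimal sketch of one possible reading of it, not the authors' implementation. It assumes that the euploid reference Z-scores are approximately normal, that the cut-off is the upper quantile of a normal distribution fitted to them, and that the tail probability `alpha` is 0.001. The function names, the `alpha` value, and the reference Z-scores are all hypothetical.

```python
from statistics import NormalDist, mean, stdev


def trisomy_cutoff(reference_z, alpha=0.001):
    """Return a Z-score cut-off from the euploid reference distribution.

    Assumption: the reference Z-scores are roughly normal, so the cut-off
    is the percent-point function (inverse CDF) of the fitted normal
    evaluated at 1 - alpha. The default alpha is illustrative only.
    """
    fitted = NormalDist(mu=mean(reference_z), sigma=stdev(reference_z))
    return fitted.inv_cdf(1.0 - alpha)


def call_trisomy(sample_z, reference_z, alpha=0.001):
    """Flag a sample as trisomic if its Z-score exceeds the cut-off."""
    return sample_z > trisomy_cutoff(reference_z, alpha)


# Hypothetical euploid reference Z-scores for one chromosome (not study data).
reference_z = [-1.2, -0.4, 0.1, 0.3, 0.8, -0.6, 1.1, 0.0, -0.9, 0.5]

# Cut-off adapted to the observed reference distribution.
adaptive_cutoff = trisomy_cutoff(reference_z)

# Cut-off under an ideal standard normal reference, for comparison.
standard_cutoff = NormalDist().inv_cdf(1.0 - 0.001)

print(f"adaptive cut-off: {adaptive_cutoff:.2f}")
print(f"standard cut-off: {standard_cutoff:.2f}")
print("sample with Z = 2.8 called trisomic:", call_trisomy(2.8, reference_z))
```

In this toy reference set the Z-scores are narrower than a standard normal, so the adaptive cut-off falls below the one obtained from the standard normal at the same `alpha`. A sample with a moderately elevated Z-score would then be called trisomic rather than missed, which is consistent with the reported reduction in false negatives when the reference distribution is taken into account.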