Abstract
The binary choice chamber (T-maze) assay is a standard behavioural screen in Drosophila research, used to test mutants for sensory discrimination and for synaptic plasticity during memory consolidation and decay. Typically, ca. 100 individuals are tested together as a group, and the behavioural readout consists in counting the flies in the testing tube exposed to a stimulus versus the flies in the control tube; the normalized difference in fly count is defined as the batch preference index (PI). Unfortunately, the batch PI has widely been taken as a precise metric of group behaviour, to the point where ANOVA and t-tests have been considered the most powerful statistical tests for analyzing samples of batch PI values. This has led to a hyperinflation of apparently highly significant effects for small differences in PI values (e.g. -0.1 vs 0.1) and, in turn, to problems with replicability. Here it is shown, on the basis of a well-established statistical model for binary decisions, that applying the t-test to PI data implicitly assumes that each fly in a batch is an independent biological replicate, in the extremely strict sense that the decision of each individual was interrogated independently of the other flies. Therefore, t-test analysis of PI data obtained with the intrinsically pseudoreplicative group assay is the cause of extremely optimistic P-values, as suggested recently by Bassetto et al. (2023) on the basis of effect size considerations for proportions. Thus, rather than using inferential statistics, PI data should be assessed on the basis of effect size. A more fundamental problem with the batch PI value is the uncertainty about whether it measures the mean individual preference and, if so, with what precision, given the simple readout.
This question is addressed here by modelling distributions of flies in the T-maze. The modelled distributions suggest that the effective precision of the PI value is clearly worse than the nominal precision of ±0.01 for 100 flies, so that the batch PI can serve only as a rough indicator of group tendencies.
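As a rough illustration of the gap between nominal and effective precision, the following minimal Python sketch simulates batch PI values under the simplest possible assumption: every fly chooses the testing tube independently with the same probability (a plain binomial model). This model, the function names, the number of simulated batches and the choice probability of 0.5 are assumptions made for illustration only; they are not necessarily the model or parameters used in this work.

```python
import math
import random
import statistics


def preference_index(n_test, n_control):
    """Normalized difference in fly count between testing and control tube."""
    return (n_test - n_control) / (n_test + n_control)


def simulate_batch_pi(n_flies, p_test, rng):
    """Simulate one batch in which each fly independently enters the
    testing tube with probability p_test (illustrative assumption)."""
    n_test = sum(rng.random() < p_test for _ in range(n_flies))
    return preference_index(n_test, n_flies - n_test)


def binomial_pi_sd(n_flies, p_test):
    """Standard deviation of the batch PI under the independent binomial
    model: PI = 2k/n - 1, so SD(PI) = 2 * sqrt(p * (1 - p) / n)."""
    return 2.0 * math.sqrt(p_test * (1.0 - p_test) / n_flies)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
batch_pis = [simulate_batch_pi(100, 0.5, rng) for _ in range(5000)]

simulated_sd = statistics.pstdev(batch_pis)
theoretical_sd = binomial_pi_sd(100, 0.5)

print(f"simulated SD of batch PI:   {simulated_sd:.3f}")
print(f"theoretical SD of batch PI: {theoretical_sd:.3f}")
```

Under these assumptions, the batch-to-batch standard deviation of the PI for 100 indifferent flies is 2 * sqrt(0.25 / 100) = 0.1, an order of magnitude larger than the nominal ±0.01. Any dependence between flies within a batch would be expected to widen this spread further.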