Abstract

To study brain function, preclinical research relies heavily on animal monitoring and the subsequent analysis of behavior. Commercial platforms have enabled semi-high-throughput behavioral analyses by providing accurate tracking of animals, yet they often struggle with the analysis of ethologically relevant behaviors and lack the flexibility to adapt to variable testing environments. In recent years, substantial advances in deep learning and machine vision have given researchers the ability to take behavioral analysis entirely into their own hands. Here, we directly compare the performance of commercially available platforms (EthoVision XT14, Noldus; TSE Multi Conditioning System, TSE Systems) against cross-verified human annotation. To this end, we provide a set of videos, carefully annotated by several human raters, of three widely used behavioral tests (open field, elevated plus maze, forced swim test). Using these data, we show that by combining deep learning-based motion tracking (DeepLabCut) with simple post-analysis, we can track animals in a range of classic behavioral tests with accuracy similar to or greater than that of commercial behavioral solutions. In addition, we integrate the tracking data from DeepLabCut with supervised machine learning approaches applied in post-analysis. This combination allows us to score ethologically relevant behaviors with accuracy similar to that of human annotators, the current gold standard, thus outperforming commercial solutions. Moreover, the resulting machine learning approach eliminates variation both within and between human annotators. In summary, our approach helps to improve the quality and accuracy of behavioral data, outperforming commercial systems at a fraction of the cost.