Abstract

The forms of many species’ vocal signals are shaped by their functions 1–15. In humans, a salient context of vocal signaling is infant care, as human infants are altricial 16, 17. Humans often alter their vocalizations to produce “parentese”, speech and song produced for infants that differ acoustically from ordinary speech and song 18–35 in fashions that have been proposed to support parent-infant communication and infant language learning 36–39; modulate infant affect 33, 40–45; and/or coordinate communicative interactions with infants 46–48. These theories predict a form-function link in infant-directed vocalizations, with consistent acoustic differences between infant-directed and adult-directed vocalizations across cultures. Some evidence supports this prediction 23, 27, 28, 32, 49–52, but the limited generalizability of individual ethnographic reports and laboratory experiments 53 and small stimulus sets 54, along with intriguing reports of counterexamples 55–62, leave the question open. Here, we show that people alter the acoustic forms of their vocalizations in a consistent fashion across cultures when speaking or singing to infants. We collected 1,615 recordings of infant- and adult-directed singing and speech produced by 410 people living in 21 urban, rural, and small-scale societies, and analyzed their acoustic forms. We found cross-culturally robust regularities in the acoustics of infant-directed vocalizations, such that infant-directed speech and song were reliably classified from acoustic features found across the 21 societies studied. The acoustic profiles of infant-directedness differed across language and music, but in a consistent fashion worldwide. In a secondary analysis, we studied whether listeners are sensitive to these acoustic features, playing the recordings to 51,065 people recruited online, from many countries, who guessed whether each vocalization was infant-directed. Their intuitions were largely accurate, predictable in part by acoustic features of the recordings, and robust to the effects of linguistic relatedness between vocalizer and listener. By uniting rich cross-cultural data with computational methods, we show links between the production of vocalizations and cross-species principles of bioacoustics, informing hypotheses of the psychological functions and evolution of human communication.