Emerging evidence indicates that infants are sensitive to acoustic cues in music from early on. Do they interpret these cues in emotional terms to represent others’ affective states? The present study examined infants’ development of emotional understanding of music with a violation-of-expectation paradigm. Twelve- and 20-month-olds were presented with emotionally concordant and discordant music-face displays on alternate trials. The 20-month-olds, but not the 12-month-olds, were surprised by emotional incongruence between musical and facial expressions, suggesting their sensitivity to musical emotion. In a separate non-music task, only the 20-month-olds were able to use an actress’s affective facial displays to predict her subsequent action. Interestingly, for the 20-month-olds, such emotion-action understanding correlated with sensitivity to musical expressions measured in the first task. These two abilities, however, did not correlate with family income, parental estimates of language and communicative skills, or quality of parent-child interaction. The findings suggest that sensitivity to musical emotion and emotion-action understanding may be supported by a generalised common capacity to represent emotion from social cues, which lays a foundation for later social-communicative development. © 2017 Siu, Cheung.