It is widely acknowledged that, in Germanic languages, segments are the primary planning units at the phonological encoding stage of spoken word production. Findings in Chinese, however, have been mixed, and it remains unclear what roles syllables and segments play in planning Chinese spoken word production. In the current study, participants first prepared and later produced disyllabic Mandarin words in response to picture prompts and a response cue while electroencephalogram (EEG) signals were recorded. Every two consecutive pictures implicitly formed a prime-target pair whose names shared the same word-initial atonal syllable, shared the same word-initial segments, or were unrelated (control conditions). Only syllable repetition induced significant effects on event-related brain potentials (ERPs) after target onset: a widely distributed positivity in the 200- to 400-ms interval and an anterior positivity in the 400- to 600-ms interval. We interpret these effects as reflecting syllable-size representations at the phonological encoding and phonetic encoding stages. Our results provide the first electrophysiological evidence for the distinct role of syllables in producing Mandarin spoken words, supporting a language specificity hypothesis about the primary phonological units in spoken word production. Copyright © 2017 The Author(s).