Task episodes consist of sequences of steps that are performed to achieve a goal. We used fMRI to examine neural representation of task identity, component items, and sequential position, focusing on two major cortical systems – the multiple-demand (MD) and default mode networks (DMN). Human participants (20 male, 22 female) learned six tasks each consisting of four steps. Inside the scanner, participants were cued which task to perform and then sequentially identified the target item of each step in the correct order. Univariate time-course analyses indicated that intra-episode progress was tracked by a tonically increasing global response, plus an increasing phasic step response specific to MD regions. Inter-episode boundaries evoked a widespread response at episode onset, plus a marked offset response specific to DMN regions. Representational similarity analysis was used to examine encoding of task identity and component steps. Both networks represented the content and position of individual steps, but the DMN preferentially represented task identity while the MD network preferentially represented step-level information. Thus, although both DMN and MD networks are sensitive to step-level and episode-level information in the context of hierarchical task performance, they exhibit dissociable profiles in terms of both temporal dynamics and representational content. The results suggest collaboration of multiple brain regions in control of multi-step behavior, with MD regions particularly involved in processing the detail of individual steps, and DMN adding representation of broad task context.

Significance Statement

Achieving one’s goals requires knowing what to do and when. Tasks are typically hierarchical, with smaller steps nested within overarching goals. For effective, flexible behavior, the brain must represent both levels.
We contrast response time-courses and information content of two major cortical systems – the multiple-demand (MD) and default mode networks (DMN) – during multi-step task episodes. Both networks are sensitive to step-level and episode-level information, but with dissociable profiles. Intra-episode progress is tracked by tonically increasing global responses, plus MD-specific increasing phasic step responses. Inter-episode boundaries evoke widespread responses at episode onset, plus DMN-specific offset responses. Both networks encode content and position of individual steps, but the DMN and MD networks favor task identity and step-level information respectively.