Cognitive Theory of Multimedia Learning

The cognitive theory of multimedia learning (CTML), developed by Richard E. Mayer, is a comprehensive framework for designing effective multimedia learning materials based on how the human mind processes information. The theory rests on the multimedia principle, which claims that words and pictures together, compared to words alone, allow learners to construct deeper knowledge. The CTML framework provides research-based insights into how students learn from a combination of sources and guides educators towards making informed design decisions.

CTML is based on three underlying assumptions from cognitive science: dual channels, limited capacity, and active processing.

The dual-channel assumption presumes that humans have two separate channels for processing visual and auditory information and that learning is enhanced when both channels are used together. The rationale for favouring multimedia instruction is that it takes advantage of the brain’s full capacity for information processing.

The limited-capacity assumption suggests that there is a limited amount of information that humans can process in each channel at a given time. A learner’s working memory can hold a few images in the visual channel along with a few words in the verbal channel. Therefore, learners must make decisions about the importance of each piece of incoming information in order to draw upon prior knowledge and build connections.

The active-processing assumption holds that the learner must actively apply cognitive processes when viewing and hearing information for learning to occur. In multimedia design, the material presented should have a coherent structure and should guide the learner in building that structure in order to understand the material.

As learners take in visual and auditory information, their brains work to create mental models that allow them to understand and remember. Figure 1 illustrates the cognitive theory of multimedia learning and describes how the human mind filters, selects, and integrates information. Auditory information, such as narration and music, enters the auditory channel in sensory memory, while images, text and diagrams enter the visual channel. This information is held for a brief moment before the learner selects the relevant parts to actively process in working memory. The learner then organizes the selected information into separate verbal and visual models that support understanding of what is being presented. Finally, these models are integrated with prior knowledge and experiences to form knowledge that is stored in long-term memory.

Figure 1. The Cognitive Theory of Multimedia Learning. Image by Kati Dreilich. Adapted from Figure 8, "Cognitive theory of multimedia learning" (p. 15), in Mayer, R. E. (2024). The Past, Present, and Future of the Cognitive Theory of Multimedia Learning. Educational Psychology Review, 36(1), 8. https://doi.org/10.1007/s10648-023-09842-1

The instructional techniques and design principles presented in CTML can be summarized by three primary goals: reducing extraneous processing, managing essential processing, and fostering generative processing. Extraneous processing is cognitive processing that is not relevant to the instructional goal and does not produce newly constructed, useful knowledge. It is caused by poor instructional design, such as materials containing extraneous visuals or auditory information. Essential processing, by contrast, is the cognitive processing required to represent the presented material in working memory. The learner must select the most essential visual and verbal information to work with; the amount of essential processing therefore depends on the amount and complexity of the material and the pace at which it is delivered. Generative processing is the act of making sense of the information and forming a mental model, and it is influenced by the learner’s motivation to understand the lesson. Educators need to be aware of these three demands on cognitive capacity so that they can design lessons that eliminate unnecessary processing and preserve the capacity required for essential and generative processing in working memory.

The cognitive theory of multimedia learning offers evidence-based principles that instructors can apply to guide their instructional design. The principles are grouped below by the goal they serve.

| Goal | CTML Principle | Description of Principle |
| --- | --- | --- |
| Minimize extraneous processing | Coherence principle | Remove unnecessary words, pictures and sounds. |
| | Signaling principle | Use cues to draw attention to key points. |
| | Redundancy principle | Narration needs to be in balance with printed text as redundancy can lead to cognitive overload. |
| | Spatial contiguity principle | Printed text needs to be placed near corresponding graphic. |
| | Temporal contiguity principle | Present visual and audio information simultaneously. |
| Manage essential processing | Segmenting principle | Break material into learner-controlled segments. |
| | Pre-training principle | Present names and characteristics of key information beforehand. |
| | Modality principle | Present graphics and narration in favour of printed text. |
| Foster generative processing | Multimedia principle | People learn better from words and pictures than from words alone. |
| | Personalization principle | Use a conversational style of language in narration. |
| | Voice principle | Narration should be spoken in a human voice, rather than a robotic voice. |
| | Embodiment principle | Onscreen characters should demonstrate humanlike behaviours. |

The cognitive theory of multimedia learning offers a comprehensive framework for designing multimedia instructional materials. By taking a learner-centered approach rather than a technology-centered approach, and by recognizing the dual-channel, limited-capacity, and active-processing assumptions, educators can create materials that align with how the human mind processes information. Its principles for reducing extraneous processing, managing essential processing, and fostering generative processing give educators practical guidance for design decisions.


Mayer, R. E., & Fiorella, L. (Eds.). (2022). The Cambridge Handbook of Multimedia Learning (3rd ed.). Cambridge University Press. https://doi.org/10.1017/9781108894333.008