ATD Blog

Steering Away from Cognitive Overload

Thursday, April 30, 2015

Trying to learn important information through multimedia can feel like driving through a strange city for a big job interview. 

As the learning designer, you’re not the one heading to the interview, but you do select the car and choose the route. You give directions, erect street signs, string up the traffic signals, and so forth. Whatever roads and vehicles the driver uses and sees, you put there. And if you make mistakes, the driver won’t arrive on time or do well in the interview. 

With that happy thought, let’s discuss how to manage cognitive load. In other words, let’s address the question: “How can you reduce unnecessary demands on working memory and maximize a learner’s chances of success?” 

Research suggests—with a very loud “Ahem!”—that multimedia razzle-dazzle can actually work against effective learning. Even background music can interfere with success, the way sound from the car radio makes it harder for you to navigate through a work zone. In “Nine Ways to Reduce Cognitive Load in Multimedia Learning,” Richard E. Mayer and Roxana Moreno explain what they mean by cognitive load and offer a three-part theory for how to make information meaningful. 

Meaningful Learning: How it Happens, How it Doesn’t 

Multimedia learning, according to Mayer and Moreno, involves delivering information through words (printed or spoken) and images (drawings, photos, animations, videos). By “meaningful learning,” they mean you’re able to apply that information to a new situation.

How that happens, they say, is affected by three factors: 

  1. The dual-channel assumption says that we handle incoming information through two channels: one for words and one for pictures.
  2. The active processing assumption says that we need to do significant mental work in order to learn. We decide what to pay more attention to, and for things that make the cut, figure out what they mean and how they interact. This processing is how we create a mental construct for what we’re learning, and then connect it to existing knowledge.
  3. The limited capacity assumption says that we can only work with so much at a time in a cognitive channel. We can only handle so many words, so many images. 

Assuming you’ve done some active processing with those factors, you can already see the implications. Learning is challenging enough; how we present information through words and images can help or hinder more than you might realize. Mayer and Moreno also identify five ways overload can happen, and they present strategies to reduce each kind of overload.

Overload From Too Much in a Single Channel

Imagine that a section of your multimedia lesson has most of its information in a single channel, like a large block of text. And it’s all necessary information. If you add audio to reinforce the print, you may actually overload the verbal channel. For example, when explaining the difference between defined-contribution and defined-benefit pensions in both text and audio, you’re asking the learner to read and listen at the same time. The two streams of words compete with each other for working-memory resources. 

Assuming all the information is relevant, Mayer and Moreno suggest off-loading some content: move some of it from verbal to visual. Use images to anchor key concepts, reduce the printed text, and let the audio channel carry the message. “Students understand a multimedia explanation better when the words are presented as narration rather than as on screen text,” write Mayer and Moreno. 

Remember our interview candidate? She moves more smoothly through traffic with a GPS that combines narrated directions with a graphic map—far more so than if she had highly detailed, text-only directions. 

What happened in the researchers’ experiments? One way to express the strength of an outcome is through “effect size.” Using one common measure, Cohen’s d, an effect size of about 0.2 is conventionally considered small, about 0.5 medium, and 0.8 or more large. In six experiments involving off-loading, Mayer and Moreno report an effect size of 1.17, well into the large range.
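For readers who want to see the arithmetic behind an effect size, here is a minimal sketch of Cohen’s d: the difference between two group means divided by their pooled standard deviation. The scores below are invented for illustration and are not data from Mayer and Moreno’s experiments; the function and variable names are hypothetical as well.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Hypothetical test scores, made up for this example only.
narration_group = [7, 8, 6, 9, 7, 8]  # images plus narration
text_group = [5, 6, 4, 7, 5, 6]       # images plus on-screen text

d = cohens_d(narration_group, text_group)
print(f"Cohen's d = {d:.2f}")
```

The point of dividing by the standard deviation is that the result no longer depends on the test’s scoring scale, which is what lets researchers compare outcomes across different experiments.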

Overload in Both Channels 

What if both the verbal and the image channels have too much essential information? No matter how much you need to cover or how elegant the presentation, too much is too much. When the learner can’t process everything, she can’t organize the input into a useful mental model, let alone integrate it with what she already knows. 


Again, our driver trying to make the interview can’t easily cope simultaneously with a nagging GPS, unfamiliar street signs, shifting traffic, and a message board displaying cryptic data about a detour—even though it’s all important. 

Mayer and Moreno offer two solutions. One is to segment content: break material into smaller pieces, and allow the learner to decide when to move on. One experiment broke a three-minute presentation into 16 segments, linked by CONTINUE buttons. Compared with a control group, students who could choose when to continue, and so take the time they wanted with the current segment, performed substantially better.

When segmenting won’t work, a second solution is to offer pre-training, which means providing some information ahead of time, such as the names or functions of major parts. In order to build a mental model of what you’re learning, you need a component model (how each major part works), and a causal model (how the parts affect each other). Pre-training gets you to the component model faster so it’s easier to construct your causal model. 

Suppose our interview candidate has traveled to Washington, D.C. Before she gets her car, she might learn the different names for the most important freeway (I-495, I-95, the Beltway) and the meaning of “Inner Loop” and “Outer Loop.” That could help her negotiate the trip from Dulles airport to Bethesda. 

Overload from Extraneous Information 

(Spoiler Alert: “Nice to know” doesn’t mean “good to include.”) 

Mayer and Moreno point out that “interesting but extraneous material” takes up cognitive capacity. The learner has to pay some attention—for instance, it’s hard to not listen to background music. Effort goes into deciding whether anything deserves further attention. The more this happens, the less capacity remains for learning what actually does matter. 

You probably can guess what the researchers recommend: weeding. Remove the extraneous. What’s the bare minimum that people need to know in order to accomplish the skill or apply the knowledge? Force everything else to justify its inclusion. 

In an animated sales-call lesson, for example, I don’t need to see the customer driving in. I don’t need an animated phone, virtual pens, and virtual paper clips. I do need a customer statement to respond to. I need time to analyze it. I need clear examples of responses and how effective they are in a situation like the one I’m seeing. 

To me, the weeding of nonessential material is the difference between the rich but irrelevant detail of a war story and the crisp relevance of a pertinent example. Our interview candidate probably doesn’t need to know that there’s a library two blocks before she gets to Midcounty Highway; she does need to know that when she gets there, the two right lanes are right-turn-only.

Granted, sometimes you can’t edit details out. Suppose you’re explaining how to operate packaging machinery in a pharmaceutical plant. Your learner will confront lots of equipment and lots of steps, along with potentially overwhelming detail in the video close-ups.   

When weeding is not an option, Mayer and Moreno recommend signaling, which means providing cues to the learner about how to organize the material. So, the lesson might start by breaking packaging into four stages: product into plastic blisters, blisters into cardboard wallets, wallets into carton packs, cartons into cases. In subsequent lessons, arrows or similar highlighting emphasize key components of each stage.


Overload from Poor Presentation 

Sometimes overload results from the confusing presentation of essential information. Imagine an animation in one part of a screen and related text in another. The learner has to shift focus between the two areas, as well as figure out which parts are related to which.

Mayer and Moreno recommend closer alignment of words and pictures. Placing text inside a graphic, rather than alongside as a caption, puts each explanation right next to the part of the visual it describes.

In a related situation, information arrives as animation, onscreen text, and audio narration. The simultaneous presentation of text and narration, which the researchers call redundant presentation, requires the learner to work at reconciling the two verbal forms while also dealing with the visual form. It’s as if our interview candidate were watching an animation of the route to follow and reading directional text while the person next to her recited those directions. 

Mayer and Moreno cite studies showing a significant improvement when redundancy is reduced, such as by dropping on-screen text and using only narration. An interesting twist they add is that if there’s no animation, students learn better from concurrent narration and on-screen text than from narration alone. The interpretation is that the on-screen text by itself doesn’t overload the visual channel the way it would with the animation there as well.

Overload from “Hold that Thought” 

The final type of cognitive overload involves both essential processing and “representational holding,” which Mayer and Moreno explain as having to retain visual or verbal information in working memory. For example, if you read about the thermoforming process for drug packaging, and then watch a video showing the process, you have to keep elements of that text in memory during the video, which reduces your ability to select, organize, and integrate. 

One way to avoid this overload is to synchronize—interweave text or audio with the video. Words about the sealing step should arrive as the visual does; a description of the check-weigher should come while the learner sees that device in action.

Researchers cite robust evidence that “students understand a multimedia presentation better when animation and narration are presented simultaneously rather than successively.” Meanwhile, Mayer and Moreno point out that if the non-synched elements are brief (a few seconds of narration followed by a few seconds of animation), there’s less overload, most likely because the learner has less representational holding to do: fewer things to keep in mind from the verbal information.

But what if you can’t synchronize? Then, the recommendation is individualization, or ensuring that you have learners skilled at holding things in memory. If for your work you’re able to match “high-quality multimedia design with high-spatial learners,” you’re all set. Personally, I’m rarely able to manage that.

Final Thought 

I started by comparing the multimedia learner to someone who has to drive through a strange city to make an interview. Mayer and Moreno highlight ways that your design decisions can make that trip far more difficult than necessary. Pick up some learning principles and lessons from this research—and take off a little cognitive load. 

About the Author

Dave Ferguson has worked as an instructional designer in both corporate and public-sector positions, as well as through independent consulting. He shares learning-related interests and ideas at Dave’s Whiteboard and the job-aid-focused Dave’s Ensampler.
