The Four Ps of Quality Audio

Friday, May 2, 2014

Good audio does not happen by mistake. It is the result of 4 Good Ps:

  1. production
  2. performance
  3. programming
  4. platform choice.

A lot of audio content is good in one of these areas, but rarely in all of them. Quality learning content is good in each of them.

So, how do we make audio sound crisp and clear, as well as easy to understand, in our podcasts, videos, screencasts, and e-learning? Let’s consider each of the four Ps of quality audio.


Production

Audio, whether it stands alone as a podcast or is part of a video, screen capture, or e-learning courseware, needs to be well-produced. This means we need to record audio that’s crisp, clear, and free of distractions, and then edit it cleanly so it flows naturally.

How do we do this? How much time do you have? There’s a lot you can do, but for the sake of brevity, here are a few factors to consider.

  • Make sure you have a decent microphone. A good microphone ensures your voice sounds real. It creates the impression for your listener that you are right next to them. A lot of trainers record screen capture videos and podcasts with headset microphones. These are horrible and distracting. Yes, they’re great for Skype calls and playing games online, but they don’t make you sound as if you’re in the same room as your listener.

  • Make sure you record in an acoustically acceptable environment. Take care of the obvious issues like background noise and music, which are a distraction and make editing impossible. If you’re in an office, find a room where you can’t hear colleagues on the phone or talking about last night’s episode of CSI. Also take care of the less obvious acoustic problems, such as the noise of air conditioners. Good microphones pick up everything from the noise of a space heater to the HVAC unit. Find a room where you can turn the AC off. Don’t rely on the noise removal functions in editing programs or built into headset microphones, because they can never make a recording sound as good as getting it right the first time.

  • Think carefully about where you position your microphone. The closer your microphone is to the person speaking, the more personal it will sound. Conversely, the further away it is, the more impersonal it will be. You don’t want to be so close that you sound suffocating. Nor do you want to sound as if you’re shouting from the other side of the room.

  • Edit the spoken word carefully. Don’t mistakenly chop syllables off words or destroy the natural flow of the person speaking. Learn basic audio production techniques that will make voices sound better. These could include graphic equalization (known in the industry as EQ) and compression. Learning how to multi-track and get the right balance in volume between tracks is essential too.

    My preference when it comes to microphone choice for recording podcasts, screen video, and e-learning content is a large diaphragm studio condenser microphone. This sort of microphone will make your voice sound terrific, but it won’t break the bank. Audio manufacturers such as Behringer and Samson offer USB models that plug straight into your computer for around $60.

    If you’re capturing audio for a video, look into both lavaliere mics and shotgun microphones. Lavaliere mics clip on to your shirt and are designed to focus on sound coming out of your mouth. Shotgun microphones are highly directional and give you flexibility for capturing sound in diverse situations. You can buy lav mics for consumer cameras for as little as $30, although good quality lavs will range from $150 to $1000. Good shotgun mics start around $150. Cheaper mics tend to be less directional.


Performance

    If you’re recording voice-over commentary for a screen capture video and get all of the production right, but the artist’s voice is neither clear nor interesting, you will undermine the piece’s quality.

    In my multimedia workshops, I suggest people focus on four things to improve their vocal quality:

  • breathing
  • diction
  • expression
  • microphone technique.

Breathing relaxes your vocal muscles and allows the natural flow of air over the vocal cords. It is the key to developing both warmth and authority. You can learn exercises to help with this. Posture is also a key factor in allowing natural breathing. Before you record commentary, take some deep breaths, make sure you are not slouching, and be sure to relax your shoulders and neck muscles.

Good diction helps you sound more credible. By diction, I mean clear enunciation and speaking words at a natural pace so they are easy to understand. Poor diction is slurring of words and dropping off consonants. Practice tricky words and if you find yourself stumbling over some, speak them slowly. Practice words with multiple syllables so they are not compressed.

Expression keeps people engaged. Monotonic delivery puts people to sleep even if you have a lovely rich and resonant voice. Think about varying your pitch, pace, and power (volume), as well as incorporating pauses to emphasize key points. I call these the “4 Ps of Expression.” (Yes, it seems this article is in Sesame Street terms, sponsored by the letter “P.”)

Microphone technique is also important and will make you sound natural and real. Learning how a microphone captures your voice is critical, as is knowing how to avoid things like plosives, which are the popping noises we make when we say the letters B and P.


Every microphone is different, as is each voice. You need to learn how best to position the mic for your voice. The closer your mouth is to the microphone, the more intimate you will sound. The further away it is, the less personal you will come across.


Programming

You might have the best production quality and your performance may be superb, but if your program lacks structure or is boring, you’ll lose attention. By program, I am talking about the editorial content of your audio.

It’s important to create a clear and logical structure for your content. Good structure aids comprehension and enables you to plan a method for keeping your listener engaged and wanting more.

If you’re adding audio to a screen capture or video, your structure will be determined by images rather than the audio itself. But while the audio plays a secondary role to the imagery, your audio will be better if you structure your sentences following media writing conventions. If you’re using audio only, such as in a podcast, good structure is critical and must persuasively answer the question, “How am I helping my listener learn X?” (Substitute your learning objective for the letter X.)

The key to any good media content is starting with a clear objective. For learning professionals, this is easy. We can write the learning objective following Mager’s conventions: active verb, condition, and standard. Or put another way, follow the well-known ABCD approach. 

Without a clear learning objective, the content can end up going all over the place. Tangents can be interesting, but generally people will listen to a podcast or consume media content with a purpose. The stated learning objective is usually that purpose.

In addition to having an objective, it is important to have a clear structure. This makes it easier for the media consumer to process your content. And within the structure, you will find content is more effective when it is written well.

Radio broadcasters follow writing conventions that we should also incorporate into learning media. These conventions include rules such as using short phrases and choosing short words with a consideration for how the words sound. As well as planning the structure and script, programming is about making the audio content more engaging. This could mean choosing different voices, accents, music, and sound effects.

Of course, it’s not just about mixing up a bunch of voices or dropping in music randomly. There’s both an art and a science to using every single element to reinforce your message. This may seem complex at first, which is possibly why it is easy for pieces of media to become boring ad-libs of poorly thought-through content that is not sharp or focused.

Platform choice

Audio used to be the sensory method of radio and as such was its own distinct medium. Remember the old days when the three mediums were TV, radio, and print? Today, audio is one of several sensory methods that are used on the new medium of the Internet.

Unlike the old days when audio people only had audio to work with—because you couldn’t use video or print on the radio medium—today we have access to any sensory method: audio, video, text, and animation.

In the days of radio, we discovered very fast that some kinds of content did not work well. Content with lots of facts and figures generally failed to keep people’s attention for very long. Now, I know that failure to keep attention for very long can also be the result of avoidable factors like poorly produced, poorly structured, poorly scripted content.

But as a general rule, broadcasters discovered that audio content does not lend itself to facts, complexity, or long duration content. This is because different sensory methods are good for some types of content and not so good for other types.

This shouldn’t be surprising because it is all too common at the office. We know that jokes in emails don’t work well because they are often misunderstood. Emotion does not convey well in emails, and complexity and detail do not convey well in audio.

Audio podcasts are great if the content is narrative-based. By that, I mean it tells a story and does not rely on lots of facts to get the message across. Audio is not good for facts. This is why financial information such as updates on various currencies, the Dow Jones, and other indexes is hard to follow when you listen to it on the radio.

It’s also why, when we hear the weather on the radio, we so easily forget it. (I know, these days we get the weather from our phones rather than the radio, but once upon a time, it used to be mostly radio.)

So, when I talk about platform for the four Ps of good audio content, I’m essentially saying that audio is not good for every type of learning. Just as practical exercises are better for helping someone learn psychomotor skills, audio is best for conveying narrative learning content.

This concept is nothing new. In fact, it’s what I’ve been teaching in my workshops on transmedia and 360 degree storytelling for the past 12 years. Each and every sensory method has its own set of strengths and weaknesses when it comes to conveying information. Text is strong for detail, and video is strong for showing action. (The same principle applies for platforms too, but that’s a different conversation.)

About the Author
Jonathan Halls is an author, trainer, and coach. He wrote Rapid Video Development for Trainers (ATD Press, 2012) and was a contributing author to Speak More (River Grove Books, 2012) and the ATD Handbook: The Definitive Reference for Training & Development 2nd Edition (ATD Press, 2014). He is author of the ATD Infoline, “Memory & Cognition in Learning” (ATD Press, 2014) and has written numerous articles for T&D magazine. Jonathan is an ATD BEST Awards reviewer and has sat on the advisory committees for the ASTD International Conference & Exposition and TechKnowledge.

The former BBC learning executive now runs workshops in media, communication, leadership, and creativity. He is on faculty at George Washington University and facilitates ATD’s Master Trainer Program™, Training Certificate, and Rapid Video for Learning Professionals Certificate program. Jonathan has been training, speaking, and coaching for 25 years in more than 20 countries. He describes his work as “at the intersection of media, communication, learning, leadership, and innovation.”