Note: I wrote this about a year ago. During my transition to Octopress, I realized I never published it… So here it is.
I really enjoy listening to good podcasts (lately I’ve been listening to Ruby Rogues and JavaScript Jabber a lot), or watching good screencasts. Over the years while teaching at a university I have recorded and re-recorded hours and hours’ worth of videos for students to use as reference material. While these days I mostly hang out with tech geeks, there was a former life in which I lived in L.A. and worked as a recording engineer. In my life I have had access to some of the greatest recording gear the world has thus far produced. I’ve also worked on budget projects at home that turned out sounding great. I thought (with a little nudging from @olivierlacan) that I should write up a bit about getting great audio for your podcast, screencast, video, etc.
There are, in my book, four elements to getting great sounds recorded… Audio Source, Environment, Input and Levels. In this post I’ll describe a little about each and hopefully set you on your way to audio recording bliss.
Audio Source
In the audio engineering world, great emphasis is placed on the actual instrument. Have a $50 beat-up piano, but hoping to make it sound like a Steinway? Good luck. No audio engineering trick is going to fix that. So you have to start with good sound. In the case of a screencast or podcast that means your voice. Not a whole lot you’ll be able to do there other than perhaps some vocal exercises. Try not to record just after eating. As odd as it sounds, I also try to stay away from soda while recording. Ever listen to a podcast where someone obviously needed to clear their throat but hadn’t? Not fun.
Environment
The room in which you record will have almost as much impact on the way you sound as your voice itself. Consider what it would sound like if you were recording in your high school gymnasium. Lots of echoes. In fact, in something akin to a gymnasium it is often difficult to understand each word, as the echo from the room can be almost as loud as your voice. Every room, outside of really expensive acoustically treated rooms for science, has a sound signature. It is the way the room bounces sound around. Even expensive recording studios have a sound. In fact, it is generally on purpose. Rooms that have no sound reverberation make us slightly uncomfortable. This means that your office, bedroom, closet or wherever you are going to record has some sound. Consider its effects on your voice. For vocal recordings, you want as little reverberation as possible. If you are serious about recording audio, I would consider some acoustical foam.
[Side note… this acoustical treatment stuff is impossible to take down if you “glue” it to the wall the way most manufacturers suggest. I built some small frames to mount it in, and then hung the frames on my wall. Much more transportable, and didn’t demolish my wall.]
If you are not invested enough to put up actual acoustic treatment, consider the number of hard surfaces in the room. Sound, especially at the frequencies of the human voice, bounces off hard surfaces quite effectively. Cover hard surfaces with blankets, the thicker the better. At one point in my life I leaned the bed mattress up against the wall to better cover the wall’s hard surfaces. Cover as much of the walls as possible. This won’t make a great recording studio, but for podcasts it will help quite a bit.
Consider any noises that will occur in the room. Is the air conditioner blowing? Are there fans running? Get rid of any noise that you can. There is a point of diminishing returns with this. If you want to hear just how much sound there is, try recording a few seconds of yourself sitting silent in front of your microphone, and then use your favorite audio editor to insert a second of true digital silence into the middle of the recording. You’ll hear all of those background noises disappear and then come back.
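If you would rather put a number on that background noise than judge it by ear, here is a minimal Python sketch. It assumes the samples have already been loaded as floats between -1.0 and 1.0; the function name `rms_dbfs` and the 60 Hz "room tone" are made up for illustration and do not come from any particular audio library.

```python
import math

def rms_dbfs(samples):
    """RMS level of float samples, in dB relative to a full-scale value of 1.0."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # true digital silence has no measurable level
    return 10.0 * math.log10(mean_square)

# Hypothetical "silent" take: one second of faint 60 Hz hum at 48 kHz.
SAMPLE_RATE = 48000
room_tone = [0.01 * math.sin(2 * math.pi * 60 * n / SAMPLE_RATE)
             for n in range(SAMPLE_RATE)]

# The inserted second of real silence, for comparison.
digital_silence = [0.0] * SAMPLE_RATE

print(f"room tone:       {rms_dbfs(room_tone):.1f} dB")
print(f"digital silence: {rms_dbfs(digital_silence)} dB")
```

The lower (more negative) the room tone number, the quieter your room. Measuring before and after you switch off a fan is a quick way to see whether it was worth the trouble.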
Input
Something has to convert the acoustical sound pressure of your voice to an electric signal. This is technically known as a transducer, and more specifically known as a microphone. The analog electric signal from the microphone then needs to be converted to a digital signal (Analog-to-Digital Conversion or ADC for short) that can be captured by your computer (or other fancy digital recorder). These are two individual processes that need to occur. What I’m really talking about here is a microphone and a way to digitize it.
You will read lots of advice online about using a USB microphone as it does both the transduction and ADC work in the same device. Just plug it in and you are off. I disagree. I really disagree. More on that in the levels section later. Headset microphones and other really, really convenient things don’t generally sound so good. You will want to believe that it doesn’t sound “too bad”, but you will be wrong. You’ll come to realize that you were wrong after investing way too much time recording stuff that eventually you’ll wish you hadn’t. If you want professional sounding results, you will need somewhat professional gear.
When considering microphones there are really two classes: Dynamic and Condenser. Dynamic microphones are generally not “powered”. Traditionally they have a narrower frequency range (the range of frequencies in which they can accurately capture sound). Condenser microphones, on the other hand, generally require a small voltage (typically referred to as “Phantom Power”), and CAN, but do not always, have a wider frequency range. A few words of caution. I’ve heard people say that they prefer a Dynamic microphone because it doesn’t pick up some of the background noise in the room. First, get rid of the background noise. Second, this is similar to saying that you prefer using overly compressed JPGs so you can’t see the poorly lit photo. A Dynamic microphone generally cannot capture the extreme high end of human hearing very well. As a result it may seem like you are getting “less” noise. In reality you would get the same effect by using an equalizer and turning down most of the high end (what would equate to treble on your car stereo).
While there are always exceptions, generally vocals will sound more natural via a good condenser microphone. While top end professional condenser microphones can easily run into the thousands of dollars, there are a lot of decent condenser microphones in the $200 to $300 range. I’ve used lots of microphones in my days. For myself, I have settled on a Rode NT1-A. These days you can get an NT1-A for around $230.
Levels
Both Dynamic and Condenser microphones produce a “mic level” signal. This isn’t actually a strong enough signal for us to record with typical equipment. In order to get microphone levels to a standard level, you need to run the output of the microphone into a “Pre-Amp”. A microphone pre-amp will boost the signal to appropriate levels to be recorded, known as “line-level”.
One of the main issues with recordings I hear on the web involves a bad environment, coupled with low audio levels. The issue generally goes like this: You record something, but your audio level is too low. So you go into your favorite editor and “Normalize” the audio. This brings the audio level up to the highest possible without clipping the signal. But, to your dismay, it also brings up all types of hums, hisses, and other audio nasties that were previously not heard… What to do?
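To see why normalizing drags the noise up with it, here is a rough Python sketch of simple peak normalization. It is an illustration of the idea, not how any particular editor implements its “Normalize” command, and the function name `peak_normalize` and the sample values are made up.

```python
import math

def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the loudest one hits target_peak.

    Returns the scaled samples and the gain that was applied, in dB.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples), 0.0  # nothing to scale
    gain = target_peak / peak
    gain_db = 20.0 * math.log10(gain)
    # The same gain multiplies every sample, voice and noise alike.
    return [s * gain for s in samples], gain_db

# Hypothetical quiet recording: voice peaks at 0.1, hiss sits around 0.001.
quiet_take = [0.1, -0.05, 0.001]
louder, gain_db = peak_normalize(quiet_take)

print(f"gain applied: {gain_db:.1f} dB")
print(f"hiss sample went from {quiet_take[2]} to {louder[2]:.3f}")
```

The voice came up by 20 dB, and so did the hiss. The gap between the two is exactly what it was when you hit record.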
The answer to this dilemma is to record audio as “hot” as possible. You want the input level on your recording device (your computer, or other recorder) to be as close to max as possible without clipping. This is tricky because the human voice is a dynamic instrument. It is easy to get louder as you get excited.
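One way to check how “hot” a test take was is to look at its peak level and how much headroom is left before clipping. The Python sketch below assumes float samples where 1.0 is the loudest value the recorder can store; the helper names are made up for illustration.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to a full-scale value of 1.0."""
    peak = max(abs(s) for s in samples)
    return float("-inf") if peak == 0.0 else 20.0 * math.log10(peak)

def headroom_db(samples):
    """How many dB the input gain could rise before the loudest sample clips."""
    return -peak_dbfs(samples)

def is_clipping(samples, full_scale=1.0):
    """True if any sample has reached the maximum the recorder can store."""
    return any(abs(s) >= full_scale for s in samples)

# Hypothetical test take whose loudest sample is half of full scale.
test_take = [0.5, -0.25, 0.1]
print(f"headroom: {headroom_db(test_take):.1f} dB")
print(f"clipping: {is_clipping(test_take)}")
```

A take with a lot of headroom was recorded too quietly. A take that clips was recorded too loud, and no editor can bring back what was cut off.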
My advice is to use an analog compressor. There are lots of pre-amps that have an analog compressor built in. In previous years I’ve owned all types of pre-amp / compressors. In the end I have settled for a simple device by Joe Meek… You’ll have to excuse the site, they are audio engineers, not designers. It turns out they no longer make the version I have, but its closest cousin would be the threeQ.
What’s a Compressor
In audio engineering terms a compressor is an audio processor that limits the rate of increase for a sound once the sound crosses a specified level. There are a few parameters that compressors use. A “threshold” is a signal strength value. The compressor will begin operating on audio once the level is greater than the threshold. Compressors also have a “ratio”. A compression ratio of 4:1 says that for every 4dB the input rises above the threshold (dB stands for Decibel, and is a unit of signal strength in audio), the output is only allowed to rise by 1dB. Most compressors will also allow you to adjust two more parameters: “attack” and “release”. Compression attack is simply how fast you want the compressor to kick in once the threshold is crossed, and release is how quickly you would like the compressor to return to non-compression once the audio drops below the threshold. Not overly complicated, right?
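The threshold and ratio relationship can be written down in a few lines. This Python sketch models only the steady-state level mapping of an idealized compressor. It ignores attack and release entirely, and the function name and the default -20dB threshold are assumptions picked for illustration, not settings from any real unit.

```python
def compressed_level_db(input_db, threshold_db=-20.0, ratio=4.0):
    """Output level in dB for a given input level, ignoring attack and release."""
    if ratio < 1.0:
        raise ValueError("ratio must be at least 1 (1:1 means no compression)")
    if input_db <= threshold_db:
        return input_db  # below the threshold the signal passes untouched
    # Above the threshold, every `ratio` dB of input yields 1 dB of output.
    return threshold_db + (input_db - threshold_db) / ratio

# With a -20dB threshold and a 4:1 ratio:
for level in (-30.0, -20.0, -12.0, -4.0):
    print(f"in {level:6.1f} dB -> out {compressed_level_db(level):6.1f} dB")
```

An input 8dB over the threshold (-12dB) comes out only 2dB over it (-18dB), while anything below the threshold is left alone.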
Why use Compression
So earlier we talked about how getting the highest audio level without clipping the audio was paramount to creating a good recording. Human vocals are quite dynamic and it is easy for us to get loud enough to cause havoc on the recording system. So we compress the microphone output. This allows the audio level to stay a bit more contained, and allows us to turn up the input level on the recorder to get the best quality. Most professional vocal recordings you have ever heard are extremely compressed. You have to be a really skilled vocalist to convince an audio engineer that some compression isn’t required. When done right it will actually thicken up the vocal sound a bit.
This is why I believe you shouldn’t use a popular USB microphone. USB microphones contain the transducer (microphone), the pre-amp and the Analog-to-Digital converter. As such there is no way to compress the audio signal before it is converted to digital. This results in not getting the best level possible, and the eventual “Normalize” process which makes everything louder, including the noise floor. In theory you can use some audio software to compress audio after it has been recorded. In my experience this just makes recording video more painful. It requires you to rip the audio from the video, process the audio, and then re-sync the audio back to the video. Not to mention you have already solidified the noise floor.
The last word on levels
When you start working with multiple parts (microphones, pre-amps, compressors, etc.), it is important that you maintain an appropriate level throughout your signal chain. Allowing levels to be low on the pre-amp only to bump them up in the compressor will result in unwanted noise. Watch your meters and make sure you have an appropriate level throughout the signal chain.
Conversion
If you’ve followed my advice thus far, you still need something to do the analog-to-digital conversion and record the audio. I have been using Mac computers for a long time now. Over the years the analog line input on my Mac has always been suitable for podcast / screencast quality. I simply plug the output of my pre-amp / compressor into the line input on my Mac. If you are on a machine where this is inadequate (I remember having a PC with a cheap “Soundblaster” audio card that was awful!), I would consider buying a small audio interface. I’ve had good luck with M-Audio in the past, but honestly haven’t had the need for a sound card for a while (thanks Apple!). I let my Mac do the ADC, and record the audio.
Take Aways
So… record in a room that has as much acoustical treatment as you are comfortable with (from blankets to acoustical foam), the more the better. Use a decent condenser microphone and run it through a pre-amp that has a compressor. Make sure that your recording device is getting the highest signal possible without clipping the input.