Mixdown is a science and an art at the same time. It takes ages, if not an entire lifetime, to master.
I do not pretend that you’ll learn everything from this article or that it’ll make you a pro.
But, hopefully, it will help you better understand what makes a good mixdown, which tools are used, why, and more…
Without further ado, let’s dive into it with our first chapter: Gain Staging
I. Gain Staging
The volume balance between the elements of your track is both your starting point and the main thing to focus on to ensure a good (or excellent) mixdown. Sometimes the sweet spot in the relation between two elements is very narrow, a matter of 1 dB or less.
Note: An accurate monitoring system that you know well and can rely on is absolutely necessary in order to make informed choices.
If you take the time to analyse several tracks from the same genre, and do this across multiple genres, you’ll notice strong volume-balance characteristics, with the same elements always sitting in the same foreground or background position… To explain this better, we’ll visualize 6 different stages between the loudest and the quietest elements in the track, from the foreground (really close, in front) to the very far background.
1. Really Loud / Exceptional (SFX, Explosion…)
2. Foreground (Vox/Lead)
3. Middleground (Drums/Bass)
4. Support Background (Back Vox, Strings/Pads…)
5. Discrete Background (Reverbs, Delays…)
6. Very Far, barely audible (Textures, Drones…)
You should think of those as flags/stages when balancing your track: place your foreground around -14 dBFS and balance the rest around it.
Don’t be scared! dBFS is explained just below.
For genres like Techno, House, Hip Hop, EDM, etc., the Kick/Bass will be in the foreground alongside, or instead of, the vox.
For more acoustic/folk/experimental genres (reggae, jazz, rock, ambient, etc.), the Kick and percussive elements might be in the Middleground or even lower.
Nothing is set in stone! Of course, you can always break the “norms” if you feel it suits your track. This is just a scheme to help you achieve a structured volume balance.
dBFS (decibels relative to full scale) are the decibels of the digital world, the ones you read in your DAW. You’ll usually be able to read 2 types of dBFS levels: RMS (Root Mean Square) and Pk or TP (Peak or True Peak).
Roughly speaking: RMS is the average (integrated) level of a sound, and Peak is the highest level reached by a sound.
OK, I get dBFS, but why -14? It’s really low, isn’t it?
No, -14 is not low, and there are several reasons:
– You’ll add up many elements, so with your Kick peaking at -14 dBFS, the total level will already be higher once everything has been added.
– Plugins have optimal input levels, which for most are between -24 and -14 dBFS (usually between -18 and -14), meaning they’ll operate better if the input signal is in the appropriate range. Feel free to adjust the volume before and after specific plugins to match the optimal input level if needed. In a 32-bit float environment, especially with audio recorded in 32-bit, this should not affect the signal (negatively) at all.
– K-System metering, K-14 (and K-12 / K-20)
K-System is a metering system developed by the famous engineer Bob Katz, which takes into consideration the headroom needed for dynamics (and mastering). With K-14, you basically bring your “0 dB” down to -14 dBFS. Your 0 dB (actually -14 dBFS) is now the target for your average RMS level in the loudest parts of the track, and peaks may go a bit above it (max +8 dB, i.e. -6 dBFS, so you keep 6 dB of headroom for mastering). It would mean that the track has about 8 dB of Dynamic Range (DR), which is already a lot for most modern tracks/genres.
In other words, it lets you easily work at lower levels without losing your bearings.
Of course, you compensate a bit by raising the volume of your monitors or your audio interface output, since you are mixing lower in the DAW and the sound comes out of the Master lower.
K-System is easier to see and practice than to understand when explained like this, so I invite you to check the pictures in the “Illustrations” folder that comes with this course.
K-12 and K-20 work the same way, but at -12 dBFS and -20 dBFS respectively. K-12 is mostly used for spoken word/broadcast, and K-20 for more dynamic content like jazz, orchestral music, etc.
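To make the K-System arithmetic above a bit more concrete, here is a minimal sketch. The function names `to_k_scale` and `headroom_db` are made up for this illustration (they don't come from any metering plugin), and the levels are simply the K-14 example values from the text.

```python
def to_k_scale(level_dbfs, k=14):
    """Convert a dBFS reading to a K-System reading (K-14 by default)."""
    # On a K-14 meter the "0 dB" mark sits at -14 dBFS, so readings shift by k.
    return level_dbfs + k


def headroom_db(peak_dbfs):
    """Remaining headroom before digital full scale (0 dBFS)."""
    return 0.0 - peak_dbfs


rms_dbfs = -14.0   # average level in the loudest part of the track
peak_dbfs = -6.0   # highest peak

print(to_k_scale(rms_dbfs))    # 0.0 on the K-14 scale
print(to_k_scale(peak_dbfs))   # 8.0 on the K-14 scale
print(headroom_db(peak_dbfs))  # 6.0 dB left for mastering
```

For K-12 or K-20, you would just pass `k=12` or `k=20`.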
Dynamic Range (DR) is the range between the quietest and the loudest sounds of the track.
Crest Factor (CF) is the range between the RMS level and the highest peaks of the track.
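If you like to see things as numbers, here is a rough sketch of how Peak, RMS and Crest Factor could be computed from raw samples. It assumes samples are floats between -1.0 and 1.0 (where 1.0 is full scale, 0 dBFS), it measures sample peaks (not true peaks), and the function names are only illustrative.

```python
import math

FLOOR = 1e-12  # avoids log10(0) on pure digital silence


def peak_dbfs(samples):
    """Highest absolute sample value, expressed in dBFS."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(max(peak, FLOOR))


def rms_dbfs(samples):
    """Root Mean Square (average) level, expressed in dBFS."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(max(rms, FLOOR))


def crest_factor_db(samples):
    """Distance between the peak level and the RMS level, in dB."""
    return peak_dbfs(samples) - rms_dbfs(samples)


# One second of a full-scale 100 Hz sine wave at 48 kHz
sample_rate = 48000
sine = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(sample_rate)]

print(round(peak_dbfs(sine), 2))        # peak sits at full scale
print(round(rms_dbfs(sine), 2))         # RMS is about 3 dB below the peak
print(round(crest_factor_db(sine), 2))  # crest factor of a sine: about 3 dB
```

A drum loop with sharp transients would show a much larger crest factor than this sine, and a heavily limited master a smaller one than the unprocessed mix.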
II. Frequency Bands
Frequency sets the wavelength of a sound, and therefore its pitch. The lower the frequency in Hertz (Hz), the longer the wave and the lower the pitch, while higher frequencies give shorter waves with a higher pitch.
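As a quick illustration of the link between frequency and wave length, here is a small sketch. It assumes a speed of sound of 343 m/s (dry air at roughly 20 °C), a value that is not given in this course and that changes with temperature.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for dry air at about 20 °C


def wavelength_m(frequency_hz):
    """Wavelength in metres: speed of sound divided by frequency."""
    return SPEED_OF_SOUND / frequency_hz


for f in (20, 100, 1000, 20000):
    print(f"{f} Hz -> {wavelength_m(f):.3f} m")
```

A 20 Hz wave is many metres long while a 20 kHz wave is under two centimetres, which gives a feel for why lows and highs behave so differently in a room.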
High frequencies are more directional than lows (sub frequencies are almost omnidirectional). It means that the sound will change with the distance/position of the listener relative to the sound source, and will also change if there are obstacles (e.g. speaking into your hand => muffled sound).
Humans can hear from 20 Hz to 20,000 Hz (20 kHz), which are basically the lowest and highest audible sounds.
Note that 20-30 Hz is barely audible; you’ll feel it instead as a vibration all around and inside you (it takes a heavy monitoring system).
Note that with age, humans naturally lose a bit of hearing in the high frequencies: most people over 50 can’t hear above 14-16 kHz.
Finally, to make communication and work easier, we divide the frequencies into multiple Frequency Bands:
20 – 40 Hz : LowBass / SubBass
40 – 200 Hz : Bass
200 – 800 Hz : Low Mids
800 – 2000 Hz : Mids
2000 – 5000 Hz : High Mids
5000 – 8000 Hz : Highs
8000 – 20000 Hz : High Highs / High end
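The band list above can be turned into a tiny lookup helper, sketched below. The `band_name` function is hypothetical, and the choice to put a boundary value (e.g. exactly 200 Hz) in the upper band is an assumption, since the list itself doesn't say.

```python
# (low edge in Hz, high edge in Hz, band name), taken from the list above
BANDS = [
    (20, 40, "LowBass / SubBass"),
    (40, 200, "Bass"),
    (200, 800, "Low Mids"),
    (800, 2000, "Mids"),
    (2000, 5000, "High Mids"),
    (5000, 8000, "Highs"),
    (8000, 20000, "High Highs / High end"),
]


def band_name(frequency_hz):
    """Return the name of the band containing frequency_hz, or None if out of range."""
    for low, high, name in BANDS:
        # Assumption: a boundary value belongs to the upper band.
        if low <= frequency_hz < high:
            return name
    if frequency_hz == BANDS[-1][1]:
        return BANDS[-1][2]  # 20 kHz itself still counts as High end
    return None


print(band_name(60))    # Bass
print(band_name(2500))  # High Mids
```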
III. Detailed Steps of a Mixdown
- Set proper session settings (Sample Rate, Bit Depth, BPM…)
- Import audio files & Organize (rename/color code) the session
- Make or adjust your routing (creating buses/submixes…)
- Cleaning/Editing/Gain Riding if needed (Not covered here)
- Volume Balance & static Panning (Refer to part 1)
- Tonal Processing (Part 4)
- Dynamic Processing (Part 4)
- Spatio-temporal Processing: Reverb, delay… (Part 5)
- Adding color, character, perceived loudness… (Not covered here)
- Automation, if any
IV. Dynamic & Tonal processing
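a) Compressor
By definition, a compressor is an amplifier whose gain decreases as the input signal rises: once the input goes above a threshold, the output rises more slowly than the input.
Note: A limiter is a compressor with a very high ratio, so the output level remains (almost) constant above the threshold, regardless of the input signal volume.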
The usual settings of a compressor are the following:
- Attack: Defines how long it takes the compressor to start compressing once the threshold is exceeded. For fast percussive sounds, we’ll usually set a fast attack in order to catch the transients.
- Release: Defines how long it takes the compressor to stop compressing once the signal has fallen back below the threshold. As with Attack, for fast percussive sounds we’ll usually set a fast release, so the compressor has stopped compressing before the next hit triggers.
- Ratio: Defines how much the signal above the threshold is compressed. You can read it upside down: a ratio of 4:1 means that the part of the signal above the threshold is reduced by 3/4 (4 dB over the threshold comes out as 1 dB over).
- Knee: Defines how hard or soft the threshold point is. It is often represented as a curve. If the curve has a sharp angle (Hard Knee), anything just under the threshold won’t be compressed and everything above will, even by a few tenths of a dB… A softer knee gives a smoother, more transparent compression that sets in progressively rather than exactly when the signal goes above a given value. The curve will be smoother, more rounded.
- Makeup Gain & Output Gain: Sometimes both are present. If they are, use the output gain rather than the makeup gain (a subjective recommendation, I usually find it cleaner, but there are no rules). Be careful: makeup gain is sometimes automatic (or just enabled by default). Make sure you aren’t boosting the signal and are actually matching the volume when comparing before/after compression, so that you hear the real effect more precisely and aren’t biased by the short-term “louder is better” factor. You use the output gain to compensate for (some of) the volume you lost with compression, which brings the overall signal (including the quieter sounds) up.
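The threshold, ratio, knee and makeup settings above can be sketched as a static "input level in, output level out" curve. This is only an illustration working on levels in dB: it ignores attack and release entirely, the soft knee uses a common quadratic interpolation (not necessarily what your favourite plugin does), and the default values are arbitrary.

```python
def compress_db(level_db, threshold_db=-18.0, ratio=4.0, knee_db=0.0, makeup_db=0.0):
    """Static compressor curve: output level (dB) for a given input level (dB)."""
    over = level_db - threshold_db  # how far the input is above the threshold
    if knee_db > 0 and 2 * abs(over) <= knee_db:
        # Soft knee: blend progressively from "no compression" to "full ratio"
        out = level_db + (1 / ratio - 1) * (over + knee_db / 2) ** 2 / (2 * knee_db)
    elif over > 0:
        # Above the threshold (and the knee): only 1/ratio of the overshoot remains
        out = threshold_db + over / ratio
    else:
        # Below the threshold: signal untouched
        out = level_db
    return out + makeup_db


# 8 dB over the threshold at 4:1 -> 2 dB over the threshold (reduced by 3/4)
print(compress_db(-10.0))                # -16.0
# Same input with a 6 dB soft knee, then with a very high "limiter-like" ratio
print(compress_db(-10.0, knee_db=6.0))
print(compress_db(-10.0, ratio=1000.0))
```

To level-match before/after, you would set `makeup_db` so the processed signal sounds as loud as the bypassed one, rather than louder.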
4 basic types of compressors:
- Tube (Vari-Mu): slower, pretty warm, usually known for softening the transients (ex: Tube-Tech, Fairchild 670…). Further reading: History of the Teletronix LA-2A Leveling Amplifier
- FET (Field-Effect Transistor): faster, pretty transparent (ex: 1176)
- Opto (Optical): slower, good for vocals, bass… (ex: Teletronix LA-2A). Further reading: What is Optical Compression?
- VCA: fastest & most transparent (ex: SSL Buss Compressor). Further reading: Vintage King’s guide to VCA compressors
Some compressors work with 2 mono channels. In this case, plugins usually offer a “Stereo Link” option that lets you link the settings of both channels or not, as well as an “M/S” (Mid/Side) switch that changes the compressor’s Left/Right channels to Mid/Side.
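For the curious, here is a minimal sketch of what an M/S switch does under the hood: it converts Left/Right into Mid/Side, lets you process them separately, then converts back. The 1/2 scaling used here is one common convention and an assumption on my side; some tools scale differently.

```python
def encode_ms(left, right):
    """Left/Right -> Mid/Side for one stereo sample."""
    mid = (left + right) / 2   # what both channels have in common
    side = (left - right) / 2  # what differs between the channels
    return mid, side


def decode_ms(mid, side):
    """Mid/Side -> Left/Right (exact inverse of encode_ms)."""
    return mid + side, mid - side


# A centred (mono) sample has no Side content
print(encode_ms(0.5, 0.5))   # (0.5, 0.0)

# Round trip: encode, (process Mid and Side here), decode
mid, side = encode_ms(0.75, -0.25)
print(decode_ms(mid, side))  # (0.75, -0.25)
```

Compressing or equalizing only `side` before decoding is what lets you treat the width of a sound without touching its centre.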
Parallel compression: the process of duplicating your source, compressing the copy (often heavily), then blending the volume of both sources (original & compressed). It preserves the original signal while blending in some color or shaping the sound a bit with the compressed signal.
Sidechain: applying a compressor to a source (let’s say the bass) but having it triggered by a signal other than the bass (let’s say the kick). This compresses the Bass each time the Kick hits, to make more room for it and create the well-known “pumping effect”. But sidechain is also used for a lot of other applications, including in broadcast, for example to lower the music when someone talks (an effect called “Ducking”).
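The kick/bass sidechain idea can be sketched in a few lines. This is a simplified illustration that works on level readings in dB, with no attack or release smoothing; `ducking_gain_db`, the threshold, the ratio and the level values are all made up for the example.

```python
def ducking_gain_db(sidechain_db, threshold_db=-30.0, ratio=4.0):
    """Gain (0 dB or negative) to apply to the bass, driven by the kick's level."""
    over = sidechain_db - threshold_db
    if over <= 0:
        return 0.0  # kick is quiet: leave the bass alone
    # Gain reduction is the part of the overshoot the ratio removes
    return -(over - over / ratio)


# Level of the kick over time (the sidechain / key signal), in dBFS
kick_env_db = [-60.0, -6.0, -12.0, -24.0, -60.0]
# Steady bass at -14 dBFS
bass_db = [-14.0] * len(kick_env_db)

# The compressor sits on the bass but "listens" to the kick
ducked_bass_db = [b + ducking_gain_db(k) for b, k in zip(bass_db, kick_env_db)]
print(ducked_bass_db)  # [-14.0, -32.0, -27.5, -18.5, -14.0]
```

The bass dips when the kick hits and comes back as the kick fades, which is the "pumping" you hear. Broadcast ducking works the same way with the voice as the key signal and the music as the source.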
b) Equalizer (EQ)
Equalizers allow you to cut or boost different frequency bands with different slopes. Additionally, depending on the type of EQ, they are also more or less transparent/colored.
They allow you to shape the tonality of an audio signal. There are two “big families” of equalizers:
- Passive (can’t boost)
- Active (can boost)
And different types of curves: Shelving, Cut, Bell, Notch… with various settings.
Different EQ types:
- Graphic Equalizer: peaking bands, mostly seen in live sound as a room EQ.
- Parametric Equalizer, with more settings available: boost or cut, Q factor (slope sharpness), frequency…
Subtractive equalization and when to use it mostly?
I would recommend using subtractive rather than additive equalization most of the time. Even when you wish to boost something, remember that lowering the opposite can produce the effect of boosting what you wanted. You can also do a bit of both. Let’s say you want to make a guitar brighter: lower the low mids a bit and make sure you cut the lows under the fundamental (about 82 Hz for a guitar’s low E in standard tuning), then boost just a little around 2500 Hz to make it appear more in front/bright, and/or around 4000 Hz for a more airy result.
Additive equalization and when to use it mostly?
I would recommend boosting with EQs mostly to add color (because you are using a specific EQ that has a particular character), or in some cases when it is really needed to bring up some frequencies, but don’t go too hard (unless it’s really needed… sometimes it is, but…)
A few tips to master equalization faster:
1- Make sure you can trust what you hear (a reliable, flat, full-range monitoring system, properly placed and calibrated, in a calibrated or at least well-known room; use references and don’t work too long on something!).
2- When cutting, solo what you cut to be sure to remove only what you don’t want.
3- Equalize the elements in context, not too much in solo, otherwise you’ll have to keep tweaking them…
4- Don’t use crappy EQs: your DAW’s stock EQ is probably better than most free EQ plugins out there (not all; there is a nice free one that is also a dynamic EQ) if you can’t afford solutions like FabFilter Pro-Q 3, which is arguably the best digital EQ for versatility, ease of use and clean processing, without being a CPU hog at all!
5- Use the Mid/Side function when you can and may need it. It’s a lot better than fixed spreaders because you can gradually shape and spread as you wish.
Best-known EQ plugins and applications:
– Maag EQ4: Air Band to bring this specific airy brightness to your sound.
– Brainworx bx_digital V3: one of the best-known, high-quality Mid/Side (M/S) EQs.
– DAW stock EQs (they usually remain good & transparent).
c) De-esser
Reduces sibilance (the “Ssss” sounds). Basically, it’s an EQ boosting a frequency range (which you can usually adjust) in front of a compressor, so that this frequency band gets lowered by the compressor.
d) Transient Designer
An effect that shapes the envelope of sounds by boosting/attenuating the attack or the sustain of an audio signal. Mostly used on percussive elements.
e) Clipper
As you would guess from the name, these effects clip an audio signal: anything above a set ceiling is cut off.
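Here is a minimal sketch of what clipping does to a single sample, in two flavours. Samples are assumed to be floats where 1.0 is full scale. The tanh curve is just one possible soft-clipping shape, chosen for the example; real clippers use all sorts of curves.

```python
import math


def hard_clip(sample, ceiling=1.0):
    """Anything beyond the ceiling is simply flattened."""
    return max(-ceiling, min(ceiling, sample))


def soft_clip(sample, ceiling=1.0):
    """Rounds the signal off progressively as it approaches the ceiling."""
    return ceiling * math.tanh(sample / ceiling)


for x in (0.3, 0.9, 1.5):
    print(x, hard_clip(x), round(soft_clip(x), 3))
```

The hard clipper leaves everything under the ceiling untouched, while the soft clipper already bends quieter samples a little, which is why it sounds smoother but colors the signal more.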
f) Exciter
Excites transients and generates harmonics on an audio signal. It’s a good way to give back some presence without touching the volume too much. It often works well in parallel processing and is widely used in various ways in the industry.
V. Spatio-temporal processing
a) Reverb
Reverberation is the persistence of an audio signal in a room/environment after the source signal has stopped.
The first setting to take into consideration is the decay/reverb time, which can be considered one of the room profile/size settings.
The Early Reflections (ER) setting lets you set the level of the first reflections of the sound in its environment.
Depending on the reverb type, you will often have access to the following settings:
– Room/Reverb type (Hall, large/medium/small Room, Plate, Spring…)
– Diffusion/Density amount
– Reverb time/Decay
– Pre-Delay (delays the arrival of the reverb; beware, it can cause some syncing issues)
Convolution reverbs are another type of reverb, where you can usually load the IR (Impulse Response) of a room to emulate that particular room. Altiverb is one of the best known, for example.
Additionally, you need to control the input/output of the reverb, usually using an EQ to cut the bass, shape the highs, etc.
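To give an idea of what a convolution reverb does with an Impulse Response, here is a bare-bones sketch. The IR below is a made-up toy (three reflections getting quieter), not a real room, and real plugins use much faster FFT-based methods than this direct loop.

```python
def convolve(dry, impulse_response):
    """Direct convolution: every dry sample triggers a scaled copy of the IR."""
    wet = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            wet[i + j] += x * h
    return wet


# Toy IR: direct sound, then two quieter reflections
toy_ir = [1.0, 0.0, 0.5, 0.0, 0.25]

# A single click followed by silence
click = [1.0, 0.0, 0.0]

print(convolve(click, toy_ir))  # [1.0, 0.0, 0.5, 0.0, 0.25, 0.0, 0.0]
```

Feed it a click and you get the IR back, which is exactly how IRs are captured in the first place: record how a room answers to a very short sound.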
b) Delay
A spatio-temporal effect emulating the echo of a sound.
A very short delay time can be heard as a phaser or chorus; we are then talking about phase shifting.
Slapback delay: a clearly audible delay effect with very little feedback. It makes a voice more ambient without pushing it back.
It’s an effect from the ’50s.
Doubler delay: a delay effect consisting of a single repetition, about 30 to 80 milliseconds (ms) late. It’s suited to vocals or additional instruments.
Echo: a delay effect consisting of one or multiple audible repetitions.
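The delay types above mostly differ in delay time and feedback, which a simple feedback delay line can illustrate. This is a sketch under my own assumptions: the `delay` function and its parameters are hypothetical, and the output is cut to the length of the input, so the echo tail gets truncated.

```python
def delay(dry, sample_rate, delay_ms, feedback=0.0, mix=0.5):
    """Feedback delay: each repetition comes back 'feedback' times quieter."""
    delay_samples = int(round(sample_rate * delay_ms / 1000.0))
    if delay_samples < 1:
        raise ValueError("delay time is shorter than one sample")
    line = [0.0] * delay_samples  # circular buffer holding the delayed signal
    pos = 0
    out = []
    for x in dry:
        delayed = line[pos]                 # what went in delay_samples ago
        line[pos] = x + feedback * delayed  # feed part of it back for more repeats
        pos = (pos + 1) % delay_samples
        out.append(x + mix * delayed)       # blend the echo with the dry signal
    return out


# A click at a tiny 1 kHz sample rate so the numbers stay readable
click = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# One single repetition (doubler/slapback style: no feedback)
print(delay(click, 1000, 2, feedback=0.0, mix=1.0))  # [1.0, 0.0, 1.0, 0.0, 0.0, 0.0]

# Several repetitions fading out (echo style)
print(delay(click, 1000, 2, feedback=0.5, mix=1.0))  # [1.0, 0.0, 1.0, 0.0, 0.5, 0.0]
```

With a real sample rate, a doubler would use `delay_ms` somewhere between 30 and 80 with `feedback=0.0`, and an echo a longer time with some feedback.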
VI. Mixing Tips
- Prefer subtractive to additive processing
Numbers are just some “markers” to keep in mind:
- 40 Hz = real bass | 60 Hz = punch | 80 Hz = sub feel
- 120-160 Hz & 450-550 Hz = usually the punch & attack of the kick
- 2500 Hz = presence | 4000 Hz = brightness
- High Highs = Hi-fi / Clarity / Air
- Reverb to emulate interiors, delay to emulate exteriors
- Techno/EDM/Hip Hop/Pop…: begin your balance with the kick/bass
- Jazz/Orchestral/Acoustic/Experimental…: begin your balance with the main element
- 3 elements particularly affect the volume balance: arrangement, genre and the artist’s musical aesthetic.
- All instruments have some low mids, watch them to avoid a muddy mix!
- Boost your kick’s fundamental frequency at octave 1 a bit to reinforce the harmony in the track
- Attenuating the low mids (200-400 Hz) can help make some elements brighter (vox, synth, guit…)
- Combine your Low/High Cuts with a Low Shelf to smooth the cutoff point a bit, especially with steep slopes.
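For the tip about the kick’s fundamental at octave 1, here is a small sketch to find the frequency of a note. It assumes equal temperament with A4 = 440 Hz and scientific pitch notation (C4 = middle C = MIDI note 60). Beware that some DAWs number octaves differently, so check how yours labels them.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_frequency(name, octave, a4_hz=440.0):
    """Frequency in Hz of a note, e.g. note_frequency("A", 1)."""
    # Assumption: scientific pitch notation, where C4 is MIDI note 60 and A4 is 69
    midi_note = 12 * (octave + 1) + NOTE_NAMES.index(name)
    return a4_hz * 2 ** ((midi_note - 69) / 12)


# Kick tuned to the key of the track, at octave 1
for name in ("E", "F", "G", "A"):
    print(f"{name}1 = {note_frequency(name, 1):.1f} Hz")
```

So for a track in A, the kick’s fundamental to reinforce would sit at 55 Hz under these assumptions.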
Mixdown Critical Feedback Guide (MCFG): a way to identify the strengths & weaknesses of a track.
A one-page .pdf with key points/questions that you can use to ensure accurate and complete feedback/analysis.
By following this little methodology, you should be able to point out the strengths & weaknesses of a mixdown pretty quickly.
This little guide is available for free as a direct download.