A well-balanced sound track mix is difficult to achieve. It should sound good, speech and dialog should be clear and understandable, and it should contain sounds that convey a certain atmosphere and sometimes even produce dramatic effects. Dynamics are to be employed in such a way that the soft twittering of a bird can be clearly distinguished from a thunderstorm. Unfortunately, there are quite a number of examples with questionable loudness and dynamics. A frequent source of irritation is TV sound, which differs markedly in loudness between individual TV stations and, especially, between TV programs and commercials, in the sense of "the loudest wins". You may have been in a situation where you wanted to mix your own uncompressed music samples with canned music and it turned out that your own recordings were much too soft, even though the recording levels had been set to zero dB, which corresponds to peak level.
The K-System helps you come to terms with such problems and allows you to customize your sound track dynamics. The K-System offers the following advantages:
Loudness-based audio level measurement (RMS, 600 ms time window) helps to achieve balanced sound track audio levels.
High headroom offers much freedom for dynamic range.
A defined reference loudness allows you to evaluate how the sound track sounds in a reproducible way.
When AV shows are mixed in accordance with the K-System virtually no loudness jumps occur when different shows are presented.
The K-System was presented in 2000 by the US sound engineer Bob Katz. His ideas have found their way into the ITU-R BS.1770 standard for loudness measurement. Bob Katz's website can be found at http://www.digido.com.
In order to understand the K-System we need to look at a few basics:
The purpose of an AV show and the location where it is presented, with its ambient noise, both play a role for the dynamic range used. A cinema allows a much greater dynamic range than an entertainment program in a car or airplane, as the ambient noise in a cinema is much lower. The room size also plays a role for the playback of dynamic audio programs. The graph on the left shows listeners' tolerance of dynamic range in a variety of playback environments.
If the peak level of different audio programs is normalized to 0 dB FS (Full Scale), i.e. to peak level, audio programs with a lower dynamic range are louder than those with a higher dynamic range. This is illustrated in the graph on the right. This is also the reason why highly compressed adverts are so much louder than uncompressed speech recordings. Recordings differ in their loudness, and level control based on peak level measurement does not take this into account.
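The effect of peak normalization can be illustrated with a minimal Python sketch. It is not part of Wings Vioso RX; the function names, the 48 kHz sample rate and the hard-clipped sine (used here as a crude stand-in for heavy compression) are assumptions made for illustration only.

```python
import math

def peak_and_rms_dbfs(samples):
    """Return (peak, RMS) in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak), 20 * math.log10(rms)

def normalize_to_peak(samples):
    """Scale the samples so that the largest absolute value is 1.0 (0 dB FS)."""
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]

N = 48000  # one second at an assumed sample rate of 48 kHz
sine = [0.5 * math.sin(2 * math.pi * 1000 * n / N) for n in range(N)]
# Hard clipping reduces the dynamic range (assumed stand-in for compression).
clipped = [max(-0.1, min(0.1, s)) for s in sine]

sine_peak_db, sine_rms_db = peak_and_rms_dbfs(normalize_to_peak(sine))
clip_peak_db, clip_rms_db = peak_and_rms_dbfs(normalize_to_peak(clipped))

print(f"sine:    peak {sine_peak_db:.2f} dB FS, RMS {sine_rms_db:.2f} dB FS")
print(f"clipped: peak {clip_peak_db:.2f} dB FS, RMS {clip_rms_db:.2f} dB FS")
```

Both signals end up with the same peak level of 0 dB FS, but the clipped signal has the higher RMS level and is therefore perceived as louder.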
Note: The graph above illustrates that the tolerance does not only depend on the playback environment but also on the age of the listeners. For people of 50+ the tolerance is decreasing. A sound track with high dynamic range may cause problems for an older audience, as they do not understand the soft dialogs and consider the loud effects as unpleasant and sometimes even painful.
The K-System uses three different meter scales for level measurement with different headroom:
K-20 ...with 20 dB headroom for a high dynamic range in large cinemas; 0 dB on this K-scale corresponds to -20 dB FS
K-14 ...with 14 dB headroom for home theaters and living rooms; 0 dB on this K-scale corresponds to -14 dB FS
K-12 ...with 12 dB headroom for broadcasting and louder environments, such as department stores, fairs, etc.; 0 dB on this K-scale corresponds to -12 dB FS
On the K-System meter scales the 0 dB points differ, thereby allowing for different headroom. Moreover, audio level control is not based on the peak level but on the RMS level. The RMS level in the K-System is the average value of a time window of 600 ms with defined meter increase and decrease rates which means that it has a strong relation to the perceived loudness. The peak level only becomes relevant when it exceeds zero dB FS and that's why an overload meter will do. According to K-System standards you base the level control of your soundtrack on the loudness. This avoids volume jumps and helps to achieve a lot of freedom for accented effects due to the headroom.
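The relation between dB FS and the K-scales, together with a 600 ms RMS window, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the metering code of Wings Vioso RX: it uses a plain mathematical RMS relative to a full scale of 1.0, non-overlapping blocks, and no meter increase and decrease rates. All names are hypothetical.

```python
import math

# Headroom of each K-scale in dB, as listed above.
K_HEADROOM_DB = {"K-20": 20.0, "K-14": 14.0, "K-12": 12.0}

def dbfs_to_k(level_dbfs, scale):
    """Convert a level in dB FS to a reading on the given K-scale."""
    return level_dbfs + K_HEADROOM_DB[scale]

def windowed_rms_dbfs(samples, sample_rate, window_s=0.6):
    """RMS level in dB FS for each consecutive, non-overlapping 600 ms block."""
    block = int(round(window_s * sample_rate))
    levels = []
    for start in range(0, len(samples) - block + 1, block):
        chunk = samples[start:start + block]
        mean_square = sum(s * s for s in chunk) / block
        levels.append(10 * math.log10(mean_square) if mean_square > 0 else float("-inf"))
    return levels

# Test tone: 1.2 s sine whose RMS level is -14 dB FS (assumed 48 kHz sample rate).
rate = 48000
amplitude = math.sqrt(2) * 10 ** (-14 / 20)
tone = [amplitude * math.sin(2 * math.pi * 1000 * n / rate) for n in range(int(1.2 * rate))]

levels = windowed_rms_dbfs(tone, rate)
for level in levels:
    print(f"{level:.2f} dB FS = {dbfs_to_k(level, 'K-14'):+.2f} dB on K-14")
```

A tone with an RMS level of -14 dB FS reads 0 dB on the K-14 scale and +6 dB on the K-20 scale.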
The K-System is based on defined playback and listening conditions. The reference loudness of zero dB on the K-scale is defined at 83 dBC SPL (Sound Pressure Level). The reason is that human perception of loudness depends on frequency. We hear low and high frequencies at a lower loudness than medium frequencies. This tendency is stronger at lower sound pressure levels than at higher ones, as can be seen in the diagram "Listening curves of equal loudness" below. The curves show the sound pressure level that is necessary at a given frequency for it to be perceived as equally loud as a 1000 Hz tone.
A number of studies were conducted looking at various audio samples to investigate the sound pressure at which different frequencies were perceived as balanced. For the majority of people tested this was at a value of 85 dB SPL. In order to create as uniform and balanced listening conditions as possible the K-System reference loudness was defined as follows:
83 dBC SPL = 0 dB RMS on the K-scale
For loudness calibration at the listening location, play pink noise at a level of zero dB RMS on the K-System meter scale via one loudspeaker. Measure the sound level with a sound level meter set to "C" weighting and "Slow" response. At the mixer or amplifier, adjust the volume in such a way that the sound level meter reads 83 dBC SPL at the listening location. Repeat this measurement for every loudspeaker. For Wings Vioso RX we have prepared the project K-System calibration, which allows you to calibrate your playback system. You should mark your mixer or amplifier volume control with K-20, K-14 and K-12. See Calibrating the monitoring system according to K-System standards.
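The arithmetic behind the calibration can be written down as a small Python sketch. The function name and the sound level meter readings are hypothetical and only serve as an illustration; real readings come from your own measurement.

```python
REFERENCE_SPL_DBC = 83.0  # 0 dB RMS on the K-scale, as defined above

def monitor_trim_db(measured_spl_dbc, target_spl_dbc=REFERENCE_SPL_DBC):
    """Gain change in dB needed at the mixer or amplifier so that pink noise
    at 0 dB RMS on the K-scale reads the target SPL. Positive means turn up."""
    return target_spl_dbc - measured_spl_dbc

# Hypothetical readings, taken for each loudspeaker separately.
measurements = {"left": 80.5, "right": 84.0}
trims = {name: monitor_trim_db(spl) for name, spl in measurements.items()}

for name, trim in trims.items():
    print(f"{name}: adjust by {trim:+.1f} dB")
```

In this example the left loudspeaker has to be turned up by 2.5 dB and the right one turned down by 1 dB.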
Note: In addition to the defined loudness level the quality of the monitoring system and the listening room also play an extremely important role. The audio system should sound neutral and be distortion-resistant. For stereo sound the room should have a minimum size of 30 m² and at least 40 m² for surround sound and it should meet certain acoustic requirements. Listening rooms in the cinema area are Dolby- or THX-certified and often have a size of more than one hundred square meters. VDT (Association of German Sound Engineers) has prepared a PDF file named "Hörbedingungen und Wiedergabeanordnungen für Mehrkanal-Stereofonie in Studio und Heim" (Listening conditions and reproduction arrangements for Multichannel Stereophony in studios and homes). See www.tonmeister.de.
Near field monitoring reduces room influences and is well suited for analyzing difficult problems. For evaluating dynamics and soundtrack loudness, however, it is only a stopgap if the available room has acoustic problems. It is conceivable, though, to prepare the soundtrack under less than ideal conditions and do the finishing mix in a certified listening room.
You can activate the K-System in the Global Options under Audio level and have the choice between K-12, K-14 and K-20. After this the following things change:
The audio level display has extra bars for the RMS level which are labeled in accordance with the chosen K-system meter scale.
The peak level display is still available but displayed in gray. If 0 dB FS is exceeded, an overload warning in the form of yellow bars appears.
When audio files are dragged from the Media Pool and dropped into the timeline, the volume level of objects is lowered in accordance with the chosen K-system meter scale by 12 dB, 14 dB or 20 dB. This presetting ensures that the sound is not too loud when played for the first time.
The volume for pre-listening via tab Preview and for importing samples from audio CD is also lowered in accordance with the chosen K-system meter scale by 12 dB, 14 dB or 20 dB, otherwise it would be far too loud.
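To get a feeling for what lowering the volume by 12 dB, 14 dB or 20 dB means for the sample values, the attenuation can be converted into a linear gain factor. The following Python sketch only illustrates the standard dB conversion; how Wings Vioso RX applies the reduction internally is not described here, and the names are hypothetical.

```python
# Attenuation applied for each K-scale in dB, as described above.
K_ATTENUATION_DB = {"K-12": 12.0, "K-14": 14.0, "K-20": 20.0}

def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)

def preset_gain(scale):
    """Linear amplitude factor that corresponds to the K-scale attenuation."""
    return db_to_linear(-K_ATTENUATION_DB[scale])

for scale in ("K-12", "K-14", "K-20"):
    print(f"{scale}: amplitude factor {preset_gain(scale):.3f}")
```

With K-20, for example, the amplitude is reduced to one tenth of its original value.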
Note the following when using the K-System:
If you have set Wings Vioso RX for the K-System and want to calibrate your monitoring system, make sure that the volume controls in Windows and in any other relevant sound card dialogs are set to maximum. After calibration the volume controls must keep their settings. The audio mixer input gain must also no longer be changed after calibration.
Volume control via RMS level and K-scale in Wings Vioso RX provides a good initial basis for judging loudness. The final decision in evaluating loudness and dynamics of your sound track should, however, be best left to your ears.
The following recommendations for level control of various sound track elements should just give you an idea on where to start:
Speech ... -18 dB to -12 dB RMS
Soft music ... -20 dB to -12 dB RMS
Loud music ... -10 dB to +2 dB RMS
Loud sound effects ... 0 dB to +14 dB RMS
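If you want to check measured RMS readings against these starting points, a small lookup is sufficient. The following Python sketch uses the ranges listed above; the function and key names are hypothetical, and the result is only a hint, not a substitute for listening.

```python
# Recommended RMS ranges in dB (lower limit, upper limit), taken from the list above.
RECOMMENDED_RMS_DB = {
    "speech": (-18.0, -12.0),
    "soft music": (-20.0, -12.0),
    "loud music": (-10.0, 2.0),
    "loud sound effects": (0.0, 14.0),
}

def check_level(element, rms_db):
    """Return 'below', 'within' or 'above' relative to the recommended range."""
    low, high = RECOMMENDED_RMS_DB[element]
    if rms_db < low:
        return "below"
    if rms_db > high:
        return "above"
    return "within"

print(check_level("speech", -15.0))
print(check_level("loud music", 4.0))
```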
If you export your show as video or EXE presentation, indicate usage of the K-System in the name, e.g. Adventure Brazil K-14.wmv. When the show is presented at an event the presenter knows immediately what volume to set.
For stereo, and depending on the signal correlation, the defined reference loudness of 83 dB SPL per loudspeaker results in about 86 dB SPL or more. According to the legal regulations such a listening location is regarded as a noisy workplace. At this volume you should only mix for about two to three hours and then take a longer break.
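The figure of about 86 dB SPL follows from adding the levels of two loudspeakers. The following Python sketch shows the two idealized limiting cases: uncorrelated signals, whose powers add, and identical in-phase signals, whose sound pressures add. Real program material usually lies in between. The function name is hypothetical.

```python
import math

def sum_spl_db(levels_db, coherent=False):
    """Combine the SPL values of several sources.

    coherent=False: uncorrelated signals, powers add.
    coherent=True:  identical in-phase signals, sound pressures add.
    """
    if coherent:
        return 20 * math.log10(sum(10 ** (level / 20) for level in levels_db))
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels_db))

uncorrelated = sum_spl_db([83.0, 83.0])
correlated = sum_spl_db([83.0, 83.0], coherent=True)

print(f"uncorrelated: {uncorrelated:.1f} dB SPL")  # about 86 dB SPL
print(f"correlated:   {correlated:.1f} dB SPL")    # about 89 dB SPL
```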
If the listening room is too small or 83 dB SPL is too loud for you in the long run, you should choose a lower listening volume. Mark the resulting volume setting at the amplifier or mixer. Use a sound level meter to measure by how much your personal reference loudness deviates from 83 dB SPL so that you will be able to reproducibly set and check this volume in the future. In such cases you should check the final sound track in a larger, certified listening room and correct it if necessary. For permanent installations we recommend optimizing the sound track at the corresponding location.
Using a K-20 meter scale may result in peak levels of up to 103 dB SPL (83 dB + 20 dB headroom). Your loudspeaker system should be able to play sound at these levels without any distortions. Since the sound level of loudspeakers is measured at a distance of 1 m the specified peak values for playback in large rooms must be clearly above 103 dB SPL.
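How far above 103 dB SPL the loudspeaker specification has to be can be roughly estimated with the following Python sketch. It assumes an idealized point source in a free field, where the level drops by 20·log10(d) dB relative to the 1 m value. In real rooms reflections reduce this drop, so the result is more of an upper estimate. The function name and the listening distance of 4 m are hypothetical.

```python
import math

REFERENCE_SPL_DB = 83.0  # 0 dB on the K-scale
K20_HEADROOM_DB = 20.0

def required_spl_at_1m(spl_at_listener_db, distance_m):
    """SPL the loudspeaker must deliver at 1 m to reach the given SPL at the
    listener (free-field point source assumption)."""
    return spl_at_listener_db + 20 * math.log10(distance_m)

peak_at_listener = REFERENCE_SPL_DB + K20_HEADROOM_DB  # 103 dB SPL
needed = required_spl_at_1m(peak_at_listener, 4.0)

print(f"Peak at listener: {peak_at_listener:.0f} dB SPL")
print(f"Required at 1 m for 4 m distance: {needed:.1f} dB SPL")
```

Under these assumptions a listening distance of 4 m calls for about 115 dB SPL at 1 m.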
Using K-20 and a listening volume of 83 dB SPL, for example, will clearly make the noise of low-quality or faulty systems audible.
Since high headroom does not allow you to make full use of the quantization range, you should choose 24 bits for the output if your sound card supports this setting. In theory you gain 48 dB with 24-bit output as opposed to 16-bit output, as 0 dB FS at 16 bits corresponds to minus 48 dB FS at 24-bit quantization. However, only very good 24-bit DA converters achieve a signal-to-noise ratio of about 115 to 120 dB(A), which corresponds to a gain of 20 to 25 dB compared to good 16-bit DA converters.
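The theoretical 48 dB follow from the rule that each additional bit doubles the number of quantization steps, which corresponds to about 6 dB. The following Python sketch reproduces this figure and also estimates how many bits the headroom of each K-scale occupies. It only covers the ideal quantization arithmetic, not the behavior of real converters; the names are hypothetical.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit

def range_db(bits):
    """Theoretical range of an ideal quantizer with the given word length in dB."""
    return bits * DB_PER_BIT

def bits_used_by_headroom(headroom_db):
    """Number of bits that lie above 0 dB on the K-scale."""
    return headroom_db / DB_PER_BIT

gain_24_vs_16 = range_db(24) - range_db(16)
print(f"24 bit vs. 16 bit: {gain_24_vs_16:.2f} dB")

for scale, headroom in (("K-12", 12.0), ("K-14", 14.0), ("K-20", 20.0)):
    print(f"{scale}: headroom occupies about {bits_used_by_headroom(headroom):.1f} bits")
```

With K-20, for example, a little more than 3 bits are reserved for the headroom, which is why the additional 8 bits of a 24-bit output are useful.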