Digital Radio Is Coming, Pt.2 - March 2009

Feature: Digital Radio Is Coming, Pt.2 by Alan Hughes

This is only a preview of the March 2009 issue of Silicon Chip.


Digital Radio Part 2: AAC+ encoders & decoders

by Alan Hughes

Digital Radio broadcasts will use Advanced Audio Coding (AAC+) of the digital audio signals. AAC+ is a complex compression process which greatly reduces the amount of RF bandwidth and transmission power necessary to broadcast a high quality signal.

When we introduced Digital Radio last month, we explained how this completely new technology will commence in Australia in just a few weeks. The panel below summarises what is required to broadcast high quality audio signals.

Analog hifi systems and uncompressed digital systems (eg, compact disc) aim to produce a close replica of the original sound but this uses wide bandwidth and can require high transmission power. The AAC+ compression process aims to transmit what your ears and brain perceive but employs much reduced bandwidth and transmission power.

Your ear measures (at the ear drum):
1. Total acoustic power (volume)
2. Fundamental frequency
3. The power of each harmonic frequency
4. The sound in this ear is less powerful at high frequencies than in the other ear
5. The sound in this ear is delayed compared to the other ear at lower frequencies
6. Reverberation or echo delay

Analog systems and uncompressed digital systems aim to produce an exact replica of the original sound. This uses large bandwidth and can require high power.

Your brain hears:
1. Loudness: soft or loud
2. Pitch: low or high (the note played)
3. Timbre: which instrument is being played
4. Sound direction: left, centre, right, above, below, in front or behind
5. Distance of the sound source: close or distant

AAC and AAC+ systems transmit what your brain hears and then recreate the sound needed to produce the same result in your brain. Much reduced bandwidth and lower transmission power are required.

Digital Radio: The System

In an AAC+ encoder, the following information is sent to the receiver:
• A loudness signal
• Pitch/timbre signal
• Spectral Band Replication signal
• Parametric Stereo signal.

Let’s take a general view of how these signals are produced and then we will have a more detailed look at the AAC+ encoder.

At the bottom of the diagram (Fig.1) is an 88-key music keyboard but with grey “extensions” added at either end. The grey keys do not exist but show the frequencies that such keys would produce if they did. The top key of an 88-key keyboard is at about 4.2kHz, so in music all frequencies above about 4kHz are harmonics of the key used. The power of these frequencies is used to control the SBR signal.

You will also notice that the frequency difference between keys at the low frequencies is much less than at high frequencies. This follows the brain’s ability to detect a change in pitch. AAC uses a “comb” filter and the bandwidth of each of the “teeth” (ie, individual filters) is not equal; it is much wider at the higher frequencies.

Fig.1: the comb filter in an AAC+ encoder has 132 centre frequencies and these are used to generate the pitch/timbre information.

Loudness Signal

The average ear is most sensitive at about 4kHz and least sensitive at the extremes of the audio band. So to measure the loudness of the sound, the comb filter characteristic in Fig.1 is used.

The sound power entering the microphone can vary from just audible to the threshold of pain. This is a dynamic range of 1:1,000,000,000,000 or 120dB. It is accepted that if the sound is twice as loud, the measured power increase is +10dB or 10 times the original power. It is also true that if the sound is half as loud, the power is one tenth of the original or -10dB.

The human ear has a logarithmic response to sound power. This allows us to hear sounds from the faint rustling of leaves to the roar of jet engines. The digital audio signal is handled the same way, ie, logarithmically. The end result is the loudness signal which is called the scaling factor.

Fig.2: the huge dynamic range of audio signals must be compressed before being transmitted. The compression information becomes the loudness signal or scaling factor produced by the AAC+ decoder. The incoming sound power is shown referenced to 1W but the exact value depends on the microphone amplifier’s gain and the setting of the listener’s volume control.

Pitch/Timbre Signal

Another characteristic of our hearing is “masking”. This is where a strong single frequency is heard but softer frequencies in a frequency band either side of this single frequency cannot be heard.

The comb filtering effect is created by taking samples of the signal at the centre frequency of the “teeth”. This sampling will cause additional alias signals to be generated. To use this effect, the digital signal is analysed by a Fast Fourier Transform (FFT), which converts a waveshape into the frequencies used to create this shape.

The difference in level between the FFT signal and the level from the comb filter is calculated and sent to the decoder. Any signals where the output of the FFT is less than the comb filter output will not be sent, ie, they are discarded on the assumption that they will be masked and could not be heard. The resulting quantised filter samples are then sent to the encoder.

Spectral Band Replication (SBR)

The frequency range above 11kHz has less effect on the perceived quality of the sound and requires a lot of data to reproduce. As a result, the sample rate has been halved to 24 kilosamples/second. This is done by averaging every pair of 48kHz samples. This will reduce the maximum frequency to 11.3kHz. The missing harmonics in the sound will be simulated in the decoder.

Unless the sound is a pure tone, the frequencies above 11kHz are harmonics of lower frequencies. The encoder measures the level of the sound frequencies above 11kHz and sends it (within the SBR signal) to the decoder to control the level of the regenerated harmonics above 11kHz. If the sound is random, then random high frequencies will be used instead of harmonics.

Parametric Stereo

Most people prefer stereo sound of reasonable quality to mono sound. At high rates of compression, the addition of direction greatly improves the perceived quality of the sound. But rather than transmitting two high-quality channels of sound to create stereo, it is more efficient to transmit a mono sound signal and add direction information.

Direction information consists of time differences between the left and right ears at mid-frequencies and strength differences at higher frequencies. This is due to the sound having to travel around the head. So when a transient occurs, the time difference is measured between the left and right channels and this is encoded along with signal volume differences at high frequencies.

Normally only the strongest signals are transmitted and as a result, the subtle sound reflections which cause reverberation are removed. These can be restored by measuring the level of reverberant sound so that it can be recreated in the decoder. The fact that the sound has been steered in the right direction will also affect the recreated reverberant sound, making it more realistic.

AAC+ encoder details

Fig.3 shows the schematic of an AAC+ encoder.

Fig.3: an AAC+ encoder takes 48kHz sampled data from the studio, averages it to 24ks/s, removes info below 10Hz, averages it to mono, passes it to an FFT and generates masking info and SBR info. The comb filter produces the loudness weighting info. Also added is direction info for stereo and parametric 5.1 info.

The 48 kilosamples/second digital audio from the studio has every pair of samples per channel averaged to reduce the sample rate to 24ks/s. The encoder will only use the most significant 16 bits of each sample. The signal also has any frequency below 10Hz, including DC, removed. Otherwise it’s hard to keep the decoder synchronised. The main digital audio signal has all time-coincident samples averaged to produce a monophonic signal.

A comb filter samples the audio at the frequency of each “tooth” of the comb as shown in the AAC+ comb filter diagram (Fig.1). The level of each filter is modified by the loss shown on the vertical axis of the filter diagram and the results are added together. This is the loudness signal. The samples are stored so that only the difference between the current sample and the previous sample is sent to the decoder. So the resulting signal for transmission simply says “make it louder” or “make it softer”.

The mono signal is also fed through a Fast Fourier Transform function which will separate the waveform into individual frequencies. The strongest frequency is used to increase the sensitivity to the surrounding frequencies. This characteristic is added to the loudness weighting.

Each component frequency’s level is now compared with the masked signal level in the frequency band concerned. If the incoming frequency is higher in level than the masked signal, then the difference in level will be quantised for transmission.
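The masking comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the AAC+ algorithm itself: the function name, the use of dB values for each band and the 1.5dB quantisation step are all assumptions made for the example.

```python
def quantise_above_mask(band_levels_db, mask_levels_db, step_db=1.5):
    """Keep only the bands that are louder than their masking level.

    band_levels_db: level of each frequency band (from the FFT), in dB.
    mask_levels_db: masked signal level for the same bands, in dB.
    step_db: hypothetical quantisation step (an assumption, not from the article).
    Returns {band index: quantised difference in steps}.
    """
    if len(band_levels_db) != len(mask_levels_db):
        raise ValueError("need one mask level per band")

    kept = {}
    for index, (level, mask) in enumerate(zip(band_levels_db, mask_levels_db)):
        difference = level - mask
        if difference > 0:
            # Audible above the mask: quantise the difference for transmission.
            # max(1, ...) stops a small audible difference rounding to zero.
            kept[index] = max(1, round(difference / step_db))
        # Otherwise the band is assumed to be masked and is discarded.
    return kept


levels = [-20.0, -35.0, -50.0, -28.0]
masks = [-30.0, -30.0, -30.0, -31.0]
print(quantise_above_mask(levels, masks))  # {0: 7, 3: 2}
```

Bands 1 and 2 fall below their masking level, so they are dropped. Only bands 0 and 3 produce values to transmit.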
The bit rate calculation ranks the level differences in order, from the largest to the smallest non-zero value. If the bit rate produced is higher than what is available, the lowest values will not be transmitted.

The SBR (Spectral Band Replication) circuit measures the slope of the decreasing levels of harmonics. It transmits this slope value and the starting level.

The direction detection measures the level difference between the input channels and the mono signal for frequencies above 2.5kHz. It does the same for the phase differences of signals between 250Hz and 2.5kHz. Lastly, the time difference of signal transients in each channel is compared to the mono channel.

A multiplexer will select the output signals in the pattern required by the standard.

So in conclusion, the following information is sent to the decoder (via the encoder and transmission):
• Loudness (Dynamic Range Control or Scaling Factor signal)
• Pitch/timbre (Quantised filter output signals & SBR signal)
• Direction (Parametric Stereo/5.1 signal).

Next month we will look at digital radio transmission and the subsequent program decoding in the receiver.
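The bit rate ranking step described above can be illustrated with a short Python sketch. It assumes, purely for simplicity, that every transmitted value costs the same fixed number of bits; a real encoder's cost per value will vary. The function and parameter names are hypothetical.

```python
def fit_to_bit_budget(differences, bits_per_value, budget_bits):
    """Choose which level differences to transmit within a bit budget.

    differences: quantised level difference per band (0 means nothing to send).
    bits_per_value: assumed fixed cost of sending one value (a simplification).
    budget_bits: bits available for this block of audio.
    Returns the indices of the bands that are kept, in ascending order.
    """
    if bits_per_value <= 0:
        raise ValueError("bits_per_value must be positive")

    # Rank the non-zero differences from largest to smallest.
    ranked = sorted(
        ((index, value) for index, value in enumerate(differences) if value > 0),
        key=lambda item: item[1],
        reverse=True,
    )

    # If everything does not fit, the lowest-ranked values are not transmitted.
    max_values = max(0, budget_bits // bits_per_value)
    kept = ranked[:max_values]
    return sorted(index for index, _ in kept)


print(fit_to_bit_budget([7, 0, 2, 5, 1], bits_per_value=4, budget_bits=12))  # [0, 2, 3]
```

With room for only three values, the smallest non-zero difference (band 4) is the one left out.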