Improving the CD


La Scena Musicale - Vol. 7, No. 2


by Geoff Martin / October 1, 2001


DVD-Audio, 2nd part

The sound produced by an instrument is a periodic, or repeating, change in air pressure. A microphone picking up that sound produces a signal, usually a voltage, whose waveform is directly analogous to the pressure wave, as shown in Figure 1. There are advantages and disadvantages to representing sound this way, but one of the biggest problems is the noise introduced by the storage or transmission medium. To reduce this noise considerably, the analog representation of the signal can be replaced by a digital one. There are many ways to do this, but one of the most common is Pulse Code Modulation, or PCM, which is the system used in compact discs as well as DVD-Audio. How does one convert an analog signal into its PCM representation, and what are the consequences?

When you sit and watch a movie, it looks like you’re watching movement on the screen. In fact, what you’re seeing is a number of still pictures in quick succession that fool your eyes into thinking that you’re seeing smooth movement. Digital audio works in basically the same way. In order to convert an analog signal into a digital equivalent, we need to sample (or measure) the voltage level at a regular rate called the sampling rate (equivalent to the frame rate in film). As is shown in Figure 2, the measurements are initially used to convert the smooth wave into a staircase representation. As is the case in film, we use this procedure to convert a continuous-time signal into a discrete-time signal. As a result, we lose the information between adjacent samples. In fact, it’s been proven that, in storing the sound in this manner, we don’t retain energy that has a frequency greater than half of the sampling rate. That is to say, in order to record frequencies up to 20,000 Hz, commonly accepted to be the highest pitch humans can hear, you’d need to record at least 40,000 samples per second. This is a very large number compared to the 24 frames per second seen in a typical film. In order to keep any unwanted high-frequency energy out of the system, the signal is first passed through a device known as an anti-aliasing filter, which permits low frequencies to pass through but prevents those higher than one half of the sampling rate from reaching the sampling circuitry.
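
The half-the-sampling-rate limit can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names are made up for this example, and the 30,000 Hz test tone is an arbitrary choice. It shows that a tone above the limit, if it were not blocked by the anti-aliasing filter, would produce samples indistinguishable (apart from an inverted sign) from those of a lower tone.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, num_samples):
    """Measure a unit-amplitude sine wave at regular intervals."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

def nyquist_limit(sample_rate_hz):
    """Highest frequency that can be retained: half the sampling rate."""
    return sample_rate_hz / 2

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone would appear after sampling."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

if __name__ == "__main__":
    rate = 40_000  # samples per second, as in the text
    print("limit:", nyquist_limit(rate), "Hz")
    # A 30,000 Hz tone is above the limit and folds down to a lower frequency.
    print("30 kHz tone appears at:", alias_frequency(30_000, rate), "Hz")
```

This is why the filter has to sit in front of the sampling circuitry: once the samples are taken, the two tones can no longer be told apart.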

The next step in the procedure is to record the measurements as values that can be stored or transmitted. Unfortunately, it’s impossible to build a system able to make a perfect measurement of the voltage level -- just as it’s impossible to accurately measure a distance of 4.23 mm using a ruler that is marked only in millimeters. As a result, we have to quantize, or round off, the measurement to the nearest value that the system can recognize. Comparing Figures 3 and 4, we can see that the quantization results in an unavoidable error in the measured levels of the individual samples. The result of this quantization error is noise which, if not taken care of, can interfere with very quiet signals in the recording. The higher the number of quantization levels, the greater the accuracy of the measurements, and the lower both the possible error and the resulting system noise.
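
The ruler analogy can be written out as a small Python sketch. The function names and the step sizes are assumptions chosen for illustration; a real converter works on voltages, not millimeters.

```python
import math

def quantize(value, step):
    """Round a measurement to the nearest multiple of the step size."""
    return math.floor(value / step + 0.5) * step

def quantization_error(value, step):
    """Difference between the true value and its quantized version."""
    return value - quantize(value, step)

if __name__ == "__main__":
    # The 4.23 mm distance measured with a ruler marked in millimeters.
    print(quantize(4.23, 1.0), quantization_error(4.23, 1.0))
    # A finer ruler (more quantization levels) gives a smaller error.
    print(quantize(4.23, 0.1), quantization_error(4.23, 0.1))
```

The error can never be larger than half a step, so halving the step size halves the worst-case error.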

Once the quantization procedure is done, the individual samples are ready for storage. The attractive thing is that we can now simply list the consecutive levels of the samples as numbers. For example, looking at the quantized waveform in Figure 4, the levels of the samples are: 0, 2, 3, 4, 4, 4, 4, 3, 2, -1, -2, -4, -4, -4, -4, -4, -2, -1, 1, 2, 3.

This is essentially a digital representation of our original waveform.
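
As a rough sketch of what "listing the levels as numbers" can look like in practice, the Python below stores the 21 sample levels quoted above as 16-bit signed integers and reads them back. The little-endian byte layout and the function names are assumptions made for this example; they are not a description of how a CD or DVD-Audio disc physically stores its data.

```python
import struct

# The sample levels read from the quantized waveform in Figure 4.
LEVELS = [0, 2, 3, 4, 4, 4, 4, 3, 2, -1, -2,
          -4, -4, -4, -4, -4, -2, -1, 1, 2, 3]

def encode_pcm16(levels):
    """Pack integer sample levels as little-endian 16-bit signed values."""
    return struct.pack(f"<{len(levels)}h", *levels)

def decode_pcm16(data):
    """Unpack little-endian 16-bit signed values back into a list of levels."""
    count = len(data) // 2
    return list(struct.unpack(f"<{count}h", data))

if __name__ == "__main__":
    stored = encode_pcm16(LEVELS)
    print(len(stored), "bytes stored")
    print(decode_pcm16(stored) == LEVELS)
```

Because the stored values are plain numbers, they can be copied or transmitted without the gradual build-up of noise that affects an analog signal.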

Rebuilding the analog signal from the stored digital signal begins by using a device that receives the string of stored numbers, and converts them into voltages, as is shown by the dotted line in Figure 5. By passing this staircase signal through another low-pass filter, the signal is smoothed (hence the typical name smoothing filter, although it’s also known as a reconstruction filter) and reconstructed into a waveform that closely resembles the original analog waveform, as is shown in Figure 6.
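
The two reconstruction steps can be sketched in Python as follows. This is a simplified illustration: the moving average used here is only a crude stand-in for a real smoothing filter, and the function names, the hold length of 4 points and the window width of 5 are assumptions for the example.

```python
# The sample levels read from the quantized waveform in Figure 4.
LEVELS = [0, 2, 3, 4, 4, 4, 4, 3, 2, -1, -2,
          -4, -4, -4, -4, -4, -2, -1, 1, 2, 3]

def staircase(levels, hold):
    """Hold each stored level for `hold` output points (the staircase signal)."""
    return [level for level in levels for _ in range(hold)]

def moving_average(signal, width):
    """Crude low-pass filter: average each point with its neighbours."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def largest_jump(signal):
    """Biggest change between two adjacent points of a signal."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))

if __name__ == "__main__":
    steps = staircase(LEVELS, hold=4)
    smooth = moving_average(steps, width=5)
    print("largest jump, staircase:", largest_jump(steps))
    print("largest jump, smoothed: ", largest_jump(smooth))
```

The smoothed signal changes far more gradually from point to point than the staircase does, which is the job of the low-pass filter described above.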

However, as can also be seen in Figure 6, the reproduction is not perfect. There are two reasons for this: first, the system does not have infinite resolution in time; second, it does not have infinite resolution in level. In order to improve the system, we have to increase the resolution in each dimension. Compact discs have a sampling rate of 44.1 kHz (that is, 44,100 samples per second) and thus have a highest possible recorded frequency of 22,050 Hz. This value was set because most researchers agree that humans cannot hear frequencies above 20,000 Hz (this is a note a little more than 6 octaves above middle C). By comparison, DVD-Audio discs are capable of supporting sampling rates of 96 kHz (and as high as 192 kHz in two-channel stereo); at 96 kHz, recorded frequencies can reach 48,000 Hz -- more than one octave above the maximum frequency recordable on a CD.
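
The figures in this paragraph can be checked with a few lines of Python. The function names are made up for this sketch, and the value used for middle C (261.63 Hz) assumes standard tuning with the A above it at 440 Hz, which the article does not state.

```python
import math

MIDDLE_C_HZ = 261.63  # assumption: equal temperament, A4 = 440 Hz

def highest_frequency(sample_rate_hz):
    """Highest recordable frequency: half the sampling rate."""
    return sample_rate_hz / 2

def octaves_between(low_hz, high_hz):
    """Each octave is a doubling of frequency, hence the base-2 logarithm."""
    return math.log2(high_hz / low_hz)

if __name__ == "__main__":
    cd_max = highest_frequency(44_100)
    dvd_max = highest_frequency(96_000)
    print("CD:", cd_max, "Hz; DVD-Audio at 96 kHz:", dvd_max, "Hz")
    print("20,000 Hz above middle C:",
          round(octaves_between(MIDDLE_C_HZ, 20_000), 2), "octaves")
    print("DVD-Audio above CD:",
          round(octaves_between(cd_max, dvd_max), 2), "octaves")
```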

The DVD-Audio format also permits much higher resolution in the level measurement. CDs use what is known as a 16-bit system, meaning that the total number of possible quantized levels is 2^16, or 65,536 different levels (compare this with the total of 16 quantized levels in my example in Figure 4). DVD-Audio discs support a 24-bit system, using a total of 2^24, or 16,777,216 discrete levels.
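
The level counts follow directly from the number of bits, as the short Python sketch below shows. The dynamic-range figure it prints is the standard ratio between the full range and a single step, expressed in decibels; it is added here as a rule of thumb and does not come from the article. The function names are made up for the example.

```python
import math

def quantization_levels(bits):
    """Number of distinct levels an n-bit system can represent."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Ratio of the full range to one quantization step, in decibels."""
    return 20 * math.log10(quantization_levels(bits))

if __name__ == "__main__":
    for bits in (16, 24):
        print(bits, "bits:", quantization_levels(bits), "levels,",
              round(dynamic_range_db(bits), 1), "dB")
```

Each extra bit doubles the number of levels, which works out to roughly 6 dB of additional range per bit.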

In addition to these two improvements over the CD in basic audio quality, DVD-Audio discs give the listener the option of having up to 6 channels of audio instead of the CD’s two. The result is a format that is based on the same technology as the now-aging compact disc, with refinements that correct the limitations that were built into the original format.
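
To get a feel for what these three refinements cost in raw data, the sketch below multiplies sampling rate, bits per sample and channel count. It is simple arithmetic on uncompressed PCM under the figures quoted in this article; the function name is made up, and the calculation says nothing about how either disc format actually packs or compresses its data.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw, uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

if __name__ == "__main__":
    cd = pcm_bit_rate(44_100, 16, 2)
    dvd_audio = pcm_bit_rate(96_000, 24, 6)
    print("CD:", cd, "bits per second")
    print("6 channels at 96 kHz, 24 bits:", dvd_audio, "bits per second")
    print("ratio:", round(dvd_audio / cd, 1))
```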

See part I
