Chapter 11 : SAMPLING


In digital audio, the purpose of binary numbers is to express the values of samples which represent analog sound velocity or pressure waveforms.

We have found that there are two basic characteristics of sound, amplitude (level), and frequency (time).

There are also two characteristics of digital audio, sampling (time) and quantization (level).

All digital audio technology is based on sampling technology: sampling keyboards, delay units, reverbs, drum machines, DAT recorders, DJ mixers, digital audio workstations (DAW), modular digital multitracks (MDM), and more. While sampling technology has been available for a long time, only in the last decade have the speed of computers and the availability of cost-effective hardware made sampling with digital recorders the standard in the industry. For instance, in the late 1980s, a Fairlight sampling keyboard was considered the state-of-the-art sampler. At a cost of $75,000, a Fairlight was only a dream for most musicians. Today, there are 16 and 24 bit samplers (the Fairlight was only 8 bit) available for less than $2000.

SAMPLING

When a digital recorder takes a sample, it basically takes a snapshot of the audio waveform and turns it into bits which can be stored and manipulated. Analog recording is based on voltage recorded as patterns of magnetization in the oxide particles of tape. Digital sampling instead turns voltage into numbers (0s and 1s), which can easily be shuffled around and put back together.

SAMPLE BITS

As we learned in the previous chapter, the more bits used to describe something, the better the clarity and fidelity. An 8 bit sample contains 256 steps of information while a 16 bit sample contains up to 65,536 steps. Obviously a 16 bit sample will have much greater definition. (In a 16 bit system, there are 65,536 different numbers, each number representing a different analog signal voltage).
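
As a rough illustration of the two ideas so far (taking snapshots in time, then rounding each snapshot to one of a fixed number of steps), here is a minimal Python sketch. The function name, the 1 kHz test tone, and the way the level is mapped onto the integer steps are all assumptions made for the example, not how any particular sampler works.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bits, num_samples):
    """Sample a unit-amplitude sine wave and round each sample to an integer step."""
    max_code = 2 ** bits - 1      # highest step number, e.g. 255 for 8 bits
    codes = []
    for n in range(num_samples):
        t = n / sample_rate_hz                        # time of the n-th snapshot
        value = math.sin(2 * math.pi * freq_hz * t)   # analog level between -1 and 1
        # Map the range -1..1 onto the steps 0..max_code and round to the nearest step.
        codes.append(round((value + 1) / 2 * max_code))
    return codes

# The same 1 kHz tone, sampled at 44.1 kHz, described with 8 bits and with 16 bits.
codes_8 = sample_and_quantize(1000, 44100, 8, 8)
codes_16 = sample_and_quantize(1000, 44100, 16, 8)
print(codes_8)
print(codes_16)
```

The 8 bit list can only use the numbers 0 to 255, while the 16 bit list uses 0 to 65,535, so each 16 bit number lands much closer to the true level of the waveform.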

The bit resolution of a system defines the dynamic range of the system. About 6 dB is gained for every bit. For example,

8 bits equals 256 states = 48 dB, 16 bits equals 65,536 states = 96 dB. To find the approximate dynamic range of a system, multiply the bit depth (the number of bits per sample) by 6.
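
The 6 dB per bit rule can be checked with a few lines of Python. This is only a sketch: it assumes the usual definition of dynamic range as 20 times the base-10 logarithm of the number of states, and the function name is made up for the example.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range: 20 * log10 of the number of states."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 12, 16, 24):
    print(bits, "bits:", 2 ** bits, "states,", round(dynamic_range_db(bits), 1), "dB")
```

Each bit is worth slightly more than 6 dB (about 6.02 dB), which is why multiplying by 6 gives a close but rounded answer.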

Low bit samples tend to sound grainy and can seem to lack high end clarity. With fewer steps available, each sample is rounded more coarsely, so the fine detail of the waveform is described less accurately. Some musicians still use 8 and 12 bit samplers because of the grainy sound, especially with drum samples.

SAMPLE RATE

The number of 'snapshots' taken of the audio stream in a single second is known as the sample rate. Just as in measuring frequency, hertz is used to define the number of samples taken per second.

Definition of sample rate: the number of samples (measurements) taken of an analog signal in 1 second.

The sample rate determines the frequency range (bandwidth) of a system. The faster the sample rate, the better the accuracy of getting a true picture of higher frequencies.

Some common sample rates are:
22,050 aka 22.05 kHz - 22,050 samples per second. A sample every 1/22,050 of a sec.
24,000 aka 24 kHz - 24,000 samples per second. A sample every 1/24,000 of a sec.
30,000 aka 30 kHz - 30,000 samples per second. A sample every 1/30,000 of a sec.
44,100 aka 44.1 kHz - 44,100 samples per second. A sample every 1/44,100 of a sec.
48,000 aka 48 kHz - 48,000 samples per second. A sample every 1/48,000 of a sec.

It is amazing that at a sample rate of 44.1 kHz, every 1/44,100 of a second, a sound is captured - held - assigned a binary number - and released!


The higher the sample rate, the better the quality of the sample. A sample taken at 44.1 kHz will contain twice the information of a sample taken at 22.05 kHz. As with low bit sampling, lower sample rates also lack high end frequency definition. Remember that high frequency waveforms are fast. To get an accurate picture of these waveforms, the sampler needs to take at least twice as many 'pictures' per second as the frequency of the waveform.



High sample rates are better at capturing high frequency waveforms, but if you are sampling lower frequency sounds, such as kick drum, bass, etc., you might consider sampling at the lower rate to save hard drive space.


NYQUIST THEORY

Named after Harry Nyquist, a Bell engineer who worked on the speed of telegraphs in the 1920s, the Nyquist theory states that a waveform must be sampled at least twice per cycle in order to get a true representation. Going back to what we learned about waveforms, a waveform has a positive peak and a negative peak.


Each half of the waveform must be recorded. There must be two samples per period. In other words, the sampling frequency must be at least twice the highest signal frequency recorded in order to be effective.

Sample rates and the Nyquist frequency each one yields:
22,050 Hz = 11,025 Hz (Nyquist)
24,000 Hz = 12,000 Hz
30,000 Hz = 15,000 Hz
44,100 Hz = 22,050 Hz
48,000 Hz = 24,000 Hz

It is therefore important to take into consideration the highest frequency of the audio material to be recorded. If the note A at 14,080 Hz is to be recorded, a sample rate of 44.1 kHz would be the logical choice. 14,080 Hz falls well within the Nyquist frequency of 44.1 kHz, which is 22.05 kHz.
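
The same check can be written as a short Python sketch. The function names are invented for the example, and the test only applies the bare Nyquist rule (signal frequency no higher than half the sample rate).

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a sample rate can represent: half the sample rate."""
    return sample_rate_hz / 2

def can_capture(signal_hz, sample_rate_hz):
    """True if the signal frequency lies at or below the Nyquist frequency."""
    return signal_hz <= nyquist_hz(sample_rate_hz)

RATES_HZ = (22050, 24000, 30000, 44100, 48000)
for rate in RATES_HZ:
    print(rate, "Hz -> Nyquist", nyquist_hz(rate), "Hz,",
          "captures 14,080 Hz:", can_capture(14080, rate))
```

Note that this bare check leaves no safety margin, so a rate that only just passes (30 kHz here, with a Nyquist of 15 kHz) has very little room to spare above the signal.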

The choice of sample rate determines the audio bandwidth of the recorder used. Considering that the human hearing range at best ranges from 20 Hz to 20 kHz, a 44.1 kHz sample rate theoretically should satisfy most audio needs.

SAMPLE RATE VS STORAGE

Obviously the faster the sample rate, the better the sample. Another consideration, though, is the amount of storage each sample rate demands. The standard rule of thumb is that a 44.1 kHz sample rate with a 16 bit recorder will use approximately 5 megs per minute for a single channel (this assumes a computer based system is being used; tape based systems are limited by the length of the tape). If audio is recorded in stereo at 44.1 kHz, the figure doubles to approximately 10 megs a minute, because two samples are taken simultaneously. A pop tune of about four minutes would take up about 40 megs.

At a sample rate of 22,050 Hz, the formula would be half that of 44.1 kHz...around 2.5 megs a minute or 5 megs a minute in stereo. So the size of the storage medium should also be taken into account.
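
These storage figures follow from simple multiplication, sketched below in Python. The function name is made up for the example, and it assumes uncompressed audio with one meg counted as 1,000,000 bytes.

```python
def megabytes_per_minute(sample_rate_hz, bits, channels):
    """Storage used by one minute of uncompressed audio, in megabytes."""
    bytes_per_second = sample_rate_hz * (bits // 8) * channels
    return bytes_per_second * 60 / 1_000_000   # assumes 1 meg = 1,000,000 bytes

print(megabytes_per_minute(44100, 16, 1))      # mono at 44.1 kHz
print(megabytes_per_minute(44100, 16, 2))      # stereo at 44.1 kHz
print(megabytes_per_minute(22050, 16, 1))      # mono at 22.05 kHz
print(4 * megabytes_per_minute(44100, 16, 2))  # a four minute stereo tune
```

The exact results come out slightly above the rounded figures of 5, 10, and 2.5 megs a minute, so it is wise to leave a little extra room on the drive.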

(If you think that audio takes up a large amount of storage space, consider video. Uncompressed video with a sound track takes up 28 MB per second. A CD-ROM can hold only about 20 seconds of full screen, full motion video, depending on screen resolution and audio sample rate.)

COMPACT DISC STANDARD

Due to the demands on the speed of the digital circuitry and the capacity of the storage medium, manufacturers have selected a 44.1 kHz sample rate at 16 bits as the standard for compact discs.

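Also, although it is possible to convert something recorded at 48 kHz to 44.1 kHz, the digital information can be degraded in the conversion, since every sample must be recalculated and any material above the new Nyquist frequency of 22.05 kHz is removed. If 48 kHz material is simply played back at 44.1 kHz without conversion, it plays slower and the pitch drops noticeably (about a semitone and a half).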
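
What happens when samples recorded at one rate are clocked out at another, with no conversion, can be estimated with the Python sketch below. The function name is invented for the example; it assumes the samples are simply played one after another at the new rate.

```python
import math

def playback_shift(recorded_rate_hz, playback_rate_hz):
    """Speed ratio and pitch change (in semitones) when playing at the wrong rate."""
    ratio = playback_rate_hz / recorded_rate_hz   # below 1 means slower and lower
    semitones = 12 * math.log2(ratio)             # 12 semitones per doubling of frequency
    return ratio, semitones

ratio, semitones = playback_shift(48000, 44100)
print("speed ratio:", ratio)
print("pitch change in semitones:", round(semitones, 2))
```

A ratio below 1 means fewer samples are played per second than were recorded, so the material runs slower and sounds lower in pitch.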


Therefore, if CD production is the ultimate goal of your project, and you plan to keep all of the material in the digital realm, use 44.1 kHz from the beginning if possible to alleviate any degradation or conversion hassles as the material is recorded to compact disc.


ALIASING

If a 25 kHz waveform is sampled at 44.1 kHz (which has a Nyquist value of 22.05 kHz), the Nyquist rule is broken. 44.1 kHz - 25 kHz results in a 19.1 kHz waveform, which is heard as distortion. This is known as aliasing, or 'foldover'.
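
The foldover arithmetic can be sketched in Python as follows. The function name is an assumption for the example. The second half of the sketch checks that, once sampled at 44.1 kHz, a 25 kHz cosine and a 19.1 kHz cosine produce the same sample values, which is why the recorder cannot tell them apart.

```python
import math

def alias_frequency(signal_hz, sample_rate_hz):
    """Frequency at which a sampled tone appears after folding around Nyquist."""
    nyquist = sample_rate_hz / 2
    folded = signal_hz % sample_rate_hz       # sampling cannot tell f from f + k * rate
    if folded > nyquist:
        folded = sample_rate_hz - folded      # reflect back below the Nyquist frequency
    return folded

print(alias_frequency(25000, 44100))   # the tone that is actually recorded
print(alias_frequency(10000, 44100))   # already below Nyquist, so unchanged

# Compare the samples of a 25 kHz cosine with those of its alias.
rate = 44100
alias = alias_frequency(25000, rate)
max_difference = max(
    abs(math.cos(2 * math.pi * 25000 * n / rate) - math.cos(2 * math.pi * alias * n / rate))
    for n in range(100)
)
print(max_difference)
```

The difference between the two sets of samples is essentially zero, apart from tiny rounding error in the arithmetic.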


Once aliasing is introduced into the digital stream, it cannot be removed afterward. It must be stopped before the signal enters the digital stream.

LOW PASS FILTER (anti aliasing filter, smoothing filter)

The low pass filter is used to eliminate aliasing. The low pass filter allows lows to pass. Any frequency above the Nyquist level is blocked. If we take the 25 kHz frequency sampled at 44.1 kHz, a low pass filter placed before the sampler would block it, along with any other frequencies above the Nyquist level of 22.05 kHz.


Remember that waveforms, especially complex waveforms, contain harmonic frequencies that extend beyond the Nyquist level. The low pass filter blocks these high harmonic frequencies as well.

A low pass filter cannot limit the signal precisely at the Nyquist frequency, so a guard band of several kHz is used which starts a little below the Nyquist level.

You can capture a 20 kHz signal simply by sampling at 40 kHz to satisfy Nyquist, plus 10% more for the guard band, plus 100 Hz to lock to video: 40 kHz + 4 kHz (10%) + 0.1 kHz = 44.1 kHz.

"Now we have to build these anti-aliasing filters [low pass filter] to cleanly pass 20 kHz, but be out (-90 dB) by 22 kHz".

" Truth is you can't dump that much level in that little frequency band without huge phase problems in the analog or digital domain. Therefore phase shift and high frequency ringing are common. 48K is smoother than 44k because of the extra headroom (10%). The problem with 48 k is it uses more media and is another standard"

Stephen St. Croix, "Deep Down Digital - Lessons from the anesthesia front," Mix, vol. 10, no. 10, October, 1996

In other words, due to the limitations of the various low pass filters, whether they be Butterworth, Chebyshev, or Bessel filters (which we will study in detail later), and due to the slope and speed of the guard band's attenuation, a buffer zone of 10% should also be taken into account when choosing sample rates. If you plan on capturing frequencies above 20 kHz, the extra 2 kHz is needed by the low pass filter. Therefore 48 kHz sampling may be preferable.

We will go into detail about each component of a sample system in later chapters.


---> Please read George Massenburg's keynote speech at the AES show last fall. <---

It is an excellent view of digital audio, the industry, and future. 


Class notes:

Which sample rate would you choose to record a bass tone, kick drum, flute, etc.?

Discuss how 8 bit samples could still be better than 24 bit samples.

Discuss various units that we use that use sample technology.


MRT 374