PCM

From LavryEngineering

Overview

The term "PCM" stands for "Pulse Code Modulation" and refers to one system for encoding digital audio after AD conversion. The term "Linear PCM" is used to describe a form of PCM encoding where the "divisions" are of equal size in both the amplitude domain and the time domain. In the amplitude domain this means the input voltage is divided into equal voltage divisions and in the time domain this means the input voltage is sampled at a constant sample frequency.

Basics

There are different types of encoding that can be used for digital audio, depending on the application and quality requirements. For example, professional audio requires the highest quality attainable, whereas telephone communication can employ much lower quality audio and still be acceptable for speech. Another consideration is the ease with which the digital audio signal can be processed, which is important in pro audio applications for operations such as level adjustment, equalization (or “tone” adjustment), and mixing. PCM is a form of lossless encoding.
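
As a rough illustration of why PCM is easy to process, the sketch below treats level adjustment as a multiplication and mixing as an addition on each sample. It assumes 16-bit signed samples held in plain Python lists; the function names `apply_gain` and `mix` and the sample values are hypothetical, chosen only for this example.

```python
# Assumed sample format for this sketch: 16-bit signed integers.
MIN_CODE, MAX_CODE = -32768, 32767

def clip(value):
    """Keep a sample inside the 16-bit range."""
    return max(MIN_CODE, min(MAX_CODE, value))

def apply_gain(samples, gain):
    """Level adjustment: scale every sample by the same factor."""
    return [clip(round(s * gain)) for s in samples]

def mix(a, b):
    """Mixing: add two signals sample by sample."""
    return [clip(x + y) for x, y in zip(a, b)]

track_a = [0, 1000, 2000, -3000]       # made-up sample values
track_b = [500, -500, 30000, -30000]   # made-up sample values

quieter = apply_gain(track_a, 0.5)     # [0, 500, 1000, -1500]
mixed = mix(track_a, track_b)          # [500, 500, 32000, -32768]
```

The last mixed sample would be -33000 without clipping, which is outside the 16-bit range. Real audio software typically works at a higher internal precision to avoid this kind of overload.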

These considerations were part of the reason why PCM encoding was adopted by Sony and Philips as the encoding format of the original Compact Disc (CD) standard. Computer file formats such as WAVE and AIFF also use the PCM format for similar reasons, and most computer audio software is designed to use one of these formats as its “working” file format.

The fundamental idea is similar to a graph of the audio waveform, with evenly spaced divisions on the horizontal “X” axis that represent time and evenly spaced divisions on the vertical “Y” axis that represent amplitude. The amplitude typically corresponds to the voltage of the input analog waveform, because a voltage waveform is how the analog audio signal is represented and transmitted between most audio devices. Each time division represents one “clock cycle” at the sample frequency, which corresponds either to the time at which the analog signal is sampled by the AD converter during recording or to the time at which the sampled voltage is reproduced by the DA converter during playback.
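
The graph analogy can be sketched in a few lines of code. The example below is only an illustration, not a model of a real AD converter: it assumes a 1 kHz sine wave as the “analog” input, a 48 kHz sample frequency, and 16-bit signed output codes. The function name `sample_and_quantize` and all parameter values are assumptions made for this example.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, num_samples, bit_depth=16):
    """Sample a full-scale sine at a constant rate and round to integer codes."""
    max_code = 2 ** (bit_depth - 1) - 1        # 32767 for 16 bits
    codes = []
    for n in range(num_samples):
        t = n / sample_rate_hz                 # evenly spaced time divisions
        amplitude = math.sin(2 * math.pi * freq_hz * t)  # range -1.0 to 1.0
        codes.append(round(amplitude * max_code))        # evenly spaced amplitude divisions
    return codes

# One full cycle of a 1 kHz tone at 48 kHz takes 48 samples.
pcm = sample_and_quantize(1000, 48000, 48)
```

Each entry in `pcm` is one point on the graph: its index is the position on the time axis and its value is the nearest division on the amplitude axis.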


As with film or video recording and playback, the accuracy of the divisions in both the horizontal and vertical scales (time domain and amplitude domain) is directly related to the accuracy of the reproduction (reconstruction) of the original analog audio signal. Because of the differences between visual and auditory perception, as well as the large difference in the frequency at which the information is sampled, the perceived results of variations in the time domain are quite different for digital audio than for film or video. In the case of film or video, the picture may still appear to be the same color and in focus, but the motion will be uneven or will start and stop. In digital audio the effects can be much more subtle and can be perceived as changes to the “color” of the sound or as a lack of clarity (analogous to visual “focus”).

This form of encoding is referred to as "linear" because each division of the time and amplitude domain is equal in size. The advantage of this approach is that it can greatly simplify the encoding, decoding, and processing of the audio signal with very high accuracy. The primary disadvantage is that it requires the same amount of storage space or bandwidth to represent no signal as to represent a "full-scale" (full volume) signal. The same amount of information in the time domain (number of samples) is also required to encode a slowly changing signal (bass frequencies) as to accurately encode the highest frequency (typically 20 kHz for music recording).
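
The fixed storage cost can be checked with simple arithmetic. The sketch below assumes the CD parameters of 44.1 kHz, 16 bits, and two channels as defaults; the helper name `pcm_size_bytes` is hypothetical. It also packs four samples of silence and four full-scale samples to show that both occupy the same number of bytes.

```python
import struct

def pcm_size_bytes(seconds, sample_rate_hz=44100, bit_depth=16, channels=2):
    """Size of uncompressed linear PCM; the signal content does not enter the formula."""
    return int(seconds * sample_rate_hz) * (bit_depth // 8) * channels

one_second = pcm_size_bytes(1)    # 176400 bytes
one_minute = pcm_size_bytes(60)   # 10584000 bytes, roughly 10.6 MB

# Four 16-bit little-endian samples of silence and of a full-scale signal.
silence = struct.pack("<4h", 0, 0, 0, 0)
full_scale = struct.pack("<4h", 32767, -32768, 32767, -32768)

same_size = len(silence) == len(full_scale)   # True: 8 bytes each
```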

Non-linear encoding offers advantages in this regard, as the amount of data can vary depending on factors such as the amplitude and frequency content of the input signal. One example of this type of encoding is the perceptual coding used to make mp3 files. This type of encoding is also referred to as “lossy data compression,” as opposed to the “lossless data compression” employed in systems like “zipped files” or “compressed folders.” Lossless data compression retains all of the original information, which means the original file can be perfectly reconstructed from the compressed file. Lossy compression by definition discards some of the original information that is deemed “unimportant,” generating an approximation of the original that is “close enough” to be perceived as “the same” or, in the case of telephone communications, close enough that the speech is recognizable by the listener. The primary advantage is a reduction in the file size or bandwidth required to store or transmit the information.

Reduced file size is the primary reason why mp3 became popular as a file format for downloading music files from the internet for playback on computers and portable devices. With the rapid increase in the capacity of low-cost memory and in available bandwidth, more audio enthusiasts are adopting lossless file formats for their personal music applications. Most lossless file formats are used to compress PCM-encoded digital audio, so unlike with lossy compression, the data that results from decoding a lossless file is identical to the data in the original PCM file.
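
The lossless round trip can be demonstrated with a general-purpose compressor. The sketch below uses Python's `zlib` module as a stand-in for the “zipped file” style of compression mentioned above; it is not a dedicated audio codec, and the 440 Hz test tone, the 0.1 second duration, and the helper name `to_pcm_bytes` are assumptions made for this example.

```python
import math
import struct
import zlib

def to_pcm_bytes(samples):
    """Pack integer samples as 16-bit little-endian linear PCM."""
    return struct.pack(f"<{len(samples)}h", *samples)

num_samples = 4410  # 0.1 second at 44.1 kHz, mono
silence = to_pcm_bytes([0] * num_samples)
tone = to_pcm_bytes([
    round(32767 * math.sin(2 * math.pi * 440 * n / 44100))
    for n in range(num_samples)
])

packed_silence = zlib.compress(silence)
packed_tone = zlib.compress(tone)

# Decoding returns exactly the original PCM bytes.
restored_tone = zlib.decompress(packed_tone)
restored_silence = zlib.decompress(packed_silence)
```

Both PCM buffers are the same size before compression, but the compressed silence is far smaller than the compressed tone. That is the sense in which the amount of data in a compressed format can follow the signal content while the decoded result still matches the original.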