Basics of Speech Coding
Most recent speech coding algorithms can be categorized as either spectrum coding or hybrid coding. Spectrum coding models the input speech signal with a vocal tract model consisting of a signal source and a filter, as shown in Fig. 4. A set of parameters obtained by analyzing the input signal is transmitted to the receiver. Hybrid coding also synthesizes an approximation of the speech signal with a vocal tract model, but the parameters used for this first synthesis are then modified to minimize the error between the original and the synthesized speech signals. Repeating this analysis-by-synthesis procedure yields the best parameter set, which is quantized and transmitted to the receiver as the compressed data. In the decoder, the parameters for the Source and the LP (linear prediction) synthesis filter are recovered by inverse quantization and used to operate the same vocal tract model as in the encoder.
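The analysis-by-synthesis loop described above can be illustrated with a minimal sketch. It is not taken from any particular codec: the function names (`lp_synthesize`, `analysis_by_synthesis`), the tiny excitation codebook, and the gain table are all hypothetical, the search is exhaustive, and quantization of the LP coefficients is left out. The all-pole filter is assumed to have the form s[n] = e[n] + sum_k a[k] s[n-1-k].

```python
def lp_synthesize(excitation, lp_coeffs):
    """All-pole LP synthesis filter: s[n] = e[n] + sum_k a[k] * s[n-1-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lp_coeffs):
            if n - 1 - k >= 0:
                acc += a * out[n - 1 - k]
        out.append(acc)
    return out


def squared_error(x, y):
    """Sum of squared differences between two equal-length signals."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def analysis_by_synthesis(target, lp_coeffs, codebook, gains):
    """Try every (codebook entry, gain) pair, synthesize, keep the lowest error.

    Returns (error, codebook index, gain). Only the index and gain would
    need to be transmitted; the decoder reruns lp_synthesize with them.
    """
    best = None
    for index, code in enumerate(codebook):
        for gain in gains:
            excitation = [gain * c for c in code]
            synthesized = lp_synthesize(excitation, lp_coeffs)
            err = squared_error(target, synthesized)
            if best is None or err < best[0]:
                best = (err, index, gain)
    return best


# Illustrative (made-up) values: a first-order filter and three excitations.
lp_coeffs = [0.9]
codebook = [[1, 0, 0, 0], [0, 1, 0, 0], [1, -1, 0, 0]]
gains = [0.5, 1.0, 2.0]

# Build a target frame that the model can reproduce exactly.
target = lp_synthesize([2.0 * c for c in codebook[1]], lp_coeffs)
err, index, gain = analysis_by_synthesis(target, lp_coeffs, codebook, gains)
print(index, gain, err)
```

Because the target frame was generated by the same model, the search recovers the entry and gain that produced it. With real speech the error never reaches zero, and the search returns the closest match under the chosen error measure.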
Figure 5: Hybrid Speech Coding
Fig. 5 depicts a block diagram of hybrid speech coding. The Source and LP Synth Filter blocks in Fig. 5 correspond to those in Fig. 4. During the parameter search, the error between the input signal and the synthesized signal is weighted by a perceptual weighting (PW) filter. The frequency response of this filter takes the human auditory system into account, so the search selects the parameter set that is perceptually best rather than the one that merely minimizes the raw waveform error.
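The text does not specify the form of the PW filter. As a sketch only, the code below assumes the form commonly used in CELP-style coders, W(z) = A(z/γ1) / A(z/γ2) with A(z) = 1 - sum_k a[k] z^-(k+1), which is derived from the LP coefficients. The function names and the values γ1 = 0.9 and γ2 = 0.6 are illustrative assumptions, not values taken from this document.

```python
def bandwidth_expand(lp_coeffs, gamma):
    """Coefficients of A(z/gamma): scale the k-th coefficient by gamma**(k+1)."""
    return [a * gamma ** (k + 1) for k, a in enumerate(lp_coeffs)]


def perceptual_weight(signal, lp_coeffs, gamma_num=0.9, gamma_den=0.6):
    """Filter a signal through the assumed W(z) = A(z/gamma_num) / A(z/gamma_den)."""
    num = bandwidth_expand(lp_coeffs, gamma_num)
    den = bandwidth_expand(lp_coeffs, gamma_den)
    out = []
    for n, x in enumerate(signal):
        acc = x
        for k in range(len(lp_coeffs)):
            if n - 1 - k >= 0:
                acc -= num[k] * signal[n - 1 - k]  # numerator (FIR) part
                acc += den[k] * out[n - 1 - k]     # denominator (IIR) part
        out.append(acc)
    return out


def weighted_error(original, synthesized, lp_coeffs):
    """Energy of the perceptually weighted difference between two signals."""
    diff = [o - s for o, s in zip(original, synthesized)]
    weighted = perceptual_weight(diff, lp_coeffs)
    return sum(w * w for w in weighted)


# Impulse response of the weighting filter for a made-up first-order model.
impulse_response = perceptual_weight([1.0, 0.0, 0.0], [0.9])
print(impulse_response)
```

In a parameter search, `weighted_error` would replace a plain squared error as the quantity to minimize. The design intent of this filter form is to de-emphasize error near the spectral peaks of the speech, where it tends to be masked, and to penalize it more in the spectral valleys.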