INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC29/WG11
CODING OF MOVING PICTURES AND AUDIO
ISO/IEC JTC1/SC29/WG11 N7290
Poznań, Poland, July 2005
1 Introduction
ISO/IEC 11172-2 specifies a video codec that was originally designed for video storage on CD. The intended picture resolution is CIF or SIF. Storage and replay of stored data impose a number of requirements, which mainly relate to random access:
- The video sequence must be replayable forward and backward;
- Fast forward/reverse modes have to be supported;
- Editing (e.g. extraction or replacement of frames) must be possible.
2 Technical Solution
The basic principle of MPEG-1 Video is hybrid coding, a combination of block-wise motion-compensated prediction and scalar-quantized DCT-based coding of the residual. The same transform is applied when intraframe mode is selected for a whole picture or a macroblock. A number of tools and mechanisms were defined, aiming in particular at good reconstruction quality for video sequences of a general nature and with more complex content, and at fulfilling the random access requirements:
- The basic access entity within the sequence is the Group of Pictures (GOP). Three picture (frame) types are defined: intraframe-coded pictures (type I) and the motion-compensated predictive types P (unidirectional) and B (bidirectional). Within a GOP, at least one picture must be an I picture, positioned such that the remaining pictures of the GOP can be decoded unambiguously.
- Flexible prediction mode switching (forward, backward etc.) is enabled per macroblock of size 16x16, with the selectable modes depending on the picture type.
- Motion compensation has half-pixel accuracy.
- Weighting of DCT coefficients can be applied in quantization, which allows frequency-specific perceptual tuning of the quantization fidelity. A default quantization matrix is provided, but alternative quantization tables (e.g. adapted to the properties of a sequence) can be defined; these are then transmitted as side information in the header.
- DCT blocks are scanned in zigzag order; coefficients quantized to zero are detected during the scan and their positions are converted into run-length information. The remaining non-zero coefficients are compressed by a variable-length code that expresses both the run length and the quantizer level value.
- Picture size, color representation, pixel aspect ratio, picture rate and similar parameters are conveyed in the header, which makes MPEG-1 quite flexible for use with different types of video sources.
- A decoder buffer and timing model is defined (called the Video Buffering Verifier, VBV), which allows encoders to be designed that support normative decoder timing behavior. Necessary parameters such as the expected buffer size are included in the video bitstream.
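The coefficient coding steps listed above (weighted quantization, zigzag scan, run-length conversion) can be illustrated with a minimal Python sketch. It is not the normative arithmetic of ISO/IEC 11172-2: the weighting matrix, the step-size formula, the truncating quantizer and the sample coefficients are all assumptions chosen for readability, and the variable-length coding of the resulting pairs is omitted.

```python
# Illustrative sketch only: the weights, the quantizer step formula and the
# sample coefficients are assumptions, not values taken from ISO/IEC 11172-2.

N = 8  # DCT block size


def zigzag_order(n=N):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    def key(pos):
        r, c = pos
        d = r + c  # index of the anti-diagonal this position lies on
        # Odd diagonals are walked downwards (row increasing), even ones upwards.
        return (d, r if d % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def quantize(coeffs, weights, qscale):
    """Divide each coefficient by a frequency-dependent step, truncating toward zero."""
    return [[int(coeffs[r][c] / (weights[r][c] * qscale / 16)) for c in range(N)]
            for r in range(N)]


def run_level_pairs(levels):
    """Convert a scanned list of levels into (zero_run, nonzero_level) pairs."""
    pairs, run = [], 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs  # trailing zeros would be covered by an end-of-block symbol


# Hypothetical weighting matrix: coarser steps for higher frequencies.
weights = [[16 + 2 * (r + c) for c in range(N)] for r in range(N)]

# Hypothetical DCT coefficients of one block (all others are zero).
coeffs = [[0] * N for _ in range(N)]
coeffs[0][0], coeffs[0][1], coeffs[1][0], coeffs[1][1], coeffs[7][7] = 80, -27, 4, 21, 5

levels = quantize(coeffs, weights, qscale=8)
scanned = [levels[r][c] for r, c in zigzag_order()]
pairs = run_level_pairs(scanned)
print(pairs)  # [(0, 10), (0, -3), (2, 2)]
```

In this example the small coefficients at positions (1,0) and (7,7) are quantized to zero, so the 64 scanned values collapse into three (run, level) pairs; a variable-length code would then be applied to these pairs.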
The bitstream defined in the MPEG-1 video part is structured into a number of syntactic hierarchy layers: the video sequence layer, group of pictures layer, picture layer, slice layer, macroblock layer and block layer. The syntax of each layer contains the related characteristic information (e.g. length and structure of a GOP, specific prediction or intra modes invoked etc.). For the slice layer and above, start codes are defined which allow re-synchronization.
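As a rough illustration of how start codes support re-synchronization, the following Python sketch scans a byte buffer for the byte-aligned prefix 0x000001 and reports which layer each start code introduces. The code values in the table are those commonly documented for MPEG-1 video and should be checked against the standard text; the function names and the sample buffer are hypothetical.

```python
# Minimal sketch: locate start codes in a byte buffer so that a decoder could
# resume parsing after an error. Code values are assumptions to be verified
# against ISO/IEC 11172-2; the sample data below is made up.

START_CODE_PREFIX = b"\x00\x00\x01"

KNOWN_CODES = {
    0xB3: "sequence_header",
    0xB7: "sequence_end",
    0xB8: "group_of_pictures",
}


def layer_name(code):
    """Map the byte following the prefix to a layer name."""
    if code == 0x00:
        return "picture"
    if 0x01 <= code <= 0xAF:
        return "slice"
    return KNOWN_CODES.get(code, "other")


def find_start_codes(data):
    """Yield (byte_offset, code_value) for each start code found in data."""
    pos = data.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(data):
        yield pos, data[pos + 3]
        pos = data.find(START_CODE_PREFIX, pos + 3)


# Hypothetical buffer: two stray bytes, then four start codes with dummy payload.
data = (b"\xff\xff"
        b"\x00\x00\x01\xb3" b"\x12\x34"
        b"\x00\x00\x01\xb8"
        b"\x00\x00\x01\x00" b"\xaa"
        b"\x00\x00\x01\x01" b"\x55")

found = [(offset, layer_name(code)) for offset, code in find_start_codes(data)]
print(found)
# [(2, 'sequence_header'), (8, 'group_of_pictures'), (12, 'picture'), (17, 'slice')]
```

A decoder that loses synchronization inside a slice could skip forward to the next offset reported by such a scan and resume decoding there, at the cost of the macroblocks in between.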
Subsequent to the standard text, which was published in 1993, the following corrigenda are an integral part of the MPEG-1 Video specification:
- ISO/IEC 11172-2:1993/Cor.1:1996
- ISO/IEC 11172-2:1993/Cor.2:1999
- ISO/IEC 11172-2:1993/Cor.3:2003
- ISO/IEC 11172-2:1993/Cor.4:200X (in preparation)
3 Application areas
MPEG-1 Video is a format that is now also widely used for video storage and replay on PCs, video file transfer over the Internet, etc.