Profiles and Levels

Jul 1, 2004 12:00 PM, By Steve Mullen

A deeper look into MPEG-2 encoding reveals a versatile compression scheme.



Charts
Table 1: MPEG-2 Profiles
Table 2: MPEG-2 Levels
Table 3: MPEG-2 Profiles @ Levels

At NAB 2004, Sony unveiled a prototype of its entry into HDV, a camcorder that will record at 1080i. The HDV format employs Main Profile @ High 1440 MPEG-2 compression.

In a simpler time (about a year ago), when video professionals talked about inter-frame compression, we were most likely discussing the MPEG-2 compression used for DVDs. With the introduction of HDV, we learned there were two “types” of MPEG-2: program stream and transport stream. Program stream is the type of MPEG-2 used for DVDs, while the HDV format uses a transport stream.

I amplified this distinction in my “Compression Refresher” article in the April issue of Video Systems by pointing out that program and transport streams are not actually types of MPEG-2, but ways of packing multiple video elementary streams and multiple audio elementary streams. In the case of video elementary streams, I still spoke about MPEG-2 as though there were only one type.

As NAB 2004 approached, it became clear Sony had decided to market its forthcoming 1080i HDV products as using “HDV2.” Then at NAB we saw JVC commit — at least for the immediate future — to what it called the industry-standard HDV format. (I'll have more about HDV, HDV1, and HDV2 in an upcoming column.) Sony talked about MPEG-2 being written to XDCAM at up to 80Mbps. Moreover, while Panasonic said it would not support HDV, the company announced that it would support HD based on MPEG-2. What's going on here?

Some of this complexity can be cleared up very easily. According to Sony and JVC, HDV will be licensed only for tape-based products. So even if Sony and Panasonic use the same MPEG-2 encoder as is used to produce HDV, once the data stream is recorded to anything other than DV tape, the format cannot be called HDV. This explains why Panasonic, focused as it is on recording to solid-state SD media, had no reason to join the HDV consortium.

MPEG-2 basics

Before looking at the “types” of MPEG-2, here's a brief review of how MPEG-2 encoding is done. The first thing to understand is that there are no rules about how MPEG-2 is encoded. Just like DV, the criterion for the data-reduction process is whether video is successfully retrieved by a decoder/decompressor. This allows engineers to develop multiple encoding strategies. Naturally, these design decisions involve trade-offs.

Despite the flexible encoding process, there are specific stages involved in encoding. First, video is filtered to remove image noise. (An image may be a video field or an interlaced video frame.) Next, inter-frame compression is performed across multiple images.

The initial image is divided into 16×16 pixel macroblocks. Starting with the first macroblock, a search is made to determine its location in an adjacent image. The first comparison is made at the identical X and Y coordinates. If the macroblock has not moved, its “motion vector” is set to zero. Until there is a match, the comparison macroblock is moved in a methodical manner in all directions at increasing distances from its origin. The displacement (direction and distance moved) before a match is made determines the macroblock's motion vector.

This process is repeated for each macroblock until all motion vectors have been computed between the initial image and an adjacent image. Performing the iterative search for standard-definition video requires a huge amount of computation. This load is doubled for 720p30 and quadrupled for 720p60 and 1080i.
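The search described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular encoder works: the function names, the sum-of-absolute-differences match test, and the small search range are all assumptions made for the example, and the demo uses 4×4 blocks instead of 16×16 to stay readable.

```python
def sad(anchor, current, ax, ay, cx, cy, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(anchor[ay + dy][ax + dx] - current[cy + dy][cx + dx])
    return total


def find_motion_vector(anchor, current, bx, by, size=16, search=7):
    """Find where the anchor block at (bx, by) best matches in `current`.

    Candidate positions are tried in order of increasing distance from the
    origin, so a block that has not moved yields (0, 0) on the first try.
    """
    height, width = len(current), len(current[0])
    offsets = [(dx, dy)
               for dy in range(-search, search + 1)
               for dx in range(-search, search + 1)]
    offsets.sort(key=lambda o: (max(abs(o[0]), abs(o[1])), o))
    best_vector, best_cost = (0, 0), None
    for dx, dy in offsets:
        cx, cy = bx + dx, by + dy
        if cx < 0 or cy < 0 or cx + size > width or cy + size > height:
            continue  # candidate block would fall outside the picture
        cost = sad(anchor, current, bx, by, cx, cy, size)
        if best_cost is None or cost < best_cost:
            best_vector, best_cost = (dx, dy), cost
        if cost == 0:
            break  # exact match found, stop searching
    return best_vector


def make_frame(width, height, x0, y0, size=4):
    """Black frame with one textured size x size block at (x0, y0)."""
    frame = [[0] * width for _ in range(height)]
    for r in range(size):
        for c in range(size):
            frame[y0 + r][x0 + c] = 10 + size * r + c
    return frame


anchor_frame = make_frame(12, 12, 4, 4)
moved_frame = make_frame(12, 12, 6, 5)  # block moved 2 right, 1 down

still_vector = find_motion_vector(anchor_frame, anchor_frame, 4, 4, size=4, search=3)
moved_vector = find_motion_vector(anchor_frame, moved_frame, 4, 4, size=4, search=3)
print(still_vector, moved_vector)
```

Every candidate position costs one full block comparison, which is why the computational load grows so quickly with picture size and frame rate.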

The initial image acts as an Anchor frame, while the adjacent image becomes the current frame. In addition, the set of vectors becomes a Motion Vector Block for the current frame.

Next, a predicted frame is constructed from the Anchor frame and the vectors contained in the Motion Vector Block. This predicted frame is then subtracted from the current frame and a difference frame is output. If there was no motion, as in a freeze frame, the predicted frame would match the current frame and the difference frame would be zero and thus very easy to compress. Even when two frames are not identical, a difference frame typically has only a small amount of information.
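A toy version of the predict-and-subtract step might look like the following. It is a sketch under stated assumptions: the vectors are stored per block of the current frame and point back into the Anchor frame (a choice made here so the predicted frame has no holes), the block size is 2×2, and all names are hypothetical.

```python
def predict_frame(anchor, vectors, size):
    """Build a predicted frame by fetching each block from the anchor frame."""
    height, width = len(anchor), len(anchor[0])
    predicted = [[0] * width for _ in range(height)]
    for (bx, by), (dx, dy) in vectors.items():
        for y in range(size):
            for x in range(size):
                predicted[by + y][bx + x] = anchor[by + dy + y][bx + dx + x]
    return predicted


def difference_frame(current, predicted):
    """Per-pixel difference: current minus predicted."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(current, predicted)]


def reconstruct(predicted, difference):
    """What a decoder does: add the difference back onto the prediction."""
    return [[p + d for p, d in zip(prow, drow)]
            for prow, drow in zip(predicted, difference)]


anchor = [[1, 2, 0, 0],
          [3, 4, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
# The 2x2 object has moved two pixels right and one pixel changed (4 -> 5).
current = [[0, 0, 1, 2],
           [0, 0, 3, 5],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
vectors = {(0, 0): (2, 0), (2, 0): (-2, 0), (0, 2): (0, 0), (2, 2): (0, 0)}

predicted = predict_frame(anchor, vectors, size=2)
difference = difference_frame(current, predicted)
nonzero = sum(1 for row in difference for value in row if value != 0)

# Freeze frame: zero vectors everywhere, so the difference frame is all zeros.
zero_vectors = {position: (0, 0) for position in vectors}
freeze_difference = difference_frame(anchor, predict_frame(anchor, zero_vectors, size=2))
print(nonzero, freeze_difference)
```

Only one value in the difference frame is non-zero even though the object moved, which is the whole point of sending vectors plus a difference instead of a complete picture.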

When a difference frame follows an Anchor frame, the difference frame becomes a P-frame and the Anchor frame becomes an I-frame. A P-frame contains the information needed to recreate a video frame in conjunction with the information in the previous I-frame. (Specifically, the Motion Vector Block that accompanies a P-frame is applied to a previous I-frame to generate a predicted frame — to which the P-frame itself is added, thereby yielding a full-resolution video frame.)

If the encoder has enough memory, two Anchor frames, two Motion Vector Blocks, and two predicted frames can be utilized. Information obtained by subtracting two predicted frames from the current frame is used to generate a B-frame. A B-frame contains the very small amount of information needed to reconstitute a video frame when combined with information from a past I- or P-frame and/or a future I- or P-frame.
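One way to picture why two predictions help is to average them before subtracting. The sketch below assumes that averaging rule (with rounding) and uses a simple fade as test material; the frame values and function names are invented for illustration.

```python
def average_prediction(past_predicted, future_predicted):
    """Combine two predicted frames by averaging them, rounding halves up."""
    return [[(p + f + 1) // 2 for p, f in zip(prow, frow)]
            for prow, frow in zip(past_predicted, future_predicted)]


def difference_frame(current, predicted):
    """Per-pixel difference: current minus predicted."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(current, predicted)]


def energy(frame):
    """Sum of absolute values, a rough measure of how much is left to code."""
    return sum(abs(value) for row in frame for value in row)


# A fade: the picture brightens steadily from the past frame to the future one.
past = [[100] * 4 for _ in range(4)]
current = [[110] * 4 for _ in range(4)]
future = [[120] * 4 for _ in range(4)]

one_sided = difference_frame(current, past)
two_sided = difference_frame(current, average_prediction(past, future))
print(energy(one_sided), energy(two_sided))
```

Predicting from the past frame alone leaves a difference of 10 at every pixel, while the two-sided prediction leaves nothing at all in this idealized case.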

The I-, P-, and B-frames are each intra-frame compressed, beginning with a Discrete Cosine Transform (DCT) applied to each frame. (A DCT is a lossless, reversible, mathematical process that converts spatial-amplitude data into spatial-frequency data.)

This calculation is made on 8×8 blocks of luminance and chrominance samples and yields a matrix of 64 DCT coefficients. (Contrary to common belief, the DCT process does not reduce the amount of data.) However, by reorganizing the image in terms of “picture detail,” the encoding process mimics human perception, in which fine detail — especially fine color detail — is less well perceived.
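The point that the DCT neither discards nor reduces data can be checked directly. The following is a plain, unoptimized sketch of an orthonormal 8×8 DCT and its inverse; real encoders use fast integer approximations, and the sample blocks here are made up.

```python
import math

N = 8


def _scale(k):
    """Normalization factor that makes the transform orthonormal."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(n, k):
    """Cosine basis function of frequency k sampled at position n."""
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_2d(block):
    """Forward 8x8 DCT: 64 samples in, 64 coefficients out."""
    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += block[y][x] * _basis(x, u) * _basis(y, v)
            out[v][u] = _scale(u) * _scale(v) * total
    return out


def idct_2d(coeffs):
    """Inverse 8x8 DCT: recovers the original samples."""
    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            total = 0.0
            for v in range(N):
                for u in range(N):
                    total += (_scale(u) * _scale(v) * coeffs[v][u]
                              * _basis(x, u) * _basis(y, v))
            out[y][x] = total
    return out


flat = [[128] * N for _ in range(N)]                    # no detail at all
ramp = [[16 * x + 2 * y for x in range(N)] for y in range(N)]  # gentle gradient

flat_coeffs = dct_2d(flat)
ramp_restored = idct_2d(dct_2d(ramp))
max_error = max(abs(ramp_restored[y][x] - ramp[y][x])
                for y in range(N) for x in range(N))
print(round(flat_coeffs[0][0], 3), max_error < 1e-6)
```

A flat block puts all of its energy into the single top-left (DC) coefficient, yet the output still holds 64 numbers; the saving only appears once those numbers are quantized.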

Compression itself is accomplished using intra-frame compression upon each image. Intra-frame compression uses a combination of lossless and lossy processing. MPEG compression occurs through the “quantization” of the DCT coefficients. Quantizing is the process of reducing the number of data bits that represent each coefficient. The amount of compression to be applied is determined at this stage.
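Quantization can be illustrated with a single step size. This is a deliberate simplification: MPEG-2 encoders weight each frequency differently, whereas the sketch below assumes one uniform step, and the coefficient values are hypothetical.

```python
def quantize(coefficients, step):
    """Divide each coefficient by the step size and round to an integer level."""
    return [int(round(c / step)) for c in coefficients]


def dequantize(levels, step):
    """Decoder side: scale the levels back up. The rounding loss is permanent."""
    return [level * step for level in levels]


# Hypothetical coefficients from one block, ordered from coarse to fine detail.
coefficients = [1024.0, -37.5, 12.2, -6.1, 3.0, -1.4, 0.6, 0.2]

fine = quantize(coefficients, 4)     # mild compression
coarse = quantize(coefficients, 16)  # heavier compression

print(fine, fine.count(0))
print(coarse, coarse.count(0))
print(dequantize(coarse, 16))
```

A larger step turns more of the fine-detail coefficients into zeroes, which is what the lossless stages that follow exploit. It also means the decoder gets back only an approximation of the original values.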

Using Variable Length Coding (VLC) and Run Length Coding (RLC), lossless data reduction is applied after quantization. Each P- or B-frame's Motion Vector Block(s) is also compressed by these processes. Variable-length encoding identifies the most frequent patterns in the quantized coefficients. These patterns are then represented by codes that are defined by only a few bits. Conversely, less frequent patterns become represented by codes that require more bits.
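The idea behind variable-length coding can be shown by building a Huffman-style code table from symbol counts. Note the assumption: MPEG-2 relies on predefined code tables rather than deriving one from each picture, so this sketch illustrates the principle only, with made-up input values.

```python
import heapq
from collections import Counter


def build_vlc_table(symbols):
    """Assign short bit strings to frequent symbols and longer ones to rare symbols."""
    counts = Counter(symbols)
    # Heap entries: (count, tie_breaker, {symbol: code_so_far})
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {s: "0" for s in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        n1, _, table1 = heapq.heappop(heap)
        n2, _, table2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in table1.items()}
        merged.update({s: "1" + code for s, code in table2.items()})
        heapq.heappush(heap, (n1 + n2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical quantized values: mostly zeroes, a few small numbers, one outlier.
values = [0] * 12 + [1] * 5 + [-1] * 2 + [7]

table = build_vlc_table(values)
total_bits = sum(len(table[v]) for v in values)
print(table)
print(total_bits, "bits instead of", 8 * len(values))
```

The most common value gets a one-bit code and the rarest values get three bits, so 20 values fit in 31 bits instead of the 160 bits needed at one byte each.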

Run-length coding is applied to the quantized coefficients before the variable-length codes are assigned. RLC uses a unique code to represent a repeating pattern, such as a series of zeroes. For example, a “run” of 20 zero values can be represented by a 1-byte code followed by the repeated value and the repeat count (i.e., 20). Thus, 20 bytes are reduced to only 3 bytes.
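The run-length example can be reproduced with a byte-oriented sketch. The three-byte layout (marker, value, count) and the marker value are assumptions chosen to match the arithmetic in the text; MPEG-2 itself does not store runs as whole bytes.

```python
MARKER = 0xFF  # hypothetical 1-byte code that announces a run


def run_length_encode(data, min_run=4):
    """Replace runs of a repeated byte with MARKER, value, count."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        # A literal MARKER byte is always written as a run so decoding stays unambiguous.
        if run >= min_run or data[i] == MARKER:
            out += bytes([MARKER, data[i], run])
        else:
            out += data[i:i + run]
        i += run
    return bytes(out)


def run_length_decode(encoded):
    """Expand every MARKER, value, count triple back into the original run."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] == MARKER:
            out += bytes([encoded[i + 1]]) * encoded[i + 2]
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return bytes(out)


zeros = bytes(20)                               # a run of 20 zero values
mixed = bytes([5, 3]) + bytes(20) + bytes([1])  # a run surrounded by other values

packed_zeros = run_length_encode(zeros)
packed_mixed = run_length_encode(mixed)
print(len(zeros), "->", len(packed_zeros))
print(len(mixed), "->", len(packed_mixed))
```

Because the step is lossless, decoding the packed bytes returns exactly the original sequence.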

The quality factor

So far I've talked about MPEG-2 as though it were a single technology like JPEG compression. In fact, there are many types of MPEG-2 encoding, each defining a different quality. If we look at DV compression itself — not the methods of recording — there are three variants: DV25, DV50, and DV100. We know that DV25 has three characteristics. The chroma sampling, with NTSC, is 4:1:1. The compression factor is 5:1. And the data rate is 25Mbps (3.5MBps). With DV50, a higher quality level is established. Color sampling is improved to 4:2:2 — equal to uncompressed 601 video. Compression is decreased to only 3.3:1. Both of these enhancements increase the required data rate to 50Mbps (7MBps).
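The DV figures can be sanity-checked with a little arithmetic. The sketch assumes 8-bit samples, a 720×480 picture, and 29.97 frames per second, and it counts video data only; under those assumptions the quoted compression factors land close to the nominal 25Mbps and 50Mbps rates.

```python
def uncompressed_mbps(width, height, fps, bits_per_pixel):
    """Video data rate before compression, in megabits per second."""
    return width * height * fps * bits_per_pixel / 1_000_000


# Average bits per pixel with 8-bit samples:
#   4:1:1 -> 8 luma + 2 + 2 chroma = 12
#   4:2:2 -> 8 luma + 4 + 4 chroma = 16
source_411 = uncompressed_mbps(720, 480, 29.97, 12)
source_422 = uncompressed_mbps(720, 480, 29.97, 16)

dv25_rate = source_411 / 5.0   # 5:1 compression
dv50_rate = source_422 / 3.3   # 3.3:1 compression

print(round(source_411, 1), round(dv25_rate, 1))
print(round(source_422, 1), round(dv50_rate, 1))
```

The extra chroma samples raise the source rate by a third, and the gentler compression does the rest of the doubling.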

MPEG-2 also has multiple quality levels — levels that have a complex relationship with each other. The relationship is a hierarchical one. Consider three types of MPEG-2: A, B, and C. The least capable decoder that can accept type A need not accept types B and C. A “B decoder” must accept both A and B. A “C decoder” must accept A, B, and C.
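The A, B, C rule amounts to an ordered list: a decoder handles its own type and everything below it. A minimal sketch, with hypothetical names:

```python
TYPES = ["A", "B", "C"]  # ordered from least to most demanding


def can_decode(decoder_type, stream_type):
    """A decoder accepts its own type and every less demanding type."""
    return TYPES.index(stream_type) <= TYPES.index(decoder_type)


for decoder in TYPES:
    accepted = [stream for stream in TYPES if can_decode(decoder, stream)]
    print(decoder, "decoder accepts", accepted)
```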

MPEG-2 quality levels are not defined as simply as designating them A, B, and C. Rather than a simple linear designation scheme, a two-dimensional definition is used. The first dimension is the choice of an encoding profile. A profile defines how complex the encoding is. Technically, a profile is a “defined subset of the syntax of the specification.” In other words, a profile imposes bounds on how a bitstream is generated. Table 1 defines the series of MPEG-2 profiles.



Table 1: MPEG-2 Profiles
Abbreviation   Profile
SP             Simple Profile
MP             Main Profile
422P           4:2:2 Profile
SNR            SNR Scalable Profile
Spatial        Spatially Scalable Profile
HP             High Profile



The profiles we are interested in are those that are used for making a DVD and those used for HD. These are shown in green in Table 3. Main Profile is used for DVD, HDV, and ATSC video. It uses 4:2:0 chroma sampling.

For HD production, the 4:2:2 Profile supports 4:2:2 chroma sampling. (It is often called Studio Profile and is non-hierarchical.) These profiles all support IBP GOPs, whereas Simple Profile is restricted to IP GOPs.

For each profile there is a series of levels. A level defines a “set of constraints on the values which may be taken by the parameters of the specification within a particular profile.” Table 2 defines the series of MPEG-2 levels.

Table 2: MPEG-2 Levels
Abbreviation   Level
LL             Low Level
ML             Main Level
H-14           High 1440
HL             High Level

As with the profiles, the levels that interest us are those that will be used for making a DVD and those used for HD. DVDs are created using Main Level, which restricts resolution to 720×480 with a data rate of up to 15Mbps. The HDV format uses High 1440, which supports two resolutions: 1280×720 and 1440×1080. The former resolution supports progressive video at up to 60fps; the latter supports interlaced video at up to 30fps. ATSC HD makes use of High Level at 19Mbps, even though the level allows a data rate of up to 80Mbps. Table 3 shows three profiles and up to four associated levels.

Looking at Table 3, we can see the relations among the MPEG-2 encoding schemes in use today. When we encode a DVD, we are using Main Profile @ Main Level (MP@ML) MPEG-2. When shooting with Sony's IMX, we are using 4:2:2 Profile @ Main Level (422P@ML). Main Profile @ High Level (MP@HL) is employed for ATSC broadcasts. HDV employs Main Profile @ High 1440 (MP@H-14) MPEG-2.
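The Profile @ Level shorthand is easy to unpack mechanically. The sketch below includes only the profiles, levels, and uses named in this article; the dictionary layout and function name are assumptions for illustration.

```python
PROFILES = {"MP": "Main Profile", "422P": "4:2:2 Profile"}
LEVELS = {"ML": "Main Level", "H-14": "High 1440", "HL": "High Level"}

# Uses mentioned in the article, keyed by (profile, level).
USES = {
    ("MP", "ML"): "DVD",
    ("422P", "ML"): "Sony IMX",
    ("MP", "HL"): "ATSC broadcast",
    ("MP", "H-14"): "HDV",
}


def describe(designation):
    """Expand a designation such as 'MP@H-14' into words."""
    profile, level = designation.split("@")
    name = f"{PROFILES[profile]} @ {LEVELS[level]}"
    use = USES.get((profile, level))
    return f"{name}: {use}" if use else name


for code in ("MP@ML", "422P@ML", "MP@HL", "MP@H-14", "422P@HL"):
    print(code, "=", describe(code))
```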

When Sony and Panasonic talk about advanced MPEG-2 HD formats, they have several options. One option is to use Main Profile @ High 1440 (HDV), but increase the data rate up to 60Mbps. This would greatly reduce artifacts when working with 720p60 and 1080i. Remember, when 720p60/1080i camcorders arrive, they will feed twice the data into the encoder as 720p30 does.

Alternatively, 4:2:2 Profile @ High 1440 (422P@H-14) could be employed. The advantage for production is 4:2:2 chroma sampling. Moreover, a data rate of up to 80Mbps is supported. Interestingly, Sony claims the maximum recording capability of XDCAM is 80Mbps.

Panasonic's P2 can record up to 160Mbps, so in theory it could employ 4:2:2 Profile @ High Level (422P@HL) MPEG-2. The advantage of working with 422P@HL is that it supports 1080i with a full 1920×1080 resolution, which is greater horizontal resolution than either HDCAM or DVCPRO HD. The former is limited to 1,440 pixels of horizontal resolution, while the latter is limited to 1,280 pixels.

No matter what the future holds, multiple profiles and levels are available for very high-quality MPEG-2 encoding. As these options are used, you can use Table 3 to better understand the nature of the MPEG-2 being employed.


Contributing editor Steve Mullen is owner of Digital Video Consulting, which provides consulting and conducts seminars on digital video technology. Mullen can be reached at d-v-c@mindspring.com.




© 2008 Penton Media, Inc.
