Prediction

Block prediction

Digital video codecs do not actually encode the video block data to be displayed directly. Instead, for each video block, the encoder writes into the bitstream an indication of the method that the decoder must use to guess the contents of that block. This is known as prediction. Two types of prediction are used in video codecs: intra-frame prediction (intra prediction) and inter-frame prediction (inter prediction). Most codecs require all blocks within a frame to be predicted with the same method. H.264 is an exception to that rule. H.264 defines a concept of slices, which can be arbitrary, even non-contiguous, groups of macroblocks, and the prediction type is chosen per slice rather than per frame.

Motion estimation and inter prediction

For each block in a video frame, there is usually a very similar block in a previously coded frame. If the portion of the video image contained by that block has no motion, then the block being predicted will be most similar to the block at the same location in the previously coded video frame. In that case, the prediction of the data for the block being predicted is simply copied from the block at the same location in the previously coded frame.

If there is motion in the video sequence at the location of the block being predicted, then a block of pixels in a previous frame at some horizontal and/or vertical pixel offset will be most similar to the block being predicted. Each of the horizontal and vertical offsets may be positive, negative, or zero. The combination of the horizontal and vertical offsets is known as the motion vector.

Each pixel represents the video frame image data for a single point. If an object in the video image moves a distance that is a non-integer number of pixels, then a more accurate prediction is made by interpolating the data between pixels in the reference frame. Sub-pixel interpolation at half-pixel locations, as shown in figure 7.2.1.1, is defined by most video codec standards.
The H.264 specification allows for coding video using quarter-pixel interpolation, as shown in figure 7.2.1.2.
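To make the idea of sub-pixel interpolation concrete, here is a minimal sketch of sampling a reference frame at half-pixel positions. It assumes simple bilinear averaging with rounding, which is a simplification chosen for brevity; each codec standard defines its own interpolation filter, and H.264, for example, uses a longer filter for half-pixel luma samples. The function name `sample_half_pel` and the list-of-rows frame layout are illustrative assumptions, not part of any standard.

```python
def sample_half_pel(frame, x2, y2):
    """Return the sample at half-pixel coordinates (x2 / 2, y2 / 2).

    frame is a list of rows of integer pixel values. x2 and y2 are
    non-negative coordinates in half-pixel units, so even values land
    exactly on a full pixel. Bilinear averaging is an assumption made
    for illustration; real codecs specify their own filters.
    """
    x0, y0 = x2 // 2, y2 // 2
    # An odd coordinate falls between two pixels; an even one sits on a pixel.
    x1 = x0 + (x2 % 2)
    y1 = y0 + (y2 % 2)
    total = frame[y0][x0] + frame[y0][x1] + frame[y1][x0] + frame[y1][x1]
    # Add 2 before dividing by 4 so the result is rounded to nearest.
    return (total + 2) // 4


frame = [[10, 20],
         [30, 40]]
print(sample_half_pel(frame, 0, 0))  # full pixel: 10
print(sample_half_pel(frame, 1, 0))  # halfway between 10 and 20: 15
print(sample_half_pel(frame, 1, 1))  # center of all four pixels: 25
```

A quarter-pixel scheme would follow the same pattern with coordinates in quarter-pixel units and a further interpolation step between the half-pixel samples.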
A video encoder searches block-sized rectangles of data in previously coded frames to find the closest match to the video data being encoded. This searching process is known as motion estimation. The resulting motion vector does not necessarily correspond to the actual direction of motion of a moving object in the video sequence. Due to repeating patterns in the image data, cyclical motion in the video sequence, or the overlap of one object over another within the view of the camera, the motion vector might point in any direction. In most video sequences, however, the motion vector usually corresponds directly to the direction in which an object has moved from the previously coded frame to the frame being predicted.

An encoder willing to spare no expense for the best possible compression will search all 1/2 or 1/4 pixel interpolated block positions within all stored reference frames for the motion vector that points to the very best prediction source block position. An encoder looking to create the best quality video image when compressing a movie in order to master a DVD might conduct such an extensive search. Video systems that must encode data in real time, such as live television broadcast systems, digital camcorders, and video conference systems, do not have the time to conduct an exhaustive search of the reference frame(s). Instead, such real-time video encoders typically search a small number of selected motion vectors around the block being encoded. After finding the closest match at full pixel positions, a search of 1/2 or 1/4 pixel locations might be conducted to attempt to find a more accurate prediction.

A frame or slice that is inter predicted is typically referred to as a P frame or P slice. Some video codecs allow bi-prediction. Bi-predicted frames or slices are referred to as B frames or B slices.
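The motion search described above can be sketched as a simple block-matching loop at full pixel positions. This is a minimal illustration under stated assumptions: the match is scored with the sum of absolute differences (SAD), which is a common choice but not one named in the text, and the names `sad`, `motion_search`, and `search_range` are hypothetical. A small `search_range` corresponds to the limited search of a real-time encoder, while a range covering the whole frame corresponds to an exhaustive search.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between a block in cur and one in ref."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total


def motion_search(cur, ref, bx, by, size, search_range):
    """Find the full-pixel motion vector with the lowest SAD.

    Checks every offset within +/- search_range of the block position
    (bx, by) that keeps the candidate block inside the reference frame.
    Returns ((dx, dy), cost). The zero vector is tried first, so it wins
    any tie.
    """
    height, width = len(ref), len(ref[0])
    best_mv = (0, 0)
    best_cost = sad(cur, ref, bx, by, bx, by, size)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that would read outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Reference frame with distinct values, and a current frame in which the
# content has moved one pixel right and two pixels down.
ref = [[x * 7 + y * 13 for x in range(8)] for y in range(8)]
cur = [[ref[y - 2][x - 1] if y >= 2 and x >= 1 else 0 for x in range(8)]
       for y in range(8)]
print(motion_search(cur, ref, 3, 3, 2, 2))  # ((-1, -2), 0)
```

The vector found points back to where the block's content came from in the reference frame. A sub-pixel refinement would repeat the comparison at interpolated positions around this full-pixel result.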
A bi-predicted block is predicted by taking a weighted average of two reference blocks, typically in two different reference frames. Bi-prediction is typically used by the encoder to predict based on data from one reference frame earlier and one reference frame later than the frame of the block being predicted. Note that frames are not necessarily coded in the bitstream in the same order in which they are displayed. Most video codecs allow a later P frame to be coded before a B frame so that the B frame can be predicted between two P frames.

Intra prediction

Inter prediction tends to give an accurate prediction of each block because consecutive frames in typical video sequences are highly similar and because inter prediction predicts each block by copying the most similar block from one or more reference frames. However, at a scene change in a video program, the first frame of a new scene will be poorly predicted from the last frame(s) of the previous scene. In such a condition, intra prediction is used. An intra predicted frame or slice is known as an I frame or I slice.

In intra prediction, the contents of a block are predicted from data in neighboring blocks, as shown in figure 7.2.2.1. Since, in most video codecs, blocks are coded in left to right, top to bottom order (known as raster scan order), intra prediction is performed using data in blocks above and to the left of the block being predicted. Specifically, the data used comes from the pixels bordering the block being predicted: the bottom row of the block above, the bottom row of the block above and to the right, the rightmost column of the block to the left, and the lower right pixel of the block above and to the left.
Every video codec defines modes of intra predicting blocks. The mode used to encode the prediction of a block is chosen based on the textures and gradients in the video source data. To predict a block amid a flat field of color, an intra prediction mode might be used that copies the lower right pixel of the block above and to the left into every pixel position in the predicted block, as shown in figure 7.2.2.2.
To predict a block amid a left to right gradient, an intra prediction mode might be used that copies the lower line of the block above to every line in the predicted block, as shown in figure 7.2.2.3.
To predict a block amid a top to bottom gradient, an intra prediction mode might be used that copies the rightmost column of the block to the left to every column in the predicted block, as shown in figure 7.2.2.4.
To predict a block amid a diagonal gradient, an intra prediction mode might be used that copies the lower line of the block above and the block above and to the right to the diagonally down corresponding pixel positions in the predicted block, as shown in figure 7.2.2.5.
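The four example modes above can be sketched as small functions that build a predicted block from neighboring pixels. This is an illustrative simplification: the function names, the argument layout (`corner`, `top`, `left`), and the use of the sum of absolute differences to choose a mode are assumptions made here, and real codecs define their own mode sets, often with filtering of the neighboring samples.

```python
def predict_flat(corner, size):
    """Fill the block with the lower right pixel of the above-left block."""
    return [[corner] * size for _ in range(size)]


def predict_vertical(top, size):
    """Copy the bottom row of the block above into every row."""
    return [list(top[:size]) for _ in range(size)]


def predict_horizontal(left, size):
    """Copy the rightmost column of the left block into every column."""
    return [[left[y]] * size for y in range(size)]


def predict_diagonal(top, size):
    """Project the row above diagonally down and to the left.

    top holds 2 * size samples: the bottom row of the block above
    followed by the bottom row of the block above and to the right.
    """
    return [[top[x + y + 1] for x in range(size)] for y in range(size)]


def choose_mode(block, corner, top, left):
    """Pick the mode whose prediction is closest to the source block."""
    size = len(block)
    candidates = {
        "flat": predict_flat(corner, size),
        "vertical": predict_vertical(top, size),
        "horizontal": predict_horizontal(left, size),
        "diagonal": predict_diagonal(top, size),
    }

    def cost(pred):
        return sum(abs(block[y][x] - pred[y][x])
                   for y in range(size) for x in range(size))

    return min(candidates, key=lambda name: cost(candidates[name]))


# A block in a left to right gradient is best matched by the vertical mode.
block = [[10, 20],
         [10, 20]]
print(choose_mode(block, corner=0, top=[10, 20, 30, 40], left=[0, 0]))
```

In this sketch the encoder would signal only the chosen mode name in the bitstream, and the decoder would rebuild the same prediction from neighbors it has already decoded.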
These are just a few intra prediction modes. Different variations of intra prediction modes exist in different video codecs.

Aside from their value at scene changes, periodic intra predicted frames within video sequences are valuable for other important reasons. Occasionally an error might occur in a video program due to dust on a DVD or interference in a broadcast transmission. The error can cause one or more corrupted blocks. Inter predicted frames can copy and propagate the corrupted block from one frame of video to the next. Since an intra predicted frame does not depend on previously coded frames, it causes a complete, independent, fresh redrawing of the video image, which corrects any errors encountered.

Periodic intra predicted frames in a video sequence also enable methods known as trick play, such as fast play (fast forward) and reverse play (rewind). In a fast playing mode, the video decoder shows only some of the coded frames, at regular intervals in the frame sequence. The decoder does not have time to draw all inter predicted frames in order to determine the correct frame at the required interval. If intra predicted frames are included in the sequence, then the decoder can skip P and B frames up to the next I frame that it needs in order to draw the next frame at the required interval. Similarly, playing a video sequence in reverse requires the decoder to move backwards in the video sequence and draw frames in reverse order. This can only be performed if the decoder can use periodic I frames to determine how to draw groups of dependent P and B frames for the reversed video sequence.
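The role of I frames in fast play can be illustrated with a short sketch. It assumes a simplified model that the text does not spell out: frames are listed in decoding order with no reordering, and every P or B frame depends only on frames since the most recent I frame. The names `decode_start` and `frames_to_decode` are hypothetical.

```python
def decode_start(frame_types, target):
    """Index of the closest I frame at or before the target frame."""
    for index in range(target, -1, -1):
        if frame_types[index] == "I":
            return index
    raise ValueError("no I frame at or before the target frame")


def frames_to_decode(frame_types, target):
    """Frames the decoder must draw in order to show the target frame."""
    start = decode_start(frame_types, target)
    return list(range(start, target + 1))


# One I frame every four frames; show every sixth frame in fast play.
types = list("IPPPIPPPIPPP")
for target in range(0, len(types), 6):
    print(target, frames_to_decode(types, target))
```

With an I frame every four frames, showing frame 6 requires decoding only frames 4 through 6. Without periodic I frames, the decoder would have to decode every frame from the start of the sequence.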