The relationship of the space/time domain to the frequency domain
Any arbitrary waveform within a fixed interval can be created by
adding some (positive or negative) multiple of a constant and some
multiple of a sine wave with a period equal to the interval and
some multiple of a sine wave with a period equal to half of the
interval and some multiple of a sine wave with a period equal to
one third of the interval and so on. Another way to say the same
thing is that any waveform can be synthesized by the summation of
sine waves of integer multiples of the frequency corresponding to
the interval period, starting with the multiple zero, with each
frequency component weighted by a coefficient. This is expressed
mathematically in figure 7.4.1 where Ci is a series of frequency
weighting coefficients that can be used to create the desired
waveform.
|
When a waveform is expressed as a series of frequency weighting
coefficients, that is known as an expression of the waveform in
the frequency domain. If the waveform is an audio signal
representing a change in air pressure over time then the signal
can be identically transformed from the time domain to the
frequency domain and back. This is most famously known as the
Fourier transform.
One interesting simple example is the creation of a square wave
from the summation of frequency multiples of a sine wave. In the
frequency domain, a square wave can be expressed as
[0, 1, 0, 1/3, 0, 1/5, 0, 1/7, 0, 1/9, ...]
Figure 7.4.2 shows these frequency multiples of sine waves
weighted by their corresponding weighting factors.
When those sine waves are all added together they approximate a square wave, as show in figure 7.4.3
The more frequency components added together with the
appropriate patten of frequency domain coefficients, the closer
the result looks to a true square wave. In fact, any abitrary
real waveform in time or space is generally represented by an
infinite series of frequency domain coefficients.
The relationship of discretely sampled, limited resolution space/time to a limited number of frequency domain coefficients
When a waveform is sampled at a fixed number of discrete
sampling intervals within a larger waveform interval, the
waveform is represented by a fixed set of samples. A limited
number of space/time domain samples can be transformed to an
equal number of frequency domain coefficients. This is most
famously known as the discrete Fourier transform (DFT).
When space/time domain sampels are represented digitally with a
fixed number of bits, the number of bits used for each
space/time domain sample corresponds to the number of bits
required for each frequency domain coefficient. A specific
algorithm for performing such transforms efficiently in digital
computers is known as the fast Fourier transform (FFT).
Notice that the number of bits and the number of samples in the
space/time domain corresponds exactly to the number of bits and
the number of coefficients in the frequency domain. The
transformation from one domain to the other is one-to-one and
reversible but does not provide any data reduction or
compression.
Two-dimensional space to two-dimensional frequencies
The transformation of spatial domain samples to frequency domain
coefficients can be extended to two-dimensions. This is most
famously accomplished with the two-dimensional discrete cosine
transform (DCT), which is reversed with the inverse DCT (IDCT).
The DCT converts a square matrix of spatial domain data to an
equally sized matrix of frequency domain coefficients. The
coefficients on the left side of the matrix correspond to low
frequencies in the horizontal direction and coefficients on the
right side ofthe matrix correspond to high frequencies in the
horizontal direction. Similarly, coefficients at the top of the
matrix correspond to low frequencies in the vertical direction
and coefficients at the bottom of the matrix correspond to high
frequencies in the vertical direction.
A spatial domain matrix with vertical stripes has high
frequencies in the horizontal direction and low frequencies in
the vertical direction. When transformed by a DCT operation,
such a spatial domain matrix will yield a coefficient matrix
with values of large magnitude near the top right and small
values elsewhere. A spatial domain matrix with horizontal
stripes has low frequencies in the horizontal direction and high
frequencies in the vertical direction. When transformed by a DCT
operation, such a spatial domain matrix will yield a coefficient
matrix with values of large magnitude near the bottom left and
small values elsewhere. A spatial domain matrix with a flat
field of nearly identical values at all positions has low
frequencies in the horizontal direction and low frequencies in
the vertical direction. When transformed by a DCT operation,
such a spatial domain matrix will yield a coefficient matrix
with values of large magnitude near the upper left and small
values elsewhere. A spatial domain matrix with a checkerboard
pattern of high and low values has high frequencies in the
horizontal direction and high frequencies in the vertical
direction. When transformed by a DCT operation, such a spatial
domain matrix will yield a coefficient matrix with values of
large magnitude near the upper left and small values elsewhere.
The matrix of residuals derived in the example shown in figure
7.3.1 roughly has vertical stripes of similar values. The
frequency domain coefficients that result from the DCT transform
of that matrix are shown in figure 7.4.3.1.
| -2 |
2 |
-4 |
2 |
|
-1.000 |
2.824 |
7.500 |
-10.737 |
|
|
|
|
|
|
|
|
|
| 0 |
1 |
-6 |
6 |
|
0.079 |
0.146 |
-2.263 |
-0.146 |
|
|
|
|
|
|
|
|
|
| -1 |
1 |
-5 |
3 |
|
-0.500 |
0.1913 |
-1 |
0.4619 |
|
|
|
|
|
|
|
|
|
| 0 |
0 |
-6 |
5 |
|
-1.115 |
0.854 |
-2.851 |
0.8536 |
|
|
|
|
|
|
|
|
|
| Spatial domain residuals |
|
Frequency domain coefficients |
|
Notice that the coefficients in the upper right have larger
magnitude than the coefficients elsewhere in the matrix, as
expected.
The prevalence of low frequencies in digital video residuals blocks
In the coding of most blocks of digital video data, prediction
generates a close match to the source data. As a result,
residuals blocks are most often most similar to a flat field
than to a matrix of stripes or checkers. As a result, the DCT
transform most often yields a matrix of frequency domain
coefficients with the largest values in the upper left and with
much smaller values elsewhere.
Quantization
So far, every step of the video coding process that we have
discussed can be performed to encode video and reversed to decode
a perfect copy of the original video data. We have not thrown away
any data or lost any quality. Quantization is the step where we
throw away some data to achieve compression.
The frequency domain coefficients to be encoded, such as those in
the example in figure 7.4.3.1, fall within a certain range of
values. In the example, the highest value is 7.5 and the lowest
value is -10.737. Using signed binary integers, the full range of
coefficients can be encoded with 5 bit values. This eliminates 3
unnecessary high order bits from the bitstream, reducing the
number of bits in this example by 37.5%.
Next, by further rounding each coefficient to, for example, the
nearest multiple of 4, we remove two more low order bits as shown
in figure 7.5.1.
| -1.00 |
2.824 |
7.500 |
-10.737 |
|
0 |
4 |
8 |
-12 |
|
000 |
001 |
010 |
101 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 0.079 |
0.146 |
-2.263 |
-0.146 |
|
0 |
0 |
-4 |
0 |
|
000 |
000 |
111 |
000 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| -0.500 |
0.1913 |
-1 |
0.4619 |
|
0 |
0 |
0 |
0 |
|
000 |
000 |
000 |
000 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| -1.115 |
0.854 |
-2.851 |
0.8536 |
|
0 |
0 |
-4 |
0 |
|
000 |
000 |
111 |
000 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| Frequency domain coefficients |
|
Rounded coefficients |
|
Quantized binary representation |
|
This yields a matrix of coefficients with 3 bits per coefficient,
which is 62.5% smaller than a matrix of 8-bit video data.
|
Copyright © 2005-2006 Jonah Probell. All rights reserved.
|