Compression

First, we choose Super-CIF for BankCON because it offers better video-conferencing quality than the other standard formats.

Super-CIF

ITU-T's H.261 recommendation defines a near-studio-quality format called Super-Common Intermediate Format (Super-CIF), with a frame size of 704 samples per line and 576 lines per frame for luminance, and a subsampling ratio of 4:1:1 for the color-difference signals.
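To give a feel for why compression is needed at this frame size, the following is a minimal sketch of the uncompressed data rate. It assumes 8 bits per sample and that each color-difference plane carries one quarter as many samples as the luminance plane; the frame rate of 25 frames per second is an assumption for illustration only and is not taken from the text.

```python
def raw_frame_samples(width, height, chroma_fraction=0.25):
    """Samples per frame: one luminance plane plus two chrominance planes.

    chroma_fraction is the size of each chrominance plane relative to the
    luminance plane (assumed to be one quarter here).
    """
    luma = width * height
    chroma = int(luma * chroma_fraction)
    return luma + 2 * chroma


def raw_bit_rate(width, height, fps, bits_per_sample=8):
    """Uncompressed bit rate in bits per second."""
    return raw_frame_samples(width, height) * bits_per_sample * fps


# fps=25 is an assumed frame rate, not a value from the text.
super_cif_bps = raw_bit_rate(704, 576, fps=25)
print(f"{super_cif_bps / 1e6:.1f} Mbit/s uncompressed")
```

Even under these assumptions the raw rate is several orders of magnitude above a 64 kbps channel, which is the gap the codecs described below have to close.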

For the compression techniques, we will follow the H.323 standard because it is commonly accepted around the world.

Speech compression

The different ITU recommendations for digitizing and compressing speech signals reflect different tradeoffs between speech quality, bit rate, computing power, and signal delay. Among the various standards, we choose the G.711 voice standard for speech compression. G.711 generally transmits voice at 56 or 64 kbps.

PCM (Pulse Code Modulation) is a way of converting an analog signal into a digital signal. The conversion of analog voice lines into digital ones is standardized in G.711.

The first step is to sample the analog signal. After that, various options are available for creating a digital signal.

Sampling an analog signal yields PAM (Pulse-Amplitude Modulation). The problem with a short, high peak on a copper line is that it is very sensitive to distortion. One solution was PDM (Pulse-Duration Modulation), in which the height of a pulse is converted into a pulse width.

This version is still susceptible to noise. It is better to use identical pulses that encode the information; this is called PCM (Pulse Code Modulation), which is standardized in CCITT G.711.

When n bits are available to describe each sample, 2ⁿ distinct amplitude levels can be represented. The 'resolution' of the bit description we use is based mainly on the complexity of the signal and the maximum amount of data our receiving device can handle. For digitized voice we have chosen an 8 kHz sampling rate and 8-bit encoding, which results in a 64 kbps bit rate (8,000 samples per second × 8 bits per sample).
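The sampling and coding steps can be sketched as follows. This is an illustration only: it uses the continuous mu-law companding curve with mu = 255 followed by a uniform 8-bit quantizer, not the bit-exact segmented encoding tables of G.711 (which also defines an A-law variant). The function names and the 1 kHz test tone are assumptions made for the example.

```python
import math

MU = 255               # companding parameter of the mu-law curve
SAMPLE_RATE_HZ = 8000  # sampling rate chosen in the text
BITS_PER_SAMPLE = 8    # code length chosen in the text


def mu_law_compress(x):
    """Map x in [-1, 1] to [-1, 1] using the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode_sample(x, bits=BITS_PER_SAMPLE):
    """Compand one sample and quantize it to one of 2**bits codes."""
    levels = 2 ** bits
    y = mu_law_compress(max(-1.0, min(1.0, x)))
    return int((y + 1.0) / 2.0 * (levels - 1) + 0.5)


def decode_sample(code, bits=BITS_PER_SAMPLE):
    """Map a code back to an approximate amplitude in [-1, 1]."""
    levels = 2 ** bits
    y = code / (levels - 1) * 2.0 - 1.0
    return mu_law_expand(y)


# Bit rate of the resulting stream: samples per second times bits per sample.
bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE

# Sample a 1 kHz test tone (an arbitrary choice) and encode it.
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE_HZ) for n in range(8)]
codes = [encode_sample(s) for s in tone]
print(bit_rate, codes)
```

Companding spends more of the 256 codes on small amplitudes, which is where most speech energy lies; a plain uniform quantizer would work with the same code but give coarser resolution for quiet passages.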

Video compression

Among the various standards, we prefer H.261, which provides a measure of compatibility across many of the different ITU recommendations. H.261 is used with communication channels whose rates are multiples of 64 kbps.

Flexible H.261 implementations can generate any bit rate, even one that is not a multiple of 64 kbps. This type of codec increases video quality in many cases. For example, if a 128 kbps session uses G.728 audio (16 kbps), more than 100 kbps is available for video (depending on data rates and overhead).
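The arithmetic behind this example can be sketched as a small helper. The function name and the overhead parameter are hypothetical; the audio rates are the nominal 64 kbps for G.711 and 16 kbps for G.728.

```python
# Nominal audio bit rates in kbps.
AUDIO_KBPS = {"G.711": 64, "G.728": 16}


def video_budget_kbps(session_kbps, audio_codec, overhead_kbps=0):
    """Bit rate left for video after audio and (assumed) overhead are removed."""
    budget = session_kbps - AUDIO_KBPS[audio_codec] - overhead_kbps
    if budget < 0:
        raise ValueError("session rate is too low for this audio codec")
    return budget


print(video_budget_kbps(128, "G.728"))                   # no overhead assumed
print(video_budget_kbps(128, "G.728", overhead_kbps=8))  # 8 kbps is an illustrative figure
print(video_budget_kbps(128, "G.711"))
```

With G.711 audio on the same 128 kbps session, only half of the channel would remain for video, which is why the choice of audio codec matters at low session rates.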

H.261 is a standard that describes the data organization for low-bit-rate visual communication over telephone lines. It uses DCT and VLC techniques for intra-frame coding and optional block-based motion-compensated (MC) coding for inter-frame coding. The frame preceding the target frame is used for MC prediction. Motion vectors take integer values in the range ±15 pixels, and the block size for matching is 16 x 16.
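Because the recommendation does not prescribe a block-matching method, the following is only one possible approach: an exhaustive search that minimizes the sum of absolute differences (SAD) over integer displacements of up to ±15 pixels for a 16 x 16 block. The function names, the SAD criterion, and the synthetic test frames are assumptions made for this sketch. The vector convention used here points from the block in the current frame to its best match in the previous frame.

```python
import random


def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block in cur and one in ref."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def best_motion_vector(cur, ref, cx, cy, size=16, search=15):
    """Full search over integer displacements within +/- search pixels."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, cx, cy, cx, cy, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Synthetic demo: the current frame is the previous frame displaced by a
# known amount, so the search should recover that displacement.
rng = random.Random(0)
H, W = 48, 48
prev_frame = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]
true_dx, true_dy = 3, -2
cur_frame = [[prev_frame[(y + true_dy) % H][(x + true_dx) % W] for x in range(W)]
             for y in range(H)]

mv, cost = best_motion_vector(cur_frame, prev_frame, cx=16, cy=16)
print(mv, cost)
```

A full search is the simplest method to reason about but also the most expensive; practical encoders often use faster search patterns, which the bit stream format allows because only the resulting vector is transmitted.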

H.261 describes the organization of the compressed bit stream, not how it should be produced. The coded data is arranged in a hierarchical structure consisting of four layers, each with its own header and a number of data blocks. The input picture is partitioned into image blocks of 8 lines by 8 pixels per line. The lowest layer, which contains the quantized transform coefficients of an image block, is called the block layer. A macroblock (MB) contains four luminance (Y) blocks and two chrominance (Cr and Cb) blocks. The MBs in three rows of macroblocks form a group of blocks (GOB). There are two picture size formats: common intermediate format (CIF) and quarter-CIF (QCIF). The following figure shows the arrangement of the various types of blocks for a CIF picture.

Figure: Organization of H.261 coded data
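The hierarchy can be made concrete by counting its units for each picture format. This sketch assumes the usual luminance dimensions of 352 x 288 for CIF and 176 x 144 for QCIF, and a GOB of 33 macroblocks (3 rows of 11); these figures are not stated in the text above, and the function name is hypothetical.

```python
# Luminance dimensions (width, height); assumed, not given in the text.
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}

BLOCK_SIZE = 8     # image block: 8 lines by 8 pixels
MB_SIZE = 16       # a macroblock covers 16 x 16 luminance pixels
MBS_PER_GOB = 33   # 3 rows of 11 macroblocks
BLOCKS_PER_MB = 6  # four Y blocks plus one Cr and one Cb block


def layout(fmt):
    """Count the blocks, macroblocks and GOBs in one picture."""
    width, height = FORMATS[fmt]
    macroblocks = (width // MB_SIZE) * (height // MB_SIZE)
    return {
        "luma_blocks": (width // BLOCK_SIZE) * (height // BLOCK_SIZE),
        "macroblocks": macroblocks,
        "coded_blocks": macroblocks * BLOCKS_PER_MB,
        "gobs": macroblocks // MBS_PER_GOB,
    }


for name in FORMATS:
    print(name, layout(name))
```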

As expected, the headers contain various pieces of information about the coded picture, such as the location of blocks in the picture, the method of coding, the quantizer step size, and motion vectors. There are four types of coding:

	Intra - the transformed and quantized representation of the original input.
	Inter - the pixel differences with respect to the previous frame; zero motion vectors are not coded.
	Inter with MC - the prediction residual with respect to the previous frame; the displacement (non-zero motion vector) is coded as part of the MB header; no specific method is recommended for block matching.
	Inter MC with filtering - the residual is filtered by a predefined filter to reduce block artifacts, which occur routinely in low-bit-rate coding.

H.261 imposes a few requirements. To prevent the build-up of prediction or transmission errors, every MB must be intra-coded at least once in every 132 consecutive frames. To limit the size of the buffer space, H.261 has a hypothetical decoding rate (HDR) requirement; for example, the transmission rate for QCIF should never exceed 64 kb/s.
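One way an encoder might meet the intra-refresh requirement is to keep a counter per macroblock and force intra coding when the limit is about to be exceeded. The class below is a hypothetical sketch of that bookkeeping, not a mechanism described by the recommendation; the demo models the worst case of an encoder that would otherwise never choose intra coding.

```python
REFRESH_LIMIT = 132  # every MB must be intra-coded within this many frames


class IntraRefreshTracker:
    """Tracks how many frames each macroblock has gone without intra coding."""

    def __init__(self, num_macroblocks, limit=REFRESH_LIMIT):
        self.limit = limit
        self.since_intra = [0] * num_macroblocks

    def must_force_intra(self, mb_index):
        # If limit - 1 frames have already passed without intra coding,
        # the next frame is the last one allowed in the window.
        return self.since_intra[mb_index] >= self.limit - 1

    def record(self, mb_index, coded_intra):
        if coded_intra:
            self.since_intra[mb_index] = 0
        else:
            self.since_intra[mb_index] += 1


# Demo with a single macroblock that is intra-coded only when forced.
tracker = IntraRefreshTracker(num_macroblocks=1)
intra_frames = []
for frame in range(400):
    forced = tracker.must_force_intra(0)
    tracker.record(0, coded_intra=forced)
    if forced:
        intra_frames.append(frame)

print(intra_frames)
```

Forcing all macroblocks at the same frame would cause a bit-rate spike, so a real encoder would probably stagger the refresh across macroblocks; the counter approach leaves room for that.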