Introduction
The H.264 SVC (Scalable Video Coding) is a video compression algorithm that enables effective and efficient transmission of video files over low bandwidth networks. An extension to the H.264 codec standard, H.264 SVC allows multi-platform distribution and adaptive streaming from a single file.
Developed by JVT (Joint Video Team of ITU and ISO), the SVC is an Amendment to the H.264 AVC video compression standard.
H.264/MPEG-4 AVC was developed jointly by ITU-T and ISO/IEC JTC 1. These two groups created the Joint Video Team (JVT) to develop the H.264/MPEG-4 AVC standard as an open, licensed standard for video compression. Most components of H.264 SVC are used as specified in the H.264/MPEG4-AVC standard.
H.264 SVC – Layering Technique
The SVC video stream contains multiple layers in such a way that even if parts of the video stream are removed, the resulting sub-streams forms a valid video stream. So, essentially, an SVC video stream is constructed with a minimum base layer and additional (multiple) enhanced layers.
The layering technique provides a considerable higher degree of error resiliency and video quality with no significant need for higher bandwidth. Additionally, a single multi-layer SVC video stream can support a broad a range of devices and networks.
The subset bit streams are derived by dropping packets from the larger video to reduce the bandwidth required for the subset bit stream. The subset bit steam can represent any one of the following:
- Temporal Scalability: lower temporal resolution (lower frame rate)
- Spatial Scalability: a lower spatial resolution (smaller screen)
- Quality Scalability: lower quality video signal.
Figure 1: H.264 SVC Layering Technique |
Temporal (frame rate) scalability: In temporal scalability, the motion compensation dependencies are structured so that complete pictures (i.e. their associated packets) can be dropped from the bit stream. The temporal scalability is already enabled by H.264/MPEG-4 AVC. SVC has only provided supplemental enhancement information to improve its usage. The base layer has a single video stream with a certain frame rate, and the temporal scalability adds enhancement layers with additional frame rates. For example, base layer might contain CIF-176×144 while the enhancement layers contain 2CIF-352×288 & 4CIF-704×576.
Spatial (picture size) scalability: Using spatial scalability, the video is coded at multiple spatial resolutions. The data and decoded samples of lower resolutions can be used to predict data of higher resolutions in order to reduce the bit rate to code the higher resolutions. Spatial scalability adds additional resolutions in the enhancement layers (Eg. 7.5 frames per second, 15 fps, 30 fps, etc).
SNR/Quality/Fidelity scalability: In this technique, the video is coded at a single spatial resolution but at different qualities. The data and decoded samples of lower qualities can be used to predict data of higher qualities in order to reduce the bit rate to code the higher qualities. Quality scalability improves the SNR (Signal to Noise Ratio) in the enhancement layers (Eg. 128 Kbps, 256 Kbps, 512 Kbps, etc).
Combined scalability: a combination of the three scalability techniques described above.
An SVC stream incorporates multiple streams in a single stream for transmission and storage of video. It is not always necessary to provide all types of scalability for every video stream and hence SVC stream can be customized according to the needs of an application.
Figure 2: Packet loss comparison (H.264 AVC vs. H.264 SVC) |
Advantages
- H.264-SVC has multiple advantages over other techniques:
- Videos encoded using H.264 use lesser storage space for the same video quality as compared to other techniques
- H.264 SVC helps systems to push the same video stream through networks of varying capacities
- Allows the same video stream (even in MegaPixel/HD encoding quality and high frame rate) to be viewed by different hardware clients (Control Room Video Wall, Handheld PDA etc). Each client will get only the stream layers that it can process (in terms of resolution and FPS)
- Error resilience allows continuous video stream, even in cases of network with up to 20% drops, with no obvious video loss noticed during the live stream or recording
- Low Latency makes it a preferred option for video surveillance applications like live PTZ control etc.
Conclusion
Scalability has already been present in the video coding standards MPEG-2 Video, H.264 AVC, and MPEG-4 Visual in the form of scalable profiles. However, with the provision of spatial and quality scalability, the H.264 SVC supports functionalities such as bit rate, format, and power adaptation, graceful degradation in lossy transmission environments, as well as lossless rewriting of quality-scalable SVC bit streams to single-layer H.264/AVC bit streams. These functionalities provide enhancements in transmission and storage applications. H.264 SVC has achieved significant improvements in coding efficiency with an increased degree of supported scalability relative to the scalable profiles of prior video coding standards.