HTTP Streaming: What You Need to Know
Forget, Relearn, Repeat
If you are considering HTTP streaming coupled with adaptive bitrate file chunking to produce HTTP adaptive bitrate streams, what do you need to forget or relearn?
First, forget what you know about keyframes. Typically, a 5–7 second keyframe "rule of thumb" has been in play as a balance between too many and too few keyframes. Most adaptive bitrate solutions have been optimized to have chunks of less than 4 seconds, meaning that a keyframe will need to occur within each chunk.
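To make the arithmetic concrete, here is a minimal sketch of how chunk duration constrains keyframe spacing. The function names and the sample frame rate are illustrative assumptions, not part of any vendor's tooling.

```python
import math


def keyframe_interval_frames(chunk_seconds: float, fps: float) -> int:
    """Frames between keyframes if every chunk begins with one keyframe.

    Fractional frame rates are rounded down, so the resulting chunk may be
    slightly shorter than the requested duration.
    """
    if chunk_seconds <= 0 or fps <= 0:
        raise ValueError("chunk_seconds and fps must be positive")
    return max(1, math.floor(chunk_seconds * fps))


def every_chunk_has_keyframe(keyframe_interval_s: float, chunk_seconds: float) -> bool:
    """True when keyframes are spaced closely enough to land in each chunk."""
    return keyframe_interval_s <= chunk_seconds


# A 2-second chunk at an assumed 30 fps needs a keyframe every 60 frames.
print(keyframe_interval_frames(2, 30))
# The older 5-second rule of thumb leaves some 4-second chunks without a keyframe.
print(every_chunk_has_keyframe(5, 4))
```

Under these assumptions, a 5-second keyframe interval fails the check for 4-second chunks, which is why the older rule of thumb has to go.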
Second, learn that you likely will have to re-encode content. Because the same content must be segmented simultaneously across multiple files of varying bitrates, it is important to use the same segmenting algorithm for every file. Even those companies that state that re-encoding is not needed will also say that audio keyframing must match up between all the files and that the audio keyframe interval must be no longer than the chunk duration; otherwise there will be audio gaps and inconsistent audio playback.
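One way to picture the alignment requirement is to compare the keyframe timestamps of each encoded file and keep only the times they share. The sketch below assumes the timestamps have already been extracted by some other tool; the rendition names and the millisecond rounding are hypothetical choices for illustration.

```python
def common_keyframe_times(renditions: dict[str, list[float]]) -> list[float]:
    """Return keyframe times (in seconds) present in every rendition.

    Times are compared at millisecond precision to avoid float noise.
    """
    if not renditions:
        return []
    per_rendition = [
        {round(t * 1000) for t in times} for times in renditions.values()
    ]
    shared = set.intersection(*per_rendition)
    return sorted(ms / 1000 for ms in shared)


# Hypothetical timestamps: the 1200 kbps file has a keyframe at 3 s, not 2 s.
renditions = {
    "300kbps": [0.0, 2.0, 4.0, 6.0],
    "1200kbps": [0.0, 3.0, 4.0, 6.0],
}
print(common_keyframe_times(renditions))
```

Only the shared times are safe places to switch bitrates. If the shared list is much shorter than any single file's list, the files were probably not encoded with matching settings and are candidates for re-encoding.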
Third, learn that the GOP is your friend. No, not the Grand Old Party, unless you’re of a particular political bent; rather the Group of Pictures (or Long GOP if you’re familiar with MPEG-2 terminology). Most adaptive bitrate solutions look at groups of pictures (GOPs) and place segment points at the beginning of each GOP; the beginning of each GOP is a full I-frame rather than a P- or B-frame.
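As a rough illustration of segmenting on GOP boundaries, the sketch below splits a sequence of frame types wherever an I-frame appears. Representing a stream as a string of "I", "P", and "B" characters is a simplifying assumption; a real segmenter reads frame types from the encoded bitstream.

```python
def split_into_gops(frame_types: str) -> list[str]:
    """Split a string of frame types into GOPs, each starting at an I-frame.

    If the sequence does not begin with an I-frame, the leading frames are
    kept together as a first, incomplete group.
    """
    gops: list[str] = []
    for frame in frame_types:
        if frame == "I" or not gops:
            gops.append("")
        gops[-1] += frame
    return gops


def segment_start_indices(frame_types: str) -> list[int]:
    """Frame indices at which a new segment could begin."""
    starts, position = [], 0
    for gop in split_into_gops(frame_types):
        starts.append(position)
        position += len(gop)
    return starts


stream = "IPBBPIBBPIP"
print(split_into_gops(stream))
print(segment_start_indices(stream))
```

Every candidate segment point falls on an I-frame, so each segment can be decoded without reference to frames in the previous one.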
Fourth, forget ASF. A staple of Microsoft streaming media for years, the ASF format has been replaced in Smooth Streaming by the MP4 file format, which is built on the ISO/IEC 14496-12 ISO base media file format specification. The MPEG-4 file format, which uses the .mp4 extension, is in turn based on Apple’s QuickTime file format. According to Microsoft, MP4 has less overhead than ASF and is easier to parse in managed (.NET) code.
Fifth, learn that MP4 is set to become the key container format for adaptive bitrate video files. Apple and Adobe have used the MP4 container for some time, as its structure supports the H.264 video codec and the Advanced Audio Coding (AAC) audio codec. While ASF can also contain H.264 video, it’s not as straightforward an implementation for H.264, according to Microsoft. In addition, the MP4 container can also handle other audio and video codecs, such as Microsoft’s VC-1 (a SMPTE standard) and Windows Media Audio (WMA).
MP4 was designed to natively support what is known as "payload fragmentation" within the file. Given the fragmentation, or chunking, of adaptive bitrate streaming, the MP4 format can be internally organized as a series of fragments.
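For readers who want to see what that organization looks like at the byte level, here is a minimal sketch that walks the top-level boxes of an ISO base media file. It relies on the standard box header (a 4-byte big-endian size followed by a 4-byte type, with size 1 signalling a 64-bit size and size 0 meaning "to end of file"). The sample bytes are synthetic placeholders built for illustration, not a playable file, and the helper names are assumptions.

```python
import struct


def iter_boxes(data: bytes):
    """Yield (type, size) for each top-level box in an ISO base media buffer."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # 64-bit "largesize" follows the type field
            if offset + 16 > len(data):
                break
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:  # box runs to the end of the buffer
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break  # truncated or malformed box; stop rather than guess
        yield box_type.decode("ascii", errors="replace"), size
        offset += size


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size header (illustration only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic layout: file-level metadata, then two fragment pairs.
sample = (
    make_box(b"ftyp", b"\x00" * 8)      # dummy payload
    + make_box(b"moov")                 # movie-level metadata (empty here)
    + make_box(b"moof", b"\x00" * 4)    # fragment metadata
    + make_box(b"mdat", b"\x01" * 16)   # fragment media data
    + make_box(b"moof", b"\x00" * 4)
    + make_box(b"mdat", b"\x02" * 16)
)
print([box_type for box_type, _ in iter_boxes(sample)])
```

In a fragmented file, the repeating moof/mdat pairs after the initial metadata are the fragments; a non-fragmented file would typically show a single moov and a single large mdat instead.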
"MP4 boxes [may be] organized in a fragmented manner," said Microsoft’s Alex Zambelli in a blog post on Smooth Streaming’s implementation of the MP4 system. "[An MP4] can be written … as a series of short metadata/data box pairs, rather than one long metadata/data pair. The Smooth Streaming file format heavily leverages this aspect of the MP4 file specification, to the point where at Microsoft we often interchangeably refer to Smooth Streaming files as ‘Fragmented MP4 files. …’"
"We say that the Smooth Streaming format is based on the MP4 file format," continued Zambelli, "because even though we’re following the ISO specification, we specify our own box organization schema and some custom boxes. In order to differentiate Smooth Streaming files from ‘vanilla’ MP4 files, we use new file extensions: *.ismv (video+audio) and *.isma (audio only)."
What’s Next? Multicasting and P2P
The use of HTTP delivery—including HTTP progressive downloads, homegrown HTTP streaming solutions used by Akamai, and adaptive bitrate solutions launched by Adobe, Apple, and Microsoft—will provide stop-gap measures when they are implemented in 2010, allowing for a larger quantity of "generic" servers to be brought to bear for large events.
Beyond these solutions, though, highly scalable solutions will require multicasting or some form of peer-to-peer delivery. Multicasting requires a coordinated agreement between router, switch, and LAN/WAN product manufacturers, as well as Tier 1 service providers. While we may eventually see multicasting, hybrid solutions will include multicasting for the LAN and unicasting for the WAN. Examples of peer-to-peer delivery solutions include a plug-in from Octoshape as well as Adobe’s forthcoming RTMFP. Peer-to-peer solutions may help address ISP-level broadcast issues at television-sized audiences, even if the pipes from the ISP to the rest of the world are fairly narrow.