A Buyer's Guide to Multiformat Streaming Media Servers
2012 saw significant progress on several fronts for media servers. Some changes were small but important, such as naming conventions -- Adobe dropped Flash from its media server names, for instance -- while others will have a far greater impact on the industry going forward, chief among them the near-universal agreement that Dynamic Adaptive Streaming over HTTP (DASH) is worth supporting.
In this year's Buyer's Guide, we'll take a look at a few key features you'll need to know about to make an informed decision.
DASH
Without a doubt, the move toward DASH is a watershed for the industry: Not only does it address adaptive bitrate (ABR) streaming to handle intermittent network delivery and bandwidth uncertainty, but it also uses "plain vanilla" HTTP servers to send segments (also called chunks or fragments) to DASH-compliant players.
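To see why plain HTTP suffices, consider how little the server has to do: the player requests segments and adapts on its own. Below is a minimal Python sketch of that client-side loop; the rendition URLs and the throughput-based switching rule are hypothetical illustrations, not part of any DASH profile.

```python
# A minimal sketch of why DASH needs only a plain HTTP server: the client
# does all the adaptive work. Segment URLs and bitrates are hypothetical.
import time
import urllib.request

RENDITIONS = {  # bitrate (bits/sec) -> hypothetical segment URL template
    400_000:   "http://example.com/video/400k/segment_{n}.m4s",
    1_200_000: "http://example.com/video/1200k/segment_{n}.m4s",
    3_000_000: "http://example.com/video/3000k/segment_{n}.m4s",
}

def fetch_segment(url):
    """Download one segment; return (bytes, measured throughput in bits/sec)."""
    start = time.time()
    data = urllib.request.urlopen(url).read()
    elapsed = max(time.time() - start, 1e-6)
    return data, len(data) * 8 / elapsed

def play(num_segments):
    bitrate = min(RENDITIONS)              # start conservatively at the lowest rate
    for n in range(num_segments):
        data, throughput = fetch_segment(RENDITIONS[bitrate].format(n=n))
        # decode_and_render(data)          # hand the segment to the player
        # pick the highest rendition the measured bandwidth can sustain
        bitrate = max((b for b in RENDITIONS if b <= throughput * 0.8),
                      default=min(RENDITIONS))
```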
So the obvious first question, if we're all moving to HTTP, is this: Why use a media server at all? The answer primarily has to do with the transition period we are in: Many legacy devices and players are incapable of playing DASH content, and others -- iOS devices chief among them -- won't support DASH in the foreseeable future, so servers are still needed to convert between formats.
Note that most servers are not converting between codecs, since H.264 is the dominant codec today -- one panelist at a recent media summit declared that 2012 was the year WebM died. Still, even for those working from elementary MP4 files, the need to package content into the various segmented HTTP formats is growing rather than declining.
A good media server will hold on-demand files in a mezzanine format, one that can be packaged into one or more of Adobe's HTTP Dynamic Streaming (HDS), Apple's HTTP Live Streaming (HLS), Microsoft Smooth Streaming, and a few other formats. The idea is to keep the mezzanine file intact -- now possible in all the major HTTP formats, since HLS allows for byte-range segmentation -- so that the server doesn't need to store a vast sea of tiny segment files.
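Byte-range segmentation is what makes the intact-mezzanine approach work: a playlist can point at offsets within one file rather than at thousands of tiny files. Here is a rough Python illustration of fetching one such virtual segment with a standard HTTP Range request; the URL and byte offsets are invented for the example.

```python
# Sketch of byte-range segmentation: one intact mezzanine file on a stock
# HTTP server, with each "segment" fetched as a byte range. The URL and
# offsets are hypothetical; a real player reads them from the playlist.
import urllib.request

def fetch_range(url, offset, length):
    """Fetch one virtual segment via an HTTP Range request."""
    req = urllib.request.Request(
        url, headers={"Range": f"bytes={offset}-{offset + length - 1}"})
    with urllib.request.urlopen(req) as resp:
        assert resp.status == 206      # 206 Partial Content
        return resp.read()

# e.g. the kind of range an HLS playlist expresses with EXT-X-BYTERANGE
segment = fetch_range("http://example.com/movie.mp4",
                      offset=564_000, length=188_000)
```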
In addition, several media servers are taking advantage of the DASH Live Profile, allowing content to be streamed live to DASH and HLS clients simultaneously through transmultiplexing (transmuxing). Some solutions reaching the market in late 2012 and early 2013 will also allow transrating of a single highest-common-denominator file into several lower bitrates.
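Transmuxing is exactly what it sounds like: the compressed audio and video bits are copied into a new container without re-encoding. As a rough illustration, the following Python snippet shells out to ffmpeg (assuming it is installed) to repackage an H.264/AAC mezzanine file into HLS segments; the file names are hypothetical, and a real media server does this internally and in real time.

```python
# Sketch of transmuxing: repackage H.264/AAC into HLS segments without
# re-encoding, by copying the codec data into a new container.
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "mezzanine.mp4",   # one intact source file
    "-c", "copy",            # copy audio and video bits; no transcoding
    "-f", "hls",
    "-hls_time", "10",       # target segment duration in seconds
    "playlist.m3u8",
], check=True)
```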
Closed Captioning
A recent law in the United States has pushed media server companies to offer closed-captioning functionality. The solutions range from timed-text formats -- such as SAMI, SMIL, and Timed Text Markup Language (TTML, along with its less-robust TTML Lite variant) -- to traditional CEA-608/CEA-708 compliance.
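For a sense of how lightweight timed text can be, here is a sketch that emits a bare-bones TTML document from a list of cues; the cue data is invented, and a production workflow would add styling, regions, and language metadata.

```python
# A minimal sketch of emitting timed text as TTML. Cue times and text are
# hypothetical; real workflows carry styling and region metadata too.
from xml.sax.saxutils import escape

def to_ttml(cues):
    """cues: list of (begin_seconds, end_seconds, text) tuples."""
    body = "\n".join(
        f'      <p begin="{b:.3f}s" end="{e:.3f}s">{escape(t)}</p>'
        for b, e, t in cues)
    return (f'<tt xmlns="http://www.w3.org/ns/ttml">\n'
            f'  <body>\n    <div>\n{body}\n    </div>\n  </body>\n</tt>')

print(to_ttml([(0.0, 2.5, "Welcome back."), (2.5, 5.0, "Let's get started.")]))
```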
The initial Sept. 30, 2012, deadline imposed by the Federal Communications Commission (FCC) does not make captioning mandatory for all online content, but it does get the ball rolling on the broader move toward captioned online video.
One of the main issues facing media server companies and consumer electronics (CE) manufacturers alike is settling on common timed-text standards. The good news is that the vast majority of streaming-capable devices can also display subtitles. Netflix notes that a few of the devices it serves -- some older Blu-ray players, TVs, and set-top boxes -- cannot, with no upgradeable firmware available, but all new devices, plus a number of smartphone and tablet apps, are captioning- and subtitling-aware.
Addressing all of these devices is just the kind of service a good media server provides, so check the captioning options on your media server of choice before you buy: closed captioning will only grow in importance for online video from here on out.
DRM and Encryption
As we've moved closer to a common delivery protocol (HTTP) and a Common File Format (CFF, based on the ISO Base Media File Format), there is also a move toward a common encryption scheme (CES), which will allow a single encrypted file to be used with any of five common DRM schemes.
In addition, several media servers are offering the ability to move DRM "down the stack" to reside at the encoder, so that DRM and encryption can be applied before the source ever leaves the encoder for the media server. In this way, content can remain encrypted through the ingest and delivery portions of the workflow, only being decrypted and decoded at the client player.
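The underlying mechanics are simpler than the licensing politics: the media samples are encrypted once, typically with AES-128 in CTR mode, and each DRM system then delivers the same content key under its own license protocol. The sketch below shows that one-time encryption step in Python using the cryptography library; the key and IV handling is illustrative only, not a packager implementation.

```python
# Sketch of the common-encryption idea: encrypt the media payload once with
# AES-CTR, then let each DRM system deliver the same key under its own
# license protocol. Key/IV handling here is illustrative, not a spec.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def encrypt_sample(key: bytes, iv: bytes, sample: bytes) -> bytes:
    """AES-128 in CTR mode, applied once to the media sample."""
    encryptor = Cipher(algorithms.AES(key), modes.CTR(iv)).encryptor()
    return encryptor.update(sample) + encryptor.finalize()

key = os.urandom(16)   # content key, later wrapped by each DRM's license server
iv = os.urandom(16)    # illustrative IV; real packagers derive IVs per spec
protected = encrypt_sample(key, iv, b"raw access unit bytes")
```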
Apple's limitations on developers -- banning them from using the unique device identifier (UDID) -- mean that HLS doesn't allow DRM integration the way that fragmented MP4 (fMP4) does. As such, when considering a media server, ask how it will handle DRM for HLS-based devices.
And in the Darkness Bind Them
If there's no "one ring to rule them all" when it comes to DRM, there's also no need to multiplex (or mux) content early in the delivery process. Several media servers offer -- or will soon offer -- "late binding" functionality.
The term "late binding" primarily refers to the ability to keep the audio and video "tracks" of an "online DVD" or stream separate from one another until the last possible moment. In some instances this binding occurs at the client player, with the media server keeping track of which audio track (e.g., language) will be paired with a particular video stream.
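A toy sketch makes that bookkeeping concrete: the server stores video renditions and audio languages as separate tracks, and pairs them only when a client asks. The track names and the manifest shape below are invented for illustration.

```python
# A toy sketch of server-side bookkeeping for late binding: video and audio
# live as separate tracks, and a language is paired with the video only
# when a client requests it. Names and manifest shape are hypothetical.
VIDEO_TRACKS = ["video_2500k.mp4", "video_1200k.mp4"]
AUDIO_TRACKS = {"en": "audio_en.mp4", "es": "audio_es.mp4", "fr": "audio_fr.mp4"}

def build_manifest(language: str) -> dict:
    """Pair the requested audio track with the video renditions at request time."""
    audio = AUDIO_TRACKS.get(language, AUDIO_TRACKS["en"])  # fall back to English
    return {"video": VIDEO_TRACKS, "audio": audio}

# Only the Spanish track is sent; the other languages never leave the server.
print(build_manifest("es"))
```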
In the case of HLS, late binding is not really possible, thanks in no small part to the MPEG-2 Transport Stream (M2TS) heritage of HLS. The requirement to multiplex M2TS into single, interleaved audio-video packets grew out of the need to send primary and secondary audio tracks for broadcast, the first industry to use M2TS. DVDs, which also multiplex audio and video, solidified the approach, but the more flexible fragmented MP4/CFF approach has made late binding possible.
Expect the Common Streaming Format (CSF), based on the ISOBMFF, to make late binding a common media server feature: It eliminates the duplication and repackaging of video for each audio track, letting content providers offer multiple language tracks for a given video asset without sending every audio alternative at the same time.
Conclusion
We've just scratched the surface of today's multiformat media server options. Other areas to consider include complementary player functionality in apps -- such as the ability to play HLS content on Android devices running operating system versions older than those Google officially supports -- as well as HTTP caching origin server functionality and dynamic re-encoding on the media server itself.
Media servers will continue to expand their functionality, including support for the upcoming DASH264 subset specification. Stay tuned to StreamingMedia.com to stay abreast of these enhancements.
Five Questions to Ask When Selecting a Media Server
- Do I need multiple streaming formats and adaptive bitrate, or just one of the two?
- Will I use two or more adaptive bitrate formats?
- Is DASH important for my current media server workflow?
- Will HLS be a critical delivery requirement?
- What about RTMP?
This article appears in the forthcoming 2013 Streaming Media Industry Sourcebook.