Analyzing Streaming Media Server Log Files
No matter how large or small your streaming media infrastructure is, it’s critical to know how it’s performing. In fact, your existence is probably intimately tied to how well your system is working. To find out how well it’s working, you have to extract as much information as possible from your log files.
Extracting information out of your log files isn’t rocket science. After all, log files are just text. Every time a request is processed by your streaming server, an entry is written in the log file. Each entry is a single line of text, containing multiple fields, each of which contains information about the transaction. A typical log file entry contains the date and time of the request, the IP address of the viewer, the name of the file requested, and how much of the file was sent. Depending on what streaming format you’re using, you’ll have lots of other data that can be mined for more information.
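To make the "just text" point concrete, here is a minimal sketch of splitting one log entry into named fields. The field order (date, time, viewer IP, file name, bytes sent) and the space delimiter are assumptions made for illustration; every streaming server defines its own layout, so check your server’s documentation before adapting it.

```python
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    # Hypothetical field layout; real streaming servers record more fields
    # and may use a different order or delimiter.
    date: str
    time: str
    client_ip: str
    file: str
    bytes_sent: int


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one space-delimited log line into named fields.

    Returns None for blank lines, comment/header lines starting with '#',
    and lines that do not match the assumed layout.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    parts = line.split()
    if len(parts) != len(LogEntry._fields):
        return None
    date, time, client_ip, file, bytes_sent = parts
    if not bytes_sent.isdigit():
        return None
    return LogEntry(date, time, client_ip, file, int(bytes_sent))
```

Calling parse_line on each line of a log file and discarding the None results gives you a list of entries that can be tallied however you like.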
The challenge is to extract this information and present it in a useful format. Each log entry must be tallied, and some of the data may need to be manipulated to be useful. For example, all log files record how much data was sent. This is useful as an aggregate number to know exactly how much bandwidth you’re using. But on a file-by-file basis, it’s more useful to know what percentage of the file people watched. To figure this out, you need to divide the number of bytes delivered (converted to bits) by the encoding bit rate, which may or may not be recorded in another log entry field. That gives you the amount of playing time delivered, which you can then compare to the total length of the file.
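The percentage-watched arithmetic can be sketched in a few lines. Bytes delivered, multiplied by eight and divided by the bit rate, gives the seconds of media delivered; dividing that by the length of the file gives the fraction watched. The function name and the duration parameter are assumptions for illustration, since not every log format records the length of the file, and the result is only an estimate because resent or buffered data inflates the byte count.

```python
def percent_watched(bytes_sent: int, bit_rate_bps: float, duration_s: float) -> float:
    """Estimate what percentage of a file was watched.

    bytes_sent   -- bytes delivered, as recorded in the log entry
    bit_rate_bps -- encoding bit rate in bits per second
    duration_s   -- total length of the file in seconds (assumed to be known)
    """
    if bit_rate_bps <= 0 or duration_s <= 0:
        raise ValueError("bit rate and duration must be positive")
    # Bytes -> bits, then bits / (bits per second) = seconds delivered.
    seconds_delivered = bytes_sent * 8 / bit_rate_bps
    # Cap at 100 because retransmitted data can push the estimate past the end.
    return min(100.0, 100.0 * seconds_delivered / duration_s)
```

For example, 1,500,000 bytes delivered from a five-minute (300-second) file encoded at 100,000 bits per second works out to 120 seconds of media, or 40 percent of the file.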
You’ll also want to export this information to a spreadsheet format so you can compare it to other metrics you may have, and create graphs and charts to track trends over time. Thankfully, this is the sort of thing that computers are good at: boring, repetitive tasks involving a little math. Given the number of streaming media providers out there, you’d think that there would be a big market for this kind of software. As it turns out, there are only a few viable options available at this point, and we’ll look at two that are at opposite ends of the spectrum: AWStats and Sawmill.
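For a sense of what these packages automate, here is a rough sketch of the tally-and-export step: it counts views and total bytes per file and produces comma-separated text that any spreadsheet can open. The function names and the column layout are illustrative assumptions, not part of AWStats or Sawmill.

```python
import csv
import io
from collections import defaultdict


def tally(entries):
    """Count views and total bytes per file.

    entries -- iterable of (file_name, bytes_sent) pairs
    Returns a dict mapping file_name -> [views, total_bytes].
    """
    totals = defaultdict(lambda: [0, 0])
    for name, sent in entries:
        totals[name][0] += 1      # one more view of this file
        totals[name][1] += sent   # running total of bytes delivered
    return totals


def to_csv_text(totals) -> str:
    """Render the tallies as CSV text, most-viewed files first."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["file", "views", "bytes_sent"])
    ranked = sorted(totals.items(), key=lambda item: item[1][0], reverse=True)
    for name, (views, sent) in ranked:
        writer.writerow([name, views, sent])
    return out.getvalue()
```

Writing the returned text to a file with a .csv extension is enough to get it into a spreadsheet, where the charts and trend comparisons can be built.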
AWStats
www.awstats.com
Price: Open source
AWStats is an open source log-processing solution that can be configured to analyze streaming stats. It only offers a minimal set of streaming stats because it’s primarily designed for web-server log processing. However, if you’re just interested in which of your files are most popular in terms of number of views and amount of data sent, it’s a good solution.
AWStats runs on just about any OS you choose, provided it supports Perl 5.00503 or later. It’s available as a download in .EXE, .GZ, .ZIP, and .RPM formats. Installation is fairly straightforward: an install script walks you through the process. You need to specify where your log files are located and choose a name for your configuration file. Make sure you have write permissions in the directories AWStats uses, or specify directories that you’re allowed to write to.
After you’ve successfully installed AWStats, I recommend that you do your first "build" from the command line, because if anything goes wrong, the errors are printed there. Provided everything goes well, you’ll see some status text detailing the build process. When the process is complete, AWStats reports how many records were processed; how many were dropped, corrupt, or old; and finally, how many were added to the statistics database.
To view your stats, open a browser window and enter the URL for the reports interface. The URL will look something like this:
http://www.yourdomain.com/awstats/awstats.pl?config=mediastats
The default AWStats view is a summary of the current month, with a summary of the month’s activity at the top of the page. Below that is a graph of the current year’s activity, comparing unique visitors, hits, and bandwidth used. As you scroll down the page, you can also see traffic broken down by date, day of the week, hour of the day, country, referring host, top ten most popular files, operating system, and browser. If you want more detail, click the Full List link at the top of any summary section, or use the navigation at the left.