
Tutorial: High-Touch Encoding With Microsoft Expression Encoder 2


Denoise: Off—The source doesn’t have a hint of noise. If there are a lot of textures, Denoise can help to soften them some for easier encoding, but there’s not much texture here, either.

Noise Edge Removal: Off—This is really only useful for noisy edges of analog captures, and even then we’re better off cropping. It obviously doesn’t apply here.

Group of Pictures
B-Frame Number: 2—This sets the number of consecutive B-frames between I- and P-frames in the encode. Normally we’d use one for film and video sources, but for these kinds of motion graphics two is more efficient. This gives us an IBBP pattern, so each B-frame is adjacent to an I- or P-frame. Three B-frames is less efficient in this case: with the IBBBP pattern the middle B-frame is two frames away from a reference frame (only I- and P-frames can be reference frames), and the P-frames are farther apart, so each one has to store four frames’ worth of change since the previous I- or P-frame instead of three. Since the worst-case latency for random access is determined by the maximum number of P-frames per GOP, increasing the number of B-frames will improve latency at a given GOP size. With 15 seconds between keyframes at 30 fps, two B-frames gives us the same random access latency as 5 seconds without B-frames (the old Windows Media Encoder default).
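
The latency comparison can be checked with a few lines of arithmetic. This is only a sketch under a simplified model: it assumes a fixed number of B-frames between reference frames and treats random access cost as proportional to the count of I- and P-frames in a GOP. The function name is made up for illustration and is not part of Expression Encoder.

```python
def reference_frames_per_gop(gop_seconds, fps, b_frames):
    """Count I- and P-frames in one GOP, assuming a fixed number of
    B-frames between reference frames (a simplified model)."""
    total_frames = gop_seconds * fps
    # One frame in every (b_frames + 1) is a reference frame.
    return -(-total_frames // (b_frames + 1))  # ceiling division

# 15-second GOP at 30 fps with two B-frames (IBBP pattern)
with_b = reference_frames_per_gop(15, 30, b_frames=2)
# 5-second GOP at 30 fps with no B-frames
without_b = reference_frames_per_gop(5, 30, b_frames=0)

print(with_b, without_b)  # 150 150
```

Both cases come out to 150 reference frames, which is why the two settings have comparable worst-case random access cost in this model.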

Scene Change Detection: On—This will give us natural keyframes where needed. The codec seems to do a good job of putting them in the right place. I’ve never changed this in Expression Encoder.

Adaptive GOP: On—Adaptive GOP allows the length of a GOP to vary with content. Always have this on.

Closed GOP: Off—This is required to be on for CBR encodes in Expression, but it slightly reduces the quality with VBR encoding. In particular, Closed GOP can increase keyframe popping. An Open GOP pattern starts with B-frames before the first keyframe or I-frame: you get BBIBBPBBP and so on, with the leading B-frames able to reference the last P-frame of the previous GOP. This helps smooth over changes between GOPs, since you have the leading B-frames to spread the change over.
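
To make the difference concrete, here is a small sketch that prints the frame types of one GOP in display order for both modes. The helper and its parameters are hypothetical and only model the pattern described above; a real encoder with Adaptive GOP and scene detection will vary the layout.

```python
def gop_pattern(p_frames, b_frames, open_gop):
    """Frame types for one simplified GOP, in display order."""
    # An open GOP leads with B-frames that can reference the
    # last P-frame of the previous GOP.
    lead = "B" * b_frames if open_gop else ""
    return lead + "I" + ("B" * b_frames + "P") * p_frames

open_pattern = gop_pattern(2, 2, open_gop=True)
closed_pattern = gop_pattern(2, 2, open_gop=False)

print(open_pattern)    # BBIBBPBBP
print(closed_pattern)  # IBBPBBP
```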

Motion Estimation
Chroma Search: Full True Chroma—The encode is so fast that there’s no reason to not go for the full-meal deal and do Full True Chroma.

Match Method: SAD—For content with very simple, flat areas, the Sum of Absolute Differences (SAD) Motion Match is actually both higher quality and faster than either Hadamard or my normal video/film default of Adaptive.
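
For readers who haven’t seen it, SAD is about the simplest match metric there is: subtract two blocks pixel by pixel and add up the absolute differences. The sketch below uses tiny 2x2 blocks of made-up luma values; it illustrates the metric itself, not the codec’s actual implementation.

```python
def sad(block_a, block_b):
    """Sum of Absolute Differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )

current = [[10, 10], [10, 200]]
close_match = [[10, 10], [10, 190]]   # differs in one pixel by 10
poor_match = [[50, 50], [50, 50]]     # flat gray, differs everywhere

print(sad(current, close_match))  # 10
print(sad(current, poor_match))   # 270
```

The lower the score, the better the match, so the motion search would prefer the first candidate here.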

Search Range: Adaptive—The smallest range works for most of the frames, but there’s some very fast motion when the cards zoom in that needs the bigger range. This is where Adaptive can really improve efficiency, so Adaptive it is.
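
Why the range matters can be shown with a toy exhaustive block search. This is a sketch under my own assumptions (8x8 frames, a 2x2 white square, SAD as the cost) and says nothing about how Expression Encoder’s Adaptive mode chooses its range. All names are hypothetical.

```python
def sad(a, b):
    """Sum of Absolute Differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def crop(frame, top, left, size):
    """Cut a size x size block out of a frame."""
    return [row[left:left + size] for row in frame[top:top + size]]

def frame_with_square(top, left, height=8, width=8, size=2):
    """Black frame with one white square at (top, left)."""
    frame = [[0] * width for _ in range(height)]
    for y in range(top, top + size):
        for x in range(left, left + size):
            frame[y][x] = 255
    return frame

def best_motion_vector(prev, cur, top, left, size, search_range):
    """Try every offset within +/- search_range and return
    (cost, dy, dx) for the best match in the previous frame."""
    target = crop(cur, top, left, size)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > len(prev) or x + size > len(prev[0]):
                continue  # candidate block falls outside the frame
            cost = sad(target, crop(prev, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best

prev = frame_with_square(1, 1)
cur = frame_with_square(1, 5)   # the square moved 4 pixels to the right

small = best_motion_vector(prev, cur, 1, 5, 2, search_range=1)
large = best_motion_vector(prev, cur, 1, 5, 2, search_range=4)
print(small[0], large)  # 1020 (0, 0, -4)
```

With a range of 1 the search never reaches the square’s old position, so the best cost stays high; with a range of 4 it finds an exact match. Fast zooms like the ones in this clip are the large-motion case.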

Output
The Output pane has some of my favorite usability features in Expression Encoder, allowing for the application of rich templates and automatic publishing.

First, the template. I picked the Clean template, which has a nice, subtle overlay control, and a pop-up navigation menu (via the thumbnails we made previously) when mousing over the top of the window. It also supports switching to full screen with a double-click. One thing I like about Clean is that the video fills the frame exactly without having to account for the control bar or other elements. So I can embed at exactly 640x480 for a 640x480 clip.

The publish mode (I’ve got the optional Silverlight Streaming publishing plug-in installed; check my blog at http://on10.net/blogs/benwagg for the URL of the latest version) lets me upload the final project to our Silverlight Streaming Service. This is a great way to test or deliver Silverlight projects. You can sign up for a free account with 10GB of storage and 5TB per month of bandwidth.

Before and After
So, how much did all this help? Here are a couple of the more pronounced before and after shots. All are inserted as 100% scale PNG, so there’s no scaling or further compression to complicate the comparison. Note that the FLV came out darker for some reason. I’m not sure what the cause of that was; the VC-1 brightness matches the source. Perhaps something to do with the Mac versus Windows gamma difference on the platform the FLV was encoded on? This actually makes VC-1’s job relatively harder, since the motion graphics are easier to see.

And you can see the actual clips in action online at http://on10.net/blogs/benwagg.

Detail Improvements
I grabbed a frame right after the transition that really shows the detail difference between VP6 and VC-1; it’s especially striking in the texture of the shirt. The VP6 gets sharper after a keyframe pop, but this is how it starts. VC-1 quality in the card is maintained perfectly throughout.

Deinterlacing Improvements
In another two frames, you can see the effect of my blend deinterlace to hide the fields. Notice the ringing artifacts in the original frame. Encoding fields as progressive is extremely challenging for codecs, since you have high-motion 1-pixel horizontal lines combining high frequency and high detail. I normally don’t like doing a blend, since those double images are also hard to encode, but it was only for a very short duration in this clip, and the deinterlacing filters had a lot of trouble preserving the text perfectly.
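
As a rough illustration of what a blend does, the sketch below averages each line with the one beneath it so that both fields contribute to every output line. This is one common form of blend deinterlacing and is an assumption on my part; the filter I actually used may differ in detail. The function name and the tiny test frame are made up.

```python
def blend_deinterlace(frame):
    """Average each line with the line below it, mixing the two fields."""
    out = []
    for y, row in enumerate(frame):
        # The last line has no neighbor below, so it is kept as-is.
        below = frame[y + 1] if y + 1 < len(frame) else row
        out.append([(a + b) // 2 for a, b in zip(row, below)])
    return out

# A combed frame: even lines come from one field, odd lines from the other,
# and the bright pixel has moved between fields.
combed = [[200, 0], [0, 200], [200, 0], [0, 200]]

blended = blend_deinterlace(combed)
print(blended)  # [[100, 100], [100, 100], [100, 100], [0, 200]]
```

The 1-pixel combing is gone, but the bright pixel now shows up in both positions at half strength, which is the double image mentioned above.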

So there you have it. Like I said, you won’t want or need to employ this sort of high-touch encoding on all of your projects, but some projects demand it, and the results will pay off.
