Tutorial: How to Leverage IBM Watson Media’s Latest Interactive Webcasting Features
Captions
One of the benefits of live streaming on IBM Watson Media is being able to harness the artificial intelligence power of Watson to generate captions synchronized with video. I haven’t performed any rigorous tests to compare Watson-powered IBM captioning to other solutions, but I can tell you that you still need to review and edit the transcript if you want accurate captions.
When my clients require captions, I still outsource the captioning to human transcribers. Even so, I like being able to generate captions automatically and use them as a starting point if I decide to handle the captions in-house. The best part of automatically generated captions is that the initial timing decisions are already made for you, and it is very easy to edit your caption file with the video on one side and the caption text on the other side (Figure 4, below).
Figure 4. Editing AI-generated captions side by side with the video
I know artificial intelligence (AI) and machine learning are still works in progress, but I was a bit surprised by some of the mistakes I encountered, which I would have hoped machine learning could avoid. The video I used to test these features for this review shows me demonstrating the over/under method of cable-wrapping. Watson correctly spelled “wrapped” the first time I said it, but the next time, in the same sentence, it spelled “wrap” as “rap.” It also doesn’t add any punctuation.
Editing the captions is a breeze. The process is assisted by spell-check and automation features that I found more intuitive than those in Word. Yellow lines underline the areas that Watson wants you to take a look at first, including the incorrect “rap” (Figure 5, below).
Figure 5. Watson underlines words in captions that it thinks you should check.
When I started to add punctuation, I was pleasantly surprised that when I added a period, the next letter was automatically capitalized. This is hardly mind-blowing behavior, but it’s smarter than the default setting in Word. Viewers can toggle the captions on and off easily, similar to how other video players handle captions.
Exporting the captions is easy too. The only available export format is WebVTT (Web Video Text Tracks, saved as a .vtt file), a W3C format for displaying timed text with the HTML5 <track> element. This format isn’t supported by my NLE, Adobe Premiere Pro. I found several online VTT-to-SRT conversion tools, but I didn’t test them, so I cannot comment on this workflow.
Incorporating advanced interactivity and accessibility features into webcasts and subsequent on-demand videos is becoming increasingly important. The IBM Watson live-stream solution offers integrated options to handle these needs. These features still need a bit of work, but as long as you understand the current limitations, they can be useful for producers and viewers alike.