No Code, No Kidding: Simplify Your Live Streaming with Norsk Studio
We launched Norsk at Streaming Media East last year. Since day one, id3as has built large-scale, complex live media workflows for companies like NASDAQ, DAZN, and Arqiva. We've always wanted to bottle that magic so that anybody could build the kind of workflows we've been producing without having to deal with the underlying complexity. In particular, we think there's a lack of ambition in some solutions, because you take something off the shelf and live with whatever its capabilities are. Dipping your toe into custom live media workflows is an extremely complex task, and Norsk is all about removing that complexity.
There are two halves to Norsk. One is the media manipulation side, and that's where we'll focus today. Norsk Media itself is just a container. What you'll sometimes hear me refer to as the "green code" is the code that our customers and/or systems integrator partners write to put together a specific workflow for a given use case.
But an amazing media workflow is of no use to anybody unless it's running somewhere. So there's an optional side to Norsk as well: our management capability, which instantiates and runs those workflows on hardware, especially accelerated hardware. We have very wide support for accelerators from NETINT, AMD, and Nvidia, in cloud environments such as Oracle and AWS, but also on-premises and in increasingly common hybrid-type solutions. In this tutorial, though, we'll concentrate on the visual side of things with Norsk Media.
So there are two different ways that you can access Norsk Media. One is, as I mentioned, the low-code SDK, and we'll just touch on that a little bit because Studio builds on top of that capability. You can actually program Norsk in any programming language that's gRPC-capable, which is just about any programming language, but nearly everybody uses our TypeScript SDK. We'll touch on that initially, and then we'll look at how it's being extended to a visual paradigm and a no-code capability as well. So, forgive the code, but it's super straightforward.
Norsk SDK
This is Norsk SDK 101. This is about the simplest application you can put together.
So it starts by saying, in this first line, "I'm using Norsk." Then I'm going to have something called Input, which is an SRT input. I'm then going to have something called Output, which is WHEP, the WebRTC-HTTP Egress Protocol, so just think of it as WebRTC.
The last line is joining the dots. It says, "I want my output to subscribe to the input, in particular taking the audio and video." And that's it. I've got an input, I've got an output, and I've joined the dots. That's all there is to it.
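The slide code is not reproduced in this write-up, so here is a minimal sketch of the "input, output, join the dots" shape described above. It deliberately does not use the Norsk SDK: `MediaNode`, `Subscription`, and the node ids are hypothetical stand-ins invented for illustration, and the real SDK calls will differ.

```typescript
// Hypothetical stand-ins for illustration only; these are NOT the Norsk SDK types.
type StreamKind = "audio" | "video";

interface Subscription {
  source: MediaNode;
  kinds: StreamKind[];
}

class MediaNode {
  readonly id: string;
  readonly produces: StreamKind[];
  readonly subscriptions: Subscription[] = [];

  constructor(id: string, produces: StreamKind[]) {
    this.id = id;
    this.produces = produces;
  }

  // "Joining the dots": declare which streams this node takes from which source.
  subscribe(subscriptions: Subscription[]): void {
    for (const sub of subscriptions) {
      const missing = sub.kinds.filter((k) => !sub.source.produces.includes(k));
      if (missing.length > 0) {
        throw new Error(`${sub.source.id} does not produce: ${missing.join(", ")}`);
      }
      this.subscriptions.push(sub);
    }
  }

  // One human-readable line per subscription.
  describe(): string[] {
    return this.subscriptions.map(
      (s) => `${s.source.id} --[${s.kinds.join("+")}]--> ${this.id}`
    );
  }
}

// An SRT input that produces audio and video, and a WHEP output that consumes them.
const input = new MediaNode("srt-input", ["audio", "video"]);
const output = new MediaNode("whep-output", []);

// The one line that matters: the output subscribes to the input's audio and video.
output.subscribe([{ source: input, kinds: ["audio", "video"] }]);

console.log(output.describe().join("\n"));
```

The point of the sketch is the shape: two declarations and one subscription, with nothing about codecs or encapsulation anywhere in the program.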
When you're operating at scale, we've been very firm believers that real transparency and access to runtime information is super important. That's what allows you to go from "I have 1,000 events running concurrently and they're all really happy" to "I need to go and talk to this particular CDN endpoint to check out why it is that they're not accepting the stream we're trying to send them." To that end, we support a standard called OpenTelemetry, which means that it's very, very easy to pick up runtime information from a Norsk system into anything.
But one of the ways we visualize that data is using the Norsk workflow visualiser. So if you visualize that last program on the previous slide, what it says is, "We'd like a transport stream over SRT coming in. We'll take the audio and the video from that and we'll publish it as WebRTC." So that's all fantastic.
Those of you unlucky enough to be familiar with the details of WebRTC will hopefully be throwing your hands up in horror at this stage, going, "Adrian, that program has absolutely no chance of working," and you'd be right. But the good news is, Norsk knows that, so you don't have to. What you see by default is what you asked for. If you want to know what Norsk actually had to do in order to deliver that workflow, you can click on the little "Show conversions" button at the top of the screen, and it says, "Well, this is what you asked for, this is what I actually did."
So in particular, WebRTC only supports one audio codec, and that's called Opus. You can't even put Opus into a transport stream, at least not without using non-standard extensions. So on the face of it, this program is doomed to failure, other than the fact that Norsk knows it's doomed to failure and will take appropriate steps to make sure that it works for you. So it says, "It turns out you sent me some AAC as part of that SRT source. I converted that into Opus for you and published that audio. You also sent me some video, which is H.264, and that's great." WebRTC supports H.264. You might've sent MPEG-2 video, for example, which it doesn't support, in which case Norsk would've done a similar job converting it for you. "And that H.264 was nearly right. It was MP4-encapsulated, whereas WebRTC insists on Annex B encapsulation, so I fixed the encapsulation for you."
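For readers who do want to see what "fixing the encapsulation" involves, here is a minimal sketch of the byte-level idea, not Norsk's implementation. MP4-style H.264 prefixes each NAL unit with its length, while the Annex B byte stream separates NAL units with a start code. The function name `lengthPrefixedToAnnexB` is hypothetical, the 4-byte length field is an assumption (it is the common case, but the real width comes from the stream's configuration record), and a complete converter would also need to carry the SPS/PPS parameter sets in-band.

```typescript
// Sketch only: convert one length-prefixed H.264 sample into Annex B form.
// lengthSize is the width of the NAL length field in bytes (assumed to be 4 here).
function lengthPrefixedToAnnexB(sample: Uint8Array, lengthSize: number = 4): Uint8Array {
  const startCode = [0, 0, 0, 1];
  const out: number[] = [];
  const view = new DataView(sample.buffer, sample.byteOffset, sample.byteLength);
  let offset = 0;

  while (offset < sample.byteLength) {
    if (offset + lengthSize > sample.byteLength) {
      throw new Error("truncated NAL length field");
    }
    // Read the big-endian length prefix.
    let nalLength = 0;
    for (let i = 0; i < lengthSize; i++) {
      nalLength = nalLength * 256 + view.getUint8(offset + i);
    }
    offset += lengthSize;

    if (offset + nalLength > sample.byteLength) {
      throw new Error("truncated NAL unit");
    }
    // Replace the length prefix with a start code, then copy the NAL payload.
    out.push(...startCode);
    for (let i = 0; i < nalLength; i++) {
      out.push(view.getUint8(offset + i));
    }
    offset += nalLength;
  }
  return Uint8Array.from(out);
}

// Two made-up NAL units: a 2-byte one and a 3-byte one.
const mp4Sample = Uint8Array.from([0, 0, 0, 2, 0x67, 0x42, 0, 0, 0, 3, 0x68, 0xce, 0x3c]);
const annexB = lengthPrefixedToAnnexB(mp4Sample);
console.log(Array.from(annexB));
```

In the example, the two length prefixes (`0 0 0 2` and `0 0 0 3`) are each replaced by the start code `0 0 0 1`, and the payload bytes pass through untouched.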
Now, if words like Annex B encapsulation mean nothing to you, A) I'm very jealous, and B) that's exactly the point. The fact that WebRTC requires Annex B encapsulation is of no value to your customers. The people consuming this media experience just want a stream that has fantastic content, great quality, appropriate latency, and so on. So these kinds of details are a technical cost of doing business as opposed to anything that adds value to the end product. Norsk is there to get rid of all of that and allow you to concentrate on the end-to-end consumer experience that you're creating.
So here's a Norsk program that does HLS and DASH publication via CMAF. Again, we're using Norsk. This time, we're taking the content in via RTMP. We're going to create an audio CMAF packager, a video CMAF packager, and a multivariant playlist that sits on top of them.
And the rest is just joining the dots. I want the audio CMAF packager to take the audio from the input. I want the video CMAF packager to take the video from the input, and I want the multivariant playlist to take the outputs from the two CMAF packagers. So we've got a few components and we're drawing lines between them. That's all there is to it.
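The wiring just described can be written down as plain data. This is a sketch of the graph only, with hypothetical node names and a hypothetical `Edge` shape; it is not Norsk SDK code and performs no packaging.

```typescript
// Hypothetical node names and edge shape, for illustration only (not the Norsk SDK).
type Edge = { from: string; to: string; carries: "audio" | "video" | "segments" };

const edges: Edge[] = [
  { from: "rtmp-input", to: "audio-cmaf", carries: "audio" },
  { from: "rtmp-input", to: "video-cmaf", carries: "video" },
  { from: "audio-cmaf", to: "multivariant", carries: "segments" },
  { from: "video-cmaf", to: "multivariant", carries: "segments" },
];

// List the nodes that feed a given node.
function sourcesOf(node: string, graph: Edge[]): string[] {
  return graph.filter((e) => e.to === node).map((e) => e.from);
}

console.log(sourcesOf("multivariant", edges));
```

Four edges cover the whole program: one input fans out to two packagers, and both packagers feed the playlist.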
At heart, that's all there is to Norsk SDK. You've got a whole bunch of inputs, you've got a whole bunch of outputs, and then you can do stuff in the middle and you can manipulate the audio, you can compose various videos together. You can switch from one video stream to another. You can call out AI services like transcription and translation services.
What Norsk allows you to do is just to say, "I'd like to go from SRT and then do something with the audio levels and publish it out." And you just express your desire and none of the detail gets in the way. That approach allows you to build really quite sophisticated applications in a very straightforward way.
Remote Production with Norsk
Here's a remote production application built on top of Norsk. Norsk itself is not a remote production app. It just makes building these kinds of apps very simple.
The first view is the scene director's view. The scene director takes care of the green room and assembles shots into a storyboard to say, "Well, initially we'll have a sting, and that on its own." The next scene after that is Dom or somebody introducing the show. And the next scene after that is Dom interviewing Tim Siglin or whoever else it might be. Those scenes, as they get published, are made available to the vision director. The vision director can copy down whichever scene they want into their working space, see all the shots available in the working space, drag shots into preview, and then push them live.
And some of these shots are direct, some of them have got picture in picture. We've got lower-third overlays and all this stuff happening. And this was a preexisting app. The UI for this is reasonably complex. It's a couple of thousand lines of code. It used to be that the media code associated with this app was over 10,000 lines of code in order to be able to deliver this.
Porting this particular application to Norsk took two or three days and resulted in under 500 lines of media code, because it allowed the person working on the code to describe intent and not have to deal with any of the detail, while still having everything needed to deliver these very sophisticated capabilities.
Norsk Studio
Let's jump into Norsk Studio. And rather than show you slides of a no-code environment, I'm going to jump in and actually just show you a no-code environment.
The principles behind Norsk Studio are much the same. You've got a whole bunch of inputs, you've got a whole bunch of outputs, and then you've got a whole bunch of things in between. Now, one of the design philosophies that's really important about Norsk Studio is that Norsk Studio is inherently extensible. So Norsk Studio itself is actually entirely open source with a very permissive license. It just uses the Norsk SDK. There are no special privileges it has. It's not doing anything behind the scenes somebody else couldn't get access to. And our view of visual environments is that visual environments are fantastic until they're utterly terrible.
And so in order to make Norsk Studio a genuinely usable and powerful tool, what you need to do is make it super easy to extend. So, if there's a piece of business logic, for example, that isn't there out of the box, it's very, very easy for third parties to add that. Most of the components in the copy of Studio I'm running are on the order of about 10 lines of code. It's really very, very straightforward.
Norsk Studio demo:
So that's a 10,000-foot view of the capabilities that Norsk has. Hand in hand with that, one of the things we take incredibly seriously is automated efficiency. There's some very compelling hardware starting to come onto the market right now: ASICs that will be able to deliver certainly tens of 1080p outputs in one card. And those kinds of numbers are starting to push towards a hundred, which means that I can easily construct a server capable of delivering tens or even hundreds of channels at very, very low energy consumption and extremely low cost per frame.
We support out of the box some of the AMD hardware, the NETINT hardware, and of course GPUs like Nvidia. I already talked about our capability with respect to OpenTelemetry and how easy it is to plug the metrics that I was showing you earlier in the workflows into monitoring and so on. That combination of extensibility via the SDK, controllability via the monitoring interfaces and so on, and expressiveness through the visual paradigm, I think, makes for a very compelling capability.
So I'm currently teasing Norsk Studio in particular. It's going to get launched formally at NAB this year, and we're really excited to see what kinds of solutions our customers build with it.
This article is Sponsored Content