I haven’t blogged here in some time, so I figured that since the topic is relevant this would be a good opportunity to dust off the old blog (webrtc.is / sipthat.com) and post something we have been working on at SignalWire. I am quite passionate about WebRTC and real-time communications so it’s great to be helping bring it to life at SignalWire!
We all know and love <cough> SIP, so we decided we would enable the use of SIP over WebSockets at SignalWire. This new offer also enables functionality like WebRTC with SIP over WebSockets.
This means our customers can now use off the shelf JS libraries, like JSSIP to create basic web experiences for their users, powered by SignalWire. It used to be a bit of a PITA, to create services that provided users with seamless online communications. Now it’s a breeze, and when using SignalWire it’s also very affordable.
For now, we are enabling basic calling and video capabilities, the advanced functionality (including video conferencing) will come in conjunction with a future release of a SignalWire RELAY JS library.
Personally, I can’t wait to see what creative minds will build using this technology with SignalWire on the backend.
The road to the promised land.
For more than 6 years, we have been working on and looking forward to a simpler way to build RTC (Real Time Communications) applications on the web. In order for this technology to truly show its value, the major browser vendors needed to show up.
Mobile, mobile, mobile.
Now that Apple has joined the party in earnest, does the technology have the coverage required in order for developers to make good use of WebRTC on mobile devices? Let’s find out.
Until now, in order for WebRTC to work on iOS, we were relegated to wrapping WebRTC code in Objective-C and Swift, in our native iOS apps. Basically, we had to take the Chrome code and build an app that was sent to the app store for approval and wait in line, like all the other chumps (yours truly included). Conversely, on Android we could run much of that same code from our desktop Chrome apps, on the Android device as well, within reason of course.
Now that Safari and Chrome are shipping compatible WebRTC on mobile, we get to reuse the same code, right!? Well, mostly, they are different code bases, after all.
A word about hardware acceleration.
If ubiquitous mobile video is to take off, the battery life of the device has to last more than the length of the 10 minute video call (ok, I am exaggerating a bit, but I think you get the point) and the performance needs to be at least adequate enough to distinguish facial features. My bar is set a little higher, baby steps for now.
Without h/w acceleration the CPU is likely working too hard to encode the local video and decode the inbound video + service the other processes required at the same time. That really means there needs to be hardware onboard the device dedicated to video coding. That in turn means H.264, since there are very few vendors that offer VP8 or VP9 h/w acceleration.
Question: Does this mean that mobile apps written with VP8 will not be able to deliver decent mobile video conferencing?
Answer: No, not at all, but they will likely not be as performant as those taking advantage of hardware acceleration.
Suffice to say that SVC (Scalable Video Coding) usage would be another reason why we need h/w acceleration, but that’s for another day.
Who’s using what?
The majority of desktop and mobile WebRTC apps written today, are using VP8 for video.
Since Apple and Microsoft both use H.264 and Google uses VP8 and H.264 (recently shipped Open H.264 – on the desktop and mobile). Also, many of the Enterprise RTC developers are already on that H.264 bandwagon.
Question: If Apple and Microsoft devices ship with H.264, what is the case with Google Chrome on desktops and android, are they preferencing VP8?
Answer: Chrome for desktop and android now have H.264 native. Many of the Android devices that ship today all have H.264 hardware acceleration onboard. In order to understand which units have H.264 and hardware acceleration, you can run use the Android APIs to pull a list of available codecs, but in the case of WebRTC, you will only get H.264 in Android WebRTC if there is a h/w encoder on the device.
Is H.264 the answer for WebRTC video?
Here is a recent test:
Host 1 – (before joining):
macOS Sierra, Macbook, Safari (Technology Preview 32)
Host 2 (after joining):
Android 7, Samsung 7, Chrome 55
Host 1 (after joining):
According to the Chrome Status page, Chrome for Android should have H.264. So why is the session barfing when trying to set up video? The logs do not lie…
Safari – offer:
Chrome on android – answer:
Err, huh? No H.264 in reply?
So, I updated to latest Chrome on android (58) and tried again…
… et voilà!!
Next topic, paying the man!
Shipping your product with H.264 enabled, means you may potentially need to deal with the MPEG-LA royalty police for H.264 royalties, but there are some grey areas.
In the case of Apple and Microsoft, where H.264 royalties are already being paid for by the parent vendor, the WebRTC developer is riding on the coattails of papa bear, at least in theory.
Cisco’s generous OpenH.264 offer means that those using this binary module, can do so at potentially no cost:
We will not pass on our MPEG-LA licensing costs for this module, and based on the current licensing environment, this will effectively make H.264 free for use on supported platforms.
Q: If I use the source code in my product, and then distribute that product on my own, will Cisco cover the MPEG LA licensing fees which I’d otherwise have to pay?
A: No. Cisco is only covering the licensing fees for its own binary module, and products or projects that utilize it must download it at the time the product or project is installed on the user’s computer or device. Cisco will not be liable for any licensing fees incurred by other parties.
That seems to mean (I am no lawyer) every developer shipping WebRTC apps supporting Open H.264 binary module, get a free ride. Those using some other binary, or shipping the above source code for that module, could be on the hook for those royalties. That said, since there are royalties being paid by parent vendors where devices are shipping H.264 anyways, developers may not get hassled regardless.
So what did we learn here?
- Apple has joined the party, now we have a full complement of browser vendors!
- If you want to leverage WebRTC video to deliver a ubiquitous mobile and desktop experience for your users, you should likely consider including both H.264 and VP8.
- VP8 is (still) free and powers most of the WebRTC video out there today.
- You can make use of the Open H.264 project and get a free H.264 ride, albeit baseline AVC.
- WebRTC on Android does not support software encoding of H.264, so unless there is local hardware acceleration, H.264 will not be in the offer.
- H.264 is not fully enabled (or buggy) in Chrome 55 (I was using it on Samsung S7 Edge (Android 7), but it does work with Chrome 58.
- WebRTC is not DOA!
- SDP still sucks and ORTC can’t come soon enough!!
As a side note, it would be interesting to see something like this open sourced; VP8 / H.264 conversion without transcoding, if only to service the existing desktop apps currently running VP8 <-> mobile H.264. It would likely overwhelm the mobile device, but it would be cool if it worked!
Disclaimer: The views expressed by me are mine alone and do not necessarily represent the views or opinions of my employer.
For those who missed it, Chrome 38-39 looks like it will be shipping with ORTC 1.1 RTPSender / Receiver APIs as announced by Justin Uberti at Google I/O 2014. This really should not come as any surprise as RTPSender / Receiver APIs are now on track for WebRTC 1.0 integration as well, as per the last W3C WebRTC WG Interim meeting.
The first ORTC API reference draft has been published as a report in the ORCA W3C Community Group today. This is the first step, one of many, in helping the WebRTC community understand the benefits of an object based API for WebRTC. It is still early but we do hope this web-centric approach will be taken seriously by all looking to the future of WebRTC. We welcome any and all feedback.
Robin Raymond, Chair of the ORCA Community Group – will be speaking on RTC Identity models at IIT RTC in Chicago next week. If you are attending please check the agenda for the correct timing of this panel. Robin is always up for some good conversation around ORTC, Federated Identities & Open Peer.
We wanted to get this initial informational document out right away so everyone that has any interest would be able to review the concepts and comment before we get too far along on the actual API.
“No SDP Offer/Answer” is a concept that has been bantered about for some time. It continues to rear its ugly head within the IETF and W3C Working Groups. There is evidence that some major browser vendors will not back the current SDP O/A methodology at all, but some of us continue to pin our hopes and dreams on SDP O/A nonetheless.
We are told by the developers we need something that will not only pass muster with them but will also be something that can be extended and innovated upon, uninhibited by large browser vendors or any vendor for that matter.
According to many, real adoption of WebRTC will not happen if we continue to force everyone to use this SDP Offer/Answer methodology. It is clearly blocking our way forward and the amount of specification documentation remaining needed for the browser vendors to produce a compatible SDP based WebRTC engine in a browser is much more daunting than most are willing to admit.
It’s time for a change.
What does that mean? It means we need to rewrite a portion of the current WebRTC specification and WebRTC API, which also means that many of the demo apps running out there will need some level of modification. Which is O.K. It’s better to break a few demo apps now than to proceed knowing there is a material flaw in the design, precluding the rest of the web from playing along and causing untold pain in future development.
Long live WebRTC.
For those who don’t know, SDP is an old school standards-based text format (pre-1998) for describing media, codecs, state and networking information offered by devices for use in real-time communications and more recently as the proposed format for with WebRTC. I’ve written in the past about my disdain for SDP. To me, using SDP inside the browser for WebRTC seems akin to requiring all new computers use the graphics processing unit from the Commodore 64 for all future graphics engines. As cool as it might have been in its day, it is not exactly up to the task anymore and should be left to the realm of nostalgia.
My original thinking was that the SIP guys would really love SDP in the browser since SDP is the primary media description format they use. But I must recast my opinion to say it’s really bad for the SIP folks as well. Here’s why…
- An increasing need for Session Border Controllers: As it stands, the SDP that comes from SIP devices will need to be re-written and perhaps even put through some kind of Session Border Controller (SBC)/Proxy to maintain compatibility. SIP devices could face update cycles tied to browser updates. There are some companies in the industry who sell proxies that would greatly benefit from compatibility issues (as their role is to fix them) but I would hate to think that the IETF/W3C has been usurped by those vendors to push a solution that is not to benefit the entire internet industry and end users.
- SIP feature-creep: One thing I do know about SIP vendors is they love to add their own extensions to SIP to add their favourite competitive features they offer with their devices/networks. This allows them to claim “support” for something their competition does not support. To that end, I’ve noticed a continuous stream of feature requests to the browser vendors from the SIP world (and I’m certain the alignment to a SIP vendor’s own preferred feature is pure coincidence).
The irony is that if the SIP vendors had insisted that the browsers only offer a good core media/RTC engine, they could have implemented many of the features they now demand from the browsers vendors themselves without waiting for Google, Mozilla, Opera, Microsoft and Apple and the rest of the industry to agree. Talk about a SIP vendor’s dream in being able to offer some unique feature for their network! But now they have to wait and wait and hope and whatever ends up being released in the browsers will work for them and not introduce even more problems and incompatibilities to their networks.
Browser vendors will become the choke-point.
Microsoft argued so strongly against SDP and offer/answer, they have not agreed to support WebRTC and instead produced a competing specification called CU-RTCWeb. Their proposal starts from the premise of having a good media engine/RTC controllable at a lower layer would be far better for the industry and they have (so far) not released their Internet Explorer browser with WebRTC support. Whatever your feelings about Microsoft or their particulars of their proposal, they are right about SDP offer/answer and without their market share being onboard, it will hurt WebRTC’s adoption rate, especially in the Enterprise. Apple is sitting on the sidelines giving no indication their position while the industry sorts this out. I’d love nothing more for the industry than to have all vendors on the same page and agree to implement “something” usable, but it seems the SDP offer/answer model is not helping and in fact hindering that effort.
Where do we go from here?
I just received this email from Skype’s PR firm…
Here is Skype’s official comment regarding Skype for Asterisk. You can attribute this to Jennifer Caukin, spokeswoman for Skype.
“Skype made the decision to retire Skype for Asterisk several months ago, as we have prioritized our focus around implementing the IETF SIP standard in our Skype Connect solution. SIP enjoys the broadest support of any of the available signaling alternatives by business communications equipment vendors, including Digium. By supporting SIP in favor of alternatives, we maximize our resources and continue to reinforce our commitment to delivering Skype on key platforms where we can meet the broadest customer demand.”
Call me crazy but if I have to pay to integrate Skype into my phone system, where I already have a phone service that I am happy with, why would I do that? Maybe I just want to be able to make/receive Skype calls on my SIP-enabled desk phone? If it doesn’t hit the PSTN why do I have to pay? Seems like an odd approach for a company that has a long history of working around POTS, much to the delight of their users.
Integration with SIP is great, don’t get me wrong, but it would be nice if Skype talked SIP and was ‘still’ free. Seems like a massive oversight on behalf of Skype or am I missing something?