The road to the promised land.
For more than 6 years, we have been working on and looking forward to a simpler way to build RTC (Real Time Communications) applications on the web. In order for this technology to truly show its value, the major browser vendors needed to show up.
Mobile, mobile, mobile.
Now that Apple has joined the party in earnest, does the technology have the coverage required in order for developers to make good use of WebRTC on mobile devices? Let’s find out.
Until now, getting WebRTC to work on iOS meant wrapping the WebRTC code in Objective-C and Swift inside our native iOS apps. Basically, we had to take the Chrome code and build an app that was sent to the App Store for approval, then wait in line like all the other chumps (yours truly included). On Android, by contrast, we could run much of the same code from our desktop Chrome apps on the device as well, within reason of course.
Now that Safari and Chrome are shipping compatible WebRTC on mobile, we get to reuse the same code, right!? Well, mostly. They are different code bases, after all.
A word about hardware acceleration.
If ubiquitous mobile video is to take off, the device's battery has to last longer than a 10-minute video call (OK, I am exaggerating a bit, but I think you get the point), and the performance needs to be at least good enough to distinguish facial features. My bar is set a little higher, but baby steps for now.
Without h/w acceleration, the CPU is likely working too hard to encode the local video, decode the inbound video, and service the other processes required at the same time. That really means there needs to be hardware onboard the device dedicated to video coding. That in turn means H.264, since very few vendors offer VP8 or VP9 h/w acceleration.
Question: Does this mean that mobile apps written with VP8 will not be able to deliver decent mobile video conferencing?
Answer: No, not at all, but they will likely not be as performant as those taking advantage of hardware acceleration.
Suffice to say that SVC (Scalable Video Coding) usage would be another reason why we need h/w acceleration, but that’s for another day.
Who’s using what?
The majority of desktop and mobile WebRTC apps written today are using VP8 for video.
Apple and Microsoft both use H.264, while Google uses both VP8 and H.264 (it recently shipped Open H.264 on desktop and mobile). Many of the enterprise RTC developers are also already on the H.264 bandwagon.
Question: If Apple and Microsoft devices ship with H.264, what is the case with Google Chrome on desktops and Android? Is it still preferring VP8?
Answer: Chrome for desktop and Android now has native H.264. Many of the Android devices shipping today have H.264 hardware acceleration onboard. To find out which units have H.264 and hardware acceleration, you can use the Android APIs to pull a list of available codecs. In the case of WebRTC, though, you will only get H.264 on Android if there is a h/w encoder on the device.
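On the web side, a page can ask the browser which codecs it is able to send before building an offer. The following is a minimal sketch, assuming a browser that implements `RTCRtpSender.getCapabilities("video")`; the `Capabilities` shape, the `supportsCodec` helper, and the two sample capability lists are hypothetical and only there to make the check runnable outside a browser.

```typescript
// Minimal shapes mirroring the codec entries a browser returns from
// RTCRtpSender.getCapabilities("video"). Only the fields used here are listed.
interface CodecCapability {
  mimeType: string;     // e.g. "video/H264", "video/VP8"
  sdpFmtpLine?: string; // codec-specific parameters, if any
}

interface Capabilities {
  codecs: CodecCapability[];
}

// Returns true when the capability list contains the given codec.
// MIME types are compared case-insensitively; null means "no capabilities".
function supportsCodec(caps: Capabilities | null, mimeType: string): boolean {
  if (caps === null) return false;
  const wanted = mimeType.toLowerCase();
  return caps.codecs.some((c) => c.mimeType.toLowerCase() === wanted);
}

// Hypothetical capability lists, for illustration only (not captured from a device).
const withHardwareEncoder: Capabilities = {
  codecs: [{ mimeType: "video/VP8" }, { mimeType: "video/H264" }],
};
const withoutHardwareEncoder: Capabilities = {
  codecs: [{ mimeType: "video/VP8" }],
};

console.log(supportsCodec(withHardwareEncoder, "video/H264"));
console.log(supportsCodec(withoutHardwareEncoder, "video/H264"));
```

In a browser, the call would look like `supportsCodec(RTCRtpSender.getCapabilities("video"), "video/H264")`. On an Android device without an H.264 h/w encoder, one would expect that to come back false, per the answer above.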
Is H.264 the answer for WebRTC video?
Here is a recent test:
Host 1 – (before joining):
macOS Sierra, Macbook, Safari (Technology Preview 32)
Host 2 (after joining):
Android 7, Samsung S7 Edge, Chrome 55
Host 1 (after joining):
According to the Chrome Status page, Chrome for Android should have H.264. So why is the session barfing when trying to set up video? The logs do not lie…
Safari – offer:
Chrome on Android – answer:
Err, huh? No H.264 in reply?
So, I updated to the latest Chrome on Android (58) and tried again…
… et voilà!!
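Eyeballing SDP logs for a missing codec gets old fast. Here is a rough sketch of doing the same check in code: it walks the video m-section of an offer or answer and collects the codec names from the `a=rtpmap` lines. The function names and the sample SDP are hypothetical; the sample is a hand-written fragment for illustration, not the log from the test above.

```typescript
// Extracts the codec names advertised in the video m-section(s) of an SDP blob
// by reading "a=rtpmap:<payload type> <name>/<clock rate>" lines.
function listVideoCodecs(sdp: string): string[] {
  const codecs: string[] = [];
  let inVideo = false;
  for (const rawLine of sdp.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("m=")) {
      // Each "m=" line starts a new media section; track whether it is video.
      inVideo = line.startsWith("m=video");
      continue;
    }
    if (!inVideo) continue;
    const match = /^a=rtpmap:\d+ ([^/]+)\//.exec(line);
    if (match && !codecs.includes(match[1])) codecs.push(match[1]);
  }
  return codecs;
}

// True when H.264 appears among the video codecs in the SDP.
function offersH264(sdp: string): boolean {
  return listVideoCodecs(sdp).some((name) => name.toUpperCase() === "H264");
}

// Hypothetical, heavily trimmed answer with no H.264 in the video section.
const sampleAnswer = [
  "v=0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 100 101",
  "a=rtpmap:100 VP8/90000",
  "a=rtpmap:101 VP9/90000",
].join("\r\n");

console.log(listVideoCodecs(sampleAnswer));
console.log(offersH264(sampleAnswer));
```

In a real session the input would be the `sdp` string of the local or remote description, so the check can run right after the answer comes back.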
Next topic, paying the man!
Shipping your product with H.264 enabled means you may need to deal with the MPEG LA royalty police for H.264 royalties, but there are some grey areas.
In the case of Apple and Microsoft, where H.264 royalties are already being paid for by the parent vendor, the WebRTC developer is riding on the coattails of papa bear, at least in theory.
Cisco’s generous OpenH264 offer means that those using this binary module can do so at potentially no cost:
We will not pass on our MPEG-LA licensing costs for this module, and based on the current licensing environment, this will effectively make H.264 free for use on supported platforms.
Q: If I use the source code in my product, and then distribute that product on my own, will Cisco cover the MPEG LA licensing fees which I’d otherwise have to pay?
A: No. Cisco is only covering the licensing fees for its own binary module, and products or projects that utilize it must download it at the time the product or project is installed on the user’s computer or device. Cisco will not be liable for any licensing fees incurred by other parties.
That seems to mean (I am no lawyer) that every developer shipping WebRTC apps that use the Open H.264 binary module gets a free ride. Those using some other binary, or shipping the source code for that module, could be on the hook for those royalties. That said, since royalties are already being paid by parent vendors where devices ship with H.264 anyway, developers may not get hassled regardless.
So what did we learn here?
- Apple has joined the party, now we have a full complement of browser vendors!
- If you want to leverage WebRTC video to deliver a ubiquitous mobile and desktop experience for your users, you should likely consider including both H.264 and VP8.
- VP8 is (still) free and powers most of the WebRTC video out there today.
- You can make use of the Open H.264 project and get a free H.264 ride, albeit baseline AVC.
- WebRTC on Android does not support software encoding of H.264, so unless there is local hardware acceleration, H.264 will not be in the offer.
- H.264 is not fully enabled (or is buggy) in Chrome 55 (I was using it on a Samsung S7 Edge running Android 7), but it does work with Chrome 58.
- WebRTC is not DOA!
- SDP still sucks and ORTC can’t come soon enough!!
As a side note, it would be interesting to see something like this open sourced; VP8 / H.264 conversion without transcoding, if only to service the existing desktop apps currently running VP8 <-> mobile H.264. It would likely overwhelm the mobile device, but it would be cool if it worked!
Disclaimer: The views expressed by me are mine alone and do not necessarily represent the views or opinions of my employer.
Our initial ORTC implementation includes the following components:
- ORTC API Support. Our primary focus right now is audio/video communications. We have implemented the following objects: IceGatherer, IceTransport, DtlsTransport, RtpSender, RtpReceiver, as well as the RTCStats interfaces that are not shown directly in the diagram.
- RTP/RTCP multiplexing is supported and is required for use with DtlsTransport. A/V multiplexing is also supported.
- STUN/TURN/ICE support. We support STUN (RFC 5389), TURN (RFC 5766) as well as ICE (RFC 5245). Within ICE, regular nomination is supported, with aggressive nomination partially supported (as a receiver). DTLS-SRTP (RFC 5764) is supported, based on DTLS 1.0 (RFC 4347).
- Codec support. For audio codecs, we support G.711, G.722, Opus and SILK. We also support Comfort Noise (CN) and DTMF according to the RTCWEB audio requirements. For video we currently support the H.264UC codec used by Skype services, supporting advanced features such as simulcast, scalable video coding and forward error correction. We’re working toward enabling interoperable video with H.264.
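The RTP/RTCP multiplexing item in the list above means both packet types arrive on the same transport, so the receiver has to tell them apart. Below is a minimal sketch of one common heuristic based on RFC 5761: look at the low 7 bits of the second byte and treat values 64 to 95 as RTCP. The `classifyPacket` name and the sample bytes are hypothetical, and this is not a description of how the implementation listed above does it.

```typescript
type PacketKind = "rtp" | "rtcp" | "unknown";

// Classifies a packet received on a multiplexed RTP/RTCP transport.
// Both protocols carry version 2 in the top two bits of the first byte.
// RTCP packet types 192..223 show up as 64..95 once the top bit of the
// second byte (the RTP marker bit position) is masked off.
function classifyPacket(packet: Uint8Array): PacketKind {
  if (packet.length < 2) return "unknown";
  const version = packet[0] >> 6;
  if (version !== 2) return "unknown"; // e.g. STUN starts with two zero bits
  const payloadType = packet[1] & 0x7f;
  return payloadType >= 64 && payloadType <= 95 ? "rtcp" : "rtp";
}

// Hypothetical two-byte prefixes, for illustration only.
const rtcpSenderReport = new Uint8Array([0x80, 200]); // RTCP SR, packet type 200
const rtpDynamicPayload = new Uint8Array([0x80, 96]); // RTP, dynamic payload type 96
const stunLike = new Uint8Array([0x00, 0x01]);        // STUN binding request prefix

console.log(classifyPacket(rtcpSenderReport));
console.log(classifyPacket(rtpDynamicPayload));
console.log(classifyPacket(stunLike));
```

The heuristic only works if RTP payload types 64 to 95 are left unused, which is why multiplexing setups keep dynamic payload types outside that range.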
The W3C WebRTC working group chairs [Harald Alvestrand (Google), Stefan Håkansson (Ericsson), Erik Lagerway (Hookflash)] recently decided to add a new editor to the working group, as Peter St. Andre (&yet) has resigned as editor.
Bernard Aboba (Microsoft) has now been appointed as editor.
Bernard’s attention to detail and advocacy for transparency, fairness and community have been refreshing. It has been my pleasure (as chair of the W3C ORTC CG) to work with Bernard, who is also an author in the W3C ORTC CG alongside Justin Uberti and Robin Raymond (editor). I look forward to working more with him in the WG.
- Sept 30 – Oct 2 / Chicago – IIT Real-time Communications Conference
(For me, this is “the” objective gathering of the brightest technical minds in the RTC space.) Robin Raymond will be speaking on ORTC / WebRTC 1.1 and also Cloud + P2P Communications.
- 30-31 Oct / Santa Clara – W3C TPAC / WebRTC WG Meeting
(W3C Technical Plenary / Advisory Committee Meetings Week which includes WebRTC Working Group meetings. This should be a rather interesting set of meetings for the WebRTC WG, for a variety of reasons.)
- (tba) Oct ? / Web – W3C ORTC Community Group Meeting
- Nov 4-6 / Santa Clara – Cloud Expo / WebRTC Summit
(One of the bigger events, plenty more happening here than just WebRTC.) Erik Lagerway will be speaking on Real-time Communications, PaaS & Cloud Communications.
- Nov 9-14 / Honolulu – IETF Meeting 91 / RTCWEB WG Meetings
- Nov 18-20 / San Jose – WebRTCworld Conference West
- Dec 16-18 / Paris – WebRTC Conference Expo Paris
The first ORTC Public Draft Specification has been published, authored by Hookflash, Microsoft, and Google (http://ortc.org/wp-content/uploads/2014/08/ortc.html). This specification extends WebRTC 1.0 with new functionality to create a WebRTC 1.1 API with exceptional flexibility and no loss of compatibility.
Like WebRTC, ORTC (Object Real-time Communication) enables plugin-free real-time communications for mobile, web and cloud, but is specifically tailored to provide the direct control needed to enable advanced multimedia and conferencing features.
“We heard developers say that they wanted more direct control over the technologies available in WebRTC. At the same time, we didn’t want existing developers to have to start over with a new API. ORTC is our proposal for how we can accomplish both of these things – a new set of APIs for direct control, that builds off the existing WebRTC 1.0 API set. As an evolution of the existing API, we consider this WebRTC 1.1,” comments Justin Uberti, Google Tech Lead, WebRTC. “We’re grateful to Hookflash for their work to get ORTC off the ground. They have been instrumental in making this cross-industry collaboration happen, and we look forward to continuing our work with them.”
This newly published public draft has come a long way since the W3C ORTC Community Group was formed in mid-2013. As it has progressed from an initial set of ideas to a fleshed-out draft complete enough for implementations, several companies have gotten closely involved, with Microsoft and Google now joining Hookflash as authors of the emerging specification.
The W3C ORTC Community Group now numbers more than 60 participants.
“We believe the contributions to WebRTC 1.1 / ORTC will allow web communications technology to become ubiquitous and transcend nearly all communications technologies that came before it,” says Hookflash Co-founder Erik Lagerway. “We are honored to be working with some of the brightest minds at Google, Microsoft, and the other contributing members in the ORTC CG to mature WebRTC into a universal go-to toolkit enabling communications across the globe.”
Hookflash enables real-time social, mobile, and web communications for integration of voice, video, messaging with federated identity into world leading software, enterprise, applications, networks, mobile and computing devices. Hookflash and Open Peer are trademarks of Hookflash Inc.
Developers can register at (http://fly.hookflash.me) to start using the Hookflash RTC service and toolkits today. For more information on Hookflash RTC toolkits and White Labeling please visit Hookflash http://hookflash.com.
Hookflash – Trent Johnsen
855-466-5352 Ext: 1
From an interoperability perspective, even if WebRTC & CU-RTC-Web end up competing, there will be JS libraries out there that support both. So it seems SDP is not the big issue here, but there is an elephant in the room: the media stack.
Differing media stacks (specifically codecs) could cause big problems. For example, IE & MS endpoints may support various Microsoft codecs, whereas WebRTC-compliant endpoints (Chrome, Firefox, Opera, mobile apps etc.) would presumably support the RTCWEB MTI (mandatory to implement) video and audio codecs.
That is, if WebRTC has such codecs. We still don’t have an MTI video codec! This has been one of the most contentious issues in the IETF RTCWEB working group to date.
If we fail to deliver an MTI video codec in WebRTC, what’s the likelihood of opposing browser vendors (implementing opposing standards) supporting the same codecs? Not very good odds, I would expect. In that case, cross-browser communication (media: audio, video) would fail.
Although, we might get lucky and have all the browser vendors select at least one audio codec and one video codec in common of their own accord. Ya, right.
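To make the failure mode concrete: negotiation boils down to intersecting the codec lists of the two endpoints, and an empty intersection means no media. The sketch below is a simplification under that assumption; the `commonCodecs` helper and the codec lists are hypothetical and do not describe what any particular browser ships.

```typescript
// Returns the codecs both endpoints support, kept in the local endpoint's
// preference order. Names are compared case-insensitively.
// An empty result means media of that kind cannot be negotiated.
function commonCodecs(local: string[], remote: string[]): string[] {
  const remoteSet = new Set(remote.map((c) => c.toUpperCase()));
  return local.filter((c) => remoteSet.has(c.toUpperCase()));
}

// Hypothetical video codec lists, for illustration only.
const endpointA = ["VP8"];
const endpointB = ["H264"];
const endpointC = ["VP8", "H264"];

console.log(commonCodecs(endpointA, endpointB)); // nothing in common
console.log(commonCodecs(endpointC, endpointB)); // H264 is shared
```

An MTI codec is simply a guarantee that this intersection is never empty, whichever two compliant endpoints end up talking.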
If you don’t want to leave it entirely up to chance, get involved! Joining the IETF is free, and open standards need your support if they are to succeed. The next IETF meeting could be very telling with respect to an MTI video codec: http://www.ietf.org/meeting/86/index.html
Update 2: To the hundreds/thousands of repetitive spam tweets / twits asking “Will WebRTC replace / kill Skype?”, the answer is NO!! It will not. WebRTC is using broken Jingle in the browser, it does not support chat, it can only make and receive calls, there is no buddy / contact list to speak of, etc. NO, it will not replace Skype. Stop with the spam tweets already, please!
Update: It seems to me that until all the browsers are on board, native clients will be required to make this go. Which is not outside the realm of possibility, considering Google has open sourced the GIPS audio and video engine along with WebRTC.
Something to remember, WebRTC is not RTCWEB! It may sound silly but it’s true. WebRTC is a Google-centric project using Google code etc. RTCWEB is essentially an IETF effort, a working group driving towards open real-time communications on the web. They are not the same, which can be rather confusing.
— Original Post —
Google has been busy, it would seem. Last night WebRTC appeared to the public for the first time. This has some pretty serious implications for Flash, which was the de facto technology one had to use to get real-time communications in a browser. That requirement has now been circumvented, at least to a certain degree.
The sessions are not run by a signaling protocol per se: not Jingle, not XMPP, not SIP, not anything we have seen before. All the session management looks to be coming from libjingle, which to me means Jingle is in the browser.
A few early comments:
1. Where does Google stand on websockets? Google have said they will block it if an exploit emerges.
2. Chrome, Opera & Firefox are the supported browsers. Where do Safari and IE land? My guess is that Microsoft will not be in any hurry to implement this considering their recent Skype acquisition.
3. Webcam capture from HTML5 has not been ratified, although this is likely not as serious as the former points.