The road to the promised land.
For more than 6 years, we have been working on and looking forward to a simpler way to build RTC (Real Time Communications) applications on the web. In order for this technology to truly show its value, the major browser vendors needed to show up.
Mobile, mobile, mobile.
Now that Apple has joined the party in earnest, does the technology have the coverage required in order for developers to make good use of WebRTC on mobile devices? Let’s find out.
Until now, in order for WebRTC to work on iOS, we were relegated to wrapping WebRTC code in Objective-C and Swift, in our native iOS apps. Basically, we had to take the Chrome code and build an app that was sent to the app store for approval and wait in line, like all the other chumps (yours truly included). Conversely, on Android we could run much of that same code from our desktop Chrome apps, on the Android device as well, within reason of course.
Now that Safari and Chrome are shipping compatible WebRTC on mobile, we get to reuse the same code, right!? Well, mostly, they are different code bases, after all.
A word about hardware acceleration.
If ubiquitous mobile video is to take off, the battery life of the device has to last more than the length of the 10 minute video call (ok, I am exaggerating a bit, but I think you get the point) and the performance needs to be at least adequate enough to distinguish facial features. My bar is set a little higher, baby steps for now.
Without h/w acceleration the CPU is likely working too hard to encode the local video and decode the inbound video + service the other processes required at the same time. That really means there needs to be hardware onboard the device dedicated to video coding. That in turn means H.264, since there are very few vendors that offer VP8 or VP9 h/w acceleration.
Question: Does this mean that mobile apps written with VP8 will not be able to deliver decent mobile video conferencing?
Answer: No, not at all, but they will likely not be as performant as those taking advantage of hardware acceleration.
Suffice to say that SVC (Scalable Video Coding) usage would be another reason why we need h/w acceleration, but that’s for another day.
Who’s using what?
The majority of desktop and mobile WebRTC apps written today, are using VP8 for video.
Since Apple and Microsoft both use H.264 and Google uses VP8 and H.264 (recently shipped Open H.264 – on the desktop and mobile). Also, many of the Enterprise RTC developers are already on that H.264 bandwagon.
Question: If Apple and Microsoft devices ship with H.264, what is the case with Google Chrome on desktops and android, are they preferencing VP8?
Answer: Chrome for desktop and android now have H.264 native. Many of the Android devices that ship today all have H.264 hardware acceleration onboard. In order to understand which units have H.264 and hardware acceleration, you can run use the Android APIs to pull a list of available codecs, but in the case of WebRTC, you will only get H.264 in Android WebRTC if there is a h/w encoder on the device.
Is H.264 the answer for WebRTC video?
Here is a recent test:
Host 1 – (before joining):
macOS Sierra, Macbook, Safari (Technology Preview 32)
Host 2 (after joining):
Android 7, Samsung 7, Chrome 55
Host 1 (after joining):
According to the Chrome Status page, Chrome for Android should have H.264. So why is the session barfing when trying to set up video? The logs do not lie…
Safari – offer:
Chrome on android – answer:
Err, huh? No H.264 in reply?
So, I updated to latest Chrome on android (58) and tried again…
… et voilà!!
Next topic, paying the man!
Shipping your product with H.264 enabled, means you may potentially need to deal with the MPEG-LA royalty police for H.264 royalties, but there are some grey areas.
In the case of Apple and Microsoft, where H.264 royalties are already being paid for by the parent vendor, the WebRTC developer is riding on the coattails of papa bear, at least in theory.
Cisco’s generous OpenH.264 offer means that those using this binary module, can do so at potentially no cost:
We will not pass on our MPEG-LA licensing costs for this module, and based on the current licensing environment, this will effectively make H.264 free for use on supported platforms.
Q: If I use the source code in my product, and then distribute that product on my own, will Cisco cover the MPEG LA licensing fees which I’d otherwise have to pay?
A: No. Cisco is only covering the licensing fees for its own binary module, and products or projects that utilize it must download it at the time the product or project is installed on the user’s computer or device. Cisco will not be liable for any licensing fees incurred by other parties.
That seems to mean (I am no lawyer) every developer shipping WebRTC apps supporting Open H.264 binary module, get a free ride. Those using some other binary, or shipping the above source code for that module, could be on the hook for those royalties. That said, since there are royalties being paid by parent vendors where devices are shipping H.264 anyways, developers may not get hassled regardless.
So what did we learn here?
- Apple has joined the party, now we have a full complement of browser vendors!
- If you want to leverage WebRTC video to deliver a ubiquitous mobile and desktop experience for your users, you should likely consider including both H.264 and VP8.
- VP8 is (still) free and powers most of the WebRTC video out there today.
- You can make use of the Open H.264 project and get a free H.264 ride, albeit baseline AVC.
- WebRTC on Android does not support software encoding of H.264, so unless there is local hardware acceleration, H.264 will not be in the offer.
- H.264 is not fully enabled (or buggy) in Chrome 55 (I was using it on Samsung S7 Edge (Android 7), but it does work with Chrome 58.
- WebRTC is not DOA!
- SDP still sucks and ORTC can’t come soon enough!!
As a side note, it would be interesting to see something like this open sourced; VP8 / H.264 conversion without transcoding, if only to service the existing desktop apps currently running VP8 <-> mobile H.264. It would likely overwhelm the mobile device, but it would be cool if it worked!
Disclaimer: The views expressed by me are mine alone and do not necessarily represent the views or opinions of my employer.
The WebRTC media stack has been ported to QNX / Blackberry 10 as reported hy Hookflash in this Press Release below.
This does not mean that WebRTC browsers will now begin communicating with Blackberry apps written using the Open Peer SDK, well… not today anyhow. What it does mean is Blackberry 10 developers can write apps using this new SDK to enable P2P voice, video and messaging, across Blackberry and iOS platforms using their own user identity model or mashed up with social identities.
In the sample app (pictured above) running on a production Z10 and a Alpha Z10 device, Facebook was used to map IDs.
Here is the Press Release…
BlackBerry Live 2013, Orlando Florida – May 13, 2013 – Hookflash announces beta availability of Open Peer Software Development Kit (SDK) for BlackBerry® 10, providing developers with an effective way to integrate high quality, secure, real-time, voice, video and messaging into their own BlackBerry 10 applications.
“The Open Peer SDK for BlackBerry 10 enables a completely new generation of communications integration on the BlackBerry 10 platform,” explains Hookflash co-founder Erik Lagerway. “The Hookflash team has worked tirelessly to build this toolkit and port the WebRTC libraries to BlackBerry 10. BlackBerry developers and enterprise customers can now integrate high quality, real-time, peer-to-peer (P2P), voice, video and messaging into their own BlackBerry 10 applications. People just want good quality voice, video and text communications embedded in whatever they’re doing. Open Peer enables progressive developers in medical, finance, gaming, travel and many other verticals with this next evolution of integrated P2P communications on BlackBerry 10 smartphones.”
“BlackBerry is committed to our app partners through an open ecosystem, strong platform and commitment to supporting innovation and invention,” said Martyn Mallick, VP of Global Alliances and Business Development at BlackBerry. “We are pleased to have Hookflash bring Open Peer to BlackBerry 10, enabling developers to add rich peer-to-peer communications in their apps, and enhance the customer experience.”
The Open Peer SDK for BlackBerry 10 is the most recent addition to the Open Peer, open source family of real-time P2P communications toolkits. The BlackBerry 10 SDK joins the existing C++ and iOS SDKs already available. Mobile developers creating applications across multiple platforms can now leverage the suite of Open Peer toolkits to deliver real-time P2P communications for all of their applications. The Open Peer SDKs are available in open source and can be found on Github (http://github.com/openpeer/).
Hookflash is a globally distributed software development team building “Open Peer”, new “open” video, voice and messaging specification and software for mobile platforms and web browsers. Open Peer enables important new evolution of communications; Open, for developers and customers to create with. “Over-the-top” via the Internet, where users control their economics and quality of service. “Federated Identity” so users can find and connect without limitations of service provider’s walled gardens and operating systems and “Integrated”, communications as a native function in software and applications. Hookflash founders, lead developers and Advisors previous accomplishments include; creators of the world’s most popular softphones, built audio technology acquired and used in Skype, created technology acquired and open sourced by Google to create WebRTC, and engaged inWebRTC standards development in the IETF and W3C.
Developers can register at (http://hookflash.com/signup) to start using the Open Peer SDK today.
For more information and an Open Peer/WebRTC white paper on please visit Hookflash http://hookflash.com
855-HOOKFLASH (466-5352) ext 1
Hookflash enables real-time social, mobile, and WebRTC communications with “Open Peer” for integration of voice, video, messaging and federated identity into world leading software, enterprise, applications, networks, mobile and computing devices. Hookflash and Open Peer are trademarks of Hookflash Inc. BlackBerry and related trademarks, names and logos are the property of Research In Motion Limited. BlackBerry is not responsible for any third-party products or services. Skype is a trademark of Microsoft. Google is a trademark of Google. Other company and product names may be trademarks of their respective owners.
(full disclosure, I work for Hookflash)
People are talking about how WebRTC could in fact create more silos in communication that it potentially tears down. The fact that this video codec debate may never be resolved is not really the biggest issue, video codecs are not that easy to come by so as developers it’s likely we will all implement the most common and accessible codecs out there, including: VP8 and H.264, that is certainly the approach we are taking @hookflash.
The more glaring issue it seems could in fact center around the lack of a defined signalling protocol on the wire. Currently developers are left to their own devices (no pun intended) when identifying and signalling between endpoints in their interpretation of WebRTC. Which begs a question, “How one implementation of WebRTC communicate with another implementation of WebRTC?”
There are plenty of answers, most of them include “http, oauth, etc”. Which in itself is great, leve the developers decide, after all it’s their app! Some more telephony-centric developers will gravitate towards a SIP or Jingle implementation. But what about those who want to federate with other P2P-centric WebRTC offers out there and still maintain some sort if interoperability?
Tsahi Levent-Levi says…
I’ve been working for over a decade with SIP and H.323 – developing interoperable SDK solutions for the rest of the industry. At the end of the day, none of it mattered:
- We ended up as an industry with single vendor deployments for enterprises
- Interoperability was only skin-deep. The moment you wanted to do something real (security, collaboration, video), it just didn’t work
- Extending communication beyond the boundaries of the organization was impossible without PSTN
To me this seems awfully close to what you can achieve with WebRTC with two minor differences:
- WebRTC takes that for granted and makes a real statement of it: there is no signaling – do whatever it is you feel like
- It provides a common API with a common delivery platform (the browser)
As it stands today, there is nothing that fills that gap, but that is changing quickly. “Open Peer” is being positioned as a P2P signalling protocol on the wire for WebRTC with full control over Voice, Video, Messaging and Identities: local & social. As a founder @hookflash (creators of Open Peer), I may be somewhat biased (and sometimes I have a big mouth) but if you are building for WebRTC you really do owe it to yourself to check out Open Peer: http://openpeer.org and the Open Peer SDKs on Github.
Update 2: To the hundreds/thousands of repetitive spam tweets / twits, “Will WebRTC replace / kill Skype”, the answer is NO!! It will not. WebRTC is using broken Jingle in the browser, it does not support chat and can only make and receive calls., there is no buddy / contact list to speak of etc etc. NO it will not replace Skype. Stop with the spam tweets already, please!
Update: It seems to me that until all the browsers are on board, native clients will be required to make this go. Which is not outside the realm of possibility, considering Google has open sourced the GIPS audio and video engine along with WebRTC.
Something to remember, WebRTC is not RTCWEB! It may sound silly but it’s true. WebRTC is a Google-centric project using Google code etc. RTCWEB is essentially an IETF effort, a working group driving towards open real-time communications on the web. They are not the same, which can be rather confusing.
— Original Post —
Google has been busy it would seem, last night WebRTC appeared to the public for the first time. This has some pretty serious implications for Flash, which was the de-facto technology one had to use to get real-time communications in a browser, that has now been circumvented, at least to a certain degree.
The sessions are not run by a signaling protocol per se, not Jingle, no XMPP, not SIP not anything we have seen before. All the session management looks to be coming from libjingle. Which, to me means Jingle is in the browser.
A few early comments:
1. Where does Google stand on websockets? Google have said they will block it if an exploit emerges.
2. Chrome, Opera & Firefox are the supported browsers. Where does Safari and IE land? My guess is that Microsoft will not be in any hurry to implement this considering their recent Skype acquisition.
3. Web-cam captures from HTM5 has not been ratified, although this is likely not as serious as the former points.
It looks like the first victim in the Microsoft acquisition of Skype is Digium and the open source PBX – Asterisk. The following is an email sent to existing Skype for Asterisk users…
Skype for Asterisk will not be available for sale or activation after July 26, 2011.
Skype for Asterisk was developed by Digium in cooperation with Skype. It includes proprietary software from Skype that allows Asterisk to join the Skype network as a native client. Skype has decided not to renew the agreement that permits us to package this proprietary software. Therefore Skype for Asterisk sales and activations will cease on July 26, 2011.
This change should not affect any existing users of Skype for Asterisk. Representatives of Skype have assured us that they will continue to support and maintain the Skype for Asterisk software for a period of two years thereafter, as specified in the agreement with Digium. We expect that users of Skype for Asterisk will be able to continue using their Asterisk systems on the Skype network until at least July 26, 2013. Skype may extend this at their discretion.
Skype for Asterisk remains for sale and activation until July 26, 2011. Please complete any purchases and activations before that date.
Thank you for your business.
Digium Product Management
One has to wonder what will become of Skype Connect, Skype’s answer to SIP Trunking. Will Microsoft shut off the Skype Connect vendors (Cisco, Avaya, Grandstream, etc.) as well?
Original forum post here.
The RTCWEB Bof meeting starts in less than 5 minutes. To some, the most anticipated meeting of IETF 80.
Proposed charter (truncated)…
The WG will perform the following work:
1. Define the communication model in detail, including how session management is to occur within the model.
2. Define a security model that describes the security goals and how the communication model can achieve these goals.
3. Define how NAT and Firewall traversal is to occur.
4. Define which RTP functions and extensions that shall be supported in the client and their usage for real-time media, including media adaptation to ensure congestion safe usage.
5. Define what functionalities in the solution, such as media codecs, security algorithms, etc., that can be extended and how the extensibility mechanisms works.
6. Define a set of media formats that must or should be supported by a client to improve interoperability.
7. Define how non RTP datagram and byte stream data communication between the clients can be done securely and in a congestion safe way.
8. Provide W3C input for the APIs that comes from the communication model and the selected components and protocols that are part of the solution.
Aug 2011 Architecture and Security and Threat Model sent to W3C
Aug 2011 Use cases and Scenarios document sent to W3C
Sept 2011 Architecture and Security and Threat Model to IESG as Informational
Sept 2011 Use cases and Scenarios for RTCWeb document sent to IESG as Informational
Dec 2011 RTCWeb and Media format specification(s) to IESG as PS
Dec 2011 Information elements and events APIs Input to W3C
Apr 2012 API to Protocol mapping document submitted to the IESG as Informational (if needed)