ORTC / WebRTC Pioneers
TMC / WebRTC World & PKE Consulting have published a WebRTC Pioneers press release following a WebRTC Pioneers dinner at WebRTC Expo in Atlanta last week, paying homage to some of the early work being done around WebRTC.
Congratulations to W3C ORTC Community Group founders & core contributors…
Robin Raymond – Hookflash
Bernard Aboba – Microsoft
Justin Uberti – Google
There are, however, many names missing from this list who have had a significant impact on early work being done around WebRTC / ORTC. Peter Thatcher (Google), Emil Ivov (Jitsi), Shijun Sun (Microsoft), Roman Shpount (TurboBridge) and Iñaki Baz Castillo immediately come to mind.
ORTC API Editor’s Draft – Last Call & WebRTC WG
The W3C ORTC Community Group has published an Editor’s Draft update to the ORTC API and is also asking for last call comments on the draft…
We are nearing completion of the ORTC specification for our initial ORTC community group draft. We are asking for last call feedback at this time. Some comments to explain some of the changes are still pending but all issues are tracked here and on github. If you would like to comment, please have a read through and comment either on this list or as part of our next CG meeting coming up.
There are a few outstanding areas which are “pending” synchronization with WebRTC, e.g. stats, data channel, and IdP. As these have dependencies on WebRTC 1.0, we will attempt to complete our specification as best we can but those areas will be subject to synchronization should updates come out of the WebRTC community group.
Now is the time to have a read through for final review as the next stage will be to build implementations for implementation feedback.
So if you are fuzzy on where the ORTC CG is headed, you should really check out that video clip.
WebRTC Object API (alpha) – Request for Feedback
From the ORCA W3C community group…
WebRTC Object API (alpha):
Early example code, written in Node.js:
It is the current authors’ intent to provide an alternative to the existing WebRTC API, to allow more control to web developers looking to leverage WebRTC.
Rationale for this alternate approach can be found in the informational draft submitted to the IETF RTCWEB working group several weeks ago.
Those interested in assisting in the completion of this API reference are asked to join this community group and take part in the discussions by:
- Creating issues on GitHub
- Contributing your time
- Contributing code
- Providing feedback in this W3C community group
- Telling others about this effort
On behalf of the current authors, thank you for your support and consideration.
Google’s open WebRTC media stack ported to QNX / Blackberry 10
The WebRTC media stack has been ported to QNX / Blackberry 10, as reported by Hookflash in the press release below.
This does not mean that WebRTC browsers will now begin communicating with Blackberry apps written using the Open Peer SDK, well… not today anyhow. What it does mean is Blackberry 10 developers can write apps using this new SDK to enable P2P voice, video and messaging, across Blackberry and iOS platforms using their own user identity model or mashed up with social identities.
In the sample app (pictured above) running on a production Z10 and an Alpha Z10 device, Facebook was used to map IDs.
Here is the Press Release…
BlackBerry Live 2013, Orlando Florida – May 13, 2013 – Hookflash announces beta availability of Open Peer Software Development Kit (SDK) for BlackBerry® 10, providing developers with an effective way to integrate high quality, secure, real-time, voice, video and messaging into their own BlackBerry 10 applications.
“The Open Peer SDK for BlackBerry 10 enables a completely new generation of communications integration on the BlackBerry 10 platform,” explains Hookflash co-founder Erik Lagerway. “The Hookflash team has worked tirelessly to build this toolkit and port the WebRTC libraries to BlackBerry 10. BlackBerry developers and enterprise customers can now integrate high quality, real-time, peer-to-peer (P2P), voice, video and messaging into their own BlackBerry 10 applications. People just want good quality voice, video and text communications embedded in whatever they’re doing. Open Peer enables progressive developers in medical, finance, gaming, travel and many other verticals with this next evolution of integrated P2P communications on BlackBerry 10 smartphones.”
“BlackBerry is committed to our app partners through an open ecosystem, strong platform and commitment to supporting innovation and invention,” said Martyn Mallick, VP of Global Alliances and Business Development at BlackBerry. “We are pleased to have Hookflash bring Open Peer to BlackBerry 10, enabling developers to add rich peer-to-peer communications in their apps, and enhance the customer experience.”
The Open Peer SDK for BlackBerry 10 is the most recent addition to the Open Peer, open source family of real-time P2P communications toolkits. The BlackBerry 10 SDK joins the existing C++ and iOS SDKs already available. Mobile developers creating applications across multiple platforms can now leverage the suite of Open Peer toolkits to deliver real-time P2P communications for all of their applications. The Open Peer SDKs are available in open source and can be found on Github (http://github.com/openpeer/).
Hookflash is a globally distributed software development team building “Open Peer”, a new “open” video, voice and messaging specification and software for mobile platforms and web browsers. Open Peer enables an important new evolution of communications: “Open”, for developers and customers to create with. “Over-the-top” via the Internet, where users control their economics and quality of service. “Federated Identity”, so users can find and connect without the limitations of service providers’ walled gardens and operating systems. And “Integrated”, with communications as a native function in software and applications. The previous accomplishments of Hookflash founders, lead developers and advisors include creating the world’s most popular softphones, building audio technology acquired and used in Skype, creating technology acquired and open sourced by Google to create WebRTC, and engaging in WebRTC standards development in the IETF and W3C.
Developers can register at (http://hookflash.com/signup) to start using the Open Peer SDK today.
For more information and an Open Peer/WebRTC white paper, please visit Hookflash: http://hookflash.com
855-HOOKFLASH (466-5352) ext 1
Hookflash enables real-time social, mobile, and WebRTC communications with “Open Peer” for integration of voice, video, messaging and federated identity into world leading software, enterprise, applications, networks, mobile and computing devices. Hookflash and Open Peer are trademarks of Hookflash Inc. BlackBerry and related trademarks, names and logos are the property of Research In Motion Limited. BlackBerry is not responsible for any third-party products or services. Skype is a trademark of Microsoft. Google is a trademark of Google. Other company and product names may be trademarks of their respective owners.
(full disclosure, I work for Hookflash)
SDP the WebRTC Boat Anchor
I originally created the last blog post on why I have a really strong dislike for SDP in WebRTC / RTCWEB. I was asked by Justin Uberti to repost my sentiments to the RTCWEB IETF mailing list. This is that summary, which I will be posting to the mailing list; apologies in advance for the length.
I’ll try to compile my various points as to why I think we need a lower level RTCWEB API that does not include SDP.
My issues with SDP can be summarized as:
- unneeded – much too high level an API
- arcane format – legacy and problematic
- lack of API contract
- doesn’t truly solve goal of compatibility to legacy systems
Some will argue we need a higher level exchange “blob” like structured data (like SDP) to get two browsers talking to each other and that it makes it simple and it’s advantageous. I think they are horribly mistaken on this front.
SDP isn’t just an opaque blob; it’s offer/answer but I’ll get to offer/answer problems later.
To address the first point: do we need such an exchange SDP “blob” format in the first place? All media can be done without SDP given an intelligent stream API. No opaque blob is necessary at all. An API already exists to create these streams (albeit somewhat lacking if we remove the SDP ‘blob’). This API helps “simplify” creating this blob for later exchange. But the blob is truly not needed. Each side could in fact create the desired streams, pass in the appropriate media information such as codecs and ICE candidates and choose the socket pair to multiplex upon. Yes, it’s a bit more low level but it certainly can be done (and cleanly).
The larger issue: should it be lower level? Yes, in my opinion it should be. Some might say it will be too complicated for “web developers”. Nonsense! Web developers are a smart bunch. Libraries like jQuery, Promises, Node, WebGL, etc. are a testament to their capabilities. The argument that it would be too difficult doesn’t wash with me. Obviously, wrappers can and will be created to simplify access for those who want higher level access. Already a half-dozen companies are vying for position as “simplified” access to RTCWEB above what the current “simplified SDP” is doing already.
The API needs to be at a lower level. A lower level API will increase compatibility by lowering the barrier to entry to just what is needed to be RTCWEB compliant. A lower level API will allow for more innovative thinking when building new technologies and combinations of ideas. This will not be the case if we limit ourselves by mandating SDP in WebRTC.
Imagine being able to create streams as desired and mix and match the “pin connections” of the streams however needed via mixers, and then allowing for final rendering. So long as two peers agree to communicate (via ICE) and access to sensitive assets like cameras/microphones is granted, why shouldn’t the manipulation of the streams be up to the programmer?
That’s the API we need. I not only prefer it but I think it’s vital to the success of WebRTC. There’s no SDP offer/answer needed. There’s no shortage of really smart people out there who would know how to produce a great API proposal. There’s no security threat introduced by managing streams with a solid API.
The SDP format itself is arcane and rooted in old world legacy reasoning. One of the primary goals was to be within a really tiny packet MTU. It’s difficult to extend and the parse rules are all over the map.
RTCWEB intentionally did not dictate a stack signaling protocol – and thank you! This ensures RTCWEB isn’t tied to SIP, XMPP/Jingle or perhaps Skype on the IE browsers. That allows for untold possible signaling scenarios, from silo websites using simple web sockets to supporting other future protocols (like Open Peer).
But SDP is a signaling protocol, just at the media level. Boo! You are forcing all protocols in the future into an offer/answer model. Despite what some might say, it’s not the only model. For example, Open Peer is stateless between peers. Each side provides its expectations of what it can send and what it expects to receive and a connection is formed based on that information. Further, we do not renegotiate streams, at least not in the SDP sense.
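To make the stateless idea concrete, here is a minimal Python sketch. It is not the Open Peer wire format: the `Declaration` type, its fields and the `streams_to_open` helper are hypothetical names invented for illustration. Each peer publishes what it can send and what it is willing to receive, and either side can work out the resulting streams from the two declarations alone, without tracking any offer/answer state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    """What one peer says about itself (hypothetical structure)."""
    can_send: frozenset      # codecs this peer is able to send
    will_receive: frozenset  # codecs this peer is willing to receive


def streams_to_open(local: Declaration, remote: Declaration) -> dict:
    """Derive both directions from the two declarations, with no shared state."""
    return {
        "local_to_remote": sorted(local.can_send & remote.will_receive),
        "remote_to_local": sorted(remote.can_send & local.will_receive),
    }


alice = Declaration(frozenset({"opus", "vp8"}), frozenset({"opus", "g711", "vp8"}))
bob = Declaration(frozenset({"opus", "g711"}), frozenset({"opus"}))

result = streams_to_open(alice, bob)
print(result)
```

Because the function depends only on the two declarations, either peer can re-publish its declaration at any time and both sides recompute independently; there is no pending offer to roll back.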
I would agree with Microsoft’s argument that SDP creates a brittle protocol. Future changes made in reaction to what’s going on locally will force renegotiations, and with them the need to maintain a paired state with the remote side (including the complexities should both sides renegotiate simultaneously). If you get into multiple party handshakes, it can be hell. We made a conscious decision to be stateless/independent in Open Peer, based on our expertise and history when working with SIP. With SDP offer/answer, I’ll be forced to use this unnatural state machine just to support offer/answer and forever hamper our ability to dynamically change offers until fully accepted in a round trip from the remote party (not to mention the complexities in group conversations). Offer / answer will add untold headaches, but I don’t want to complain just because it’s “harder” for us. It’s not just programming difficulties; it breaks our concept of a good P2P on the wire signaling protocol.
The browser vendors have to create an API anyway – from the “initiator” side to be able to get an SDP blob. But the receiver side gets all sorts of implied logic and behavior when it receives this blob. Whatever untold elements exist in the SDP, it must understand (or reject even though it might be compatible). Some might say “wonderful” as it makes it all free/easy. But like the saying goes, free has a cost. The cost is a loss of control over what behaviours are wanted by the receiving party. The remote browser has to support all features that exist in the SDP, no matter how crazy and who is to say the SDP is going to the browser anyway? You’ll get versioning issues across SDP, all bundled into the offer / answer state machine.
Developers will sneak in additional features that our protocol doesn’t want, or can’t support across devices, by turning on features locally in SDP for their website. These new bits get thrown into the SDP bundle and sent to a remote party that is unaware of the new feature and won’t support it (at least not properly). That local website might be very happy with their new feature, as it will work “for them”, but they’ll break federation to other sites. Some cleverly piggybacked features (inside SDP) will be hidden from the protocol layer that transports them, which could bring down services across domains.
What will happen when browser vendors attempt to add features? They’ll add them, but another problem will emerge. All new features will likely end up being expressed in the SDP somehow. Those things will break services, and testing with a limited set of RTCWEB compliant browsers and devices will be insufficient. This will explode compatibility problems well beyond anything the browser vendors could imagine. Basically, it will become very risky to innovate at the browser level.
With the stream primitives only, I can build whatever state machines are needed for the particular features I need (from none when I’m just swapping codecs, to more complex when I want to do dynamic re-pinning of i/o to untold mixers). Now, throw SDP in there and we have to baseline all that into a common understanding of what it means, which either disallows me from doing it because it’s unsupported in the SDP or forces me into a hybrid where I transport this SDP with all my additional information and have to coordinate my state machine with the browser’s offer/answer state machine.
In Open Peer, we set up new streams (on same sockets) and quick-swap behind the scenes if the media must change. This greatly simplifies our model for us. Would I impose our model on anyone else? No. I’m not advocating a stateless SDP model either. Again, I’m saying no media signaling is required at all.
An API only would lower the bar of browsers being able to interoperate at the media level, since they only have to be compatible at the lower stream level. This removes the concerns about SDP compatibility issues (including the untold extensions that will happen to handle more powerful features and all that it implies and complex behaviours associated with SDP offer/answer, including rollback and ‘m=’ stability). Further, streams are easy to make compatible even if individually their API sets aren’t up to par to their counterparts across browsers and devices.
This also solves an issue regarding the data channel. There is no need for the data channel to be tied to an offer/answer exchange in the media at all. They are separate things entirely (as well they should be). For example, in Open Peer’s case the data channel gets formed well in advance of media to maintain our document subscription/notification model between peers, and media channels are opened, closed and swapped as required.
The moment we use SDP we have to address one fundamental issue. Is the SDP blob meant to be transported in full format only to the destination “as is” or is it okay to mess with the SDP by intermediates? This is extremely important. SDP is not just a format. It is an extendable specification. This means it will be extended in unknown ways and as such those extensions can affect behavior and add/change or remove functionality.
The protocol we have written won’t use SDP (unless forced by RTCWEB). We’d prefer to transport the information exchange ourselves via a more palatable and future thinking method than what SDP allows (in fact, we do use a JSON format on the wire). This means we’d likely parse and tear apart the information contained inside the SDP and then reassemble a new SDP on the remote side. But that would be a very BAD idea for us to do in practice and likely force us to use the SDP format forever and deliver this SDP blob inside our JSON format “as is”.
The reason why it’s bad to disassemble/reassemble the SDP is because it can be arbitrarily extended at will without knowledge of what is going on internally. New features could be added to the SDP without it being understood. This might sound like a beneficial feature but it’s actually dangerous.
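The danger can be shown with a toy Python sketch. It is a deliberately trimmed illustration, not a real SDP parser: the blob omits mandatory lines, and `a=x-vendor-feature` is a made-up extension attribute. An intermediary that rebuilds the SDP from only the attributes it understands silently drops whatever it does not recognise.

```python
KNOWN_ATTRIBUTES = {"rtpmap", "sendrecv"}  # all this toy intermediary understands


def rebuild(sdp: str) -> str:
    """Parse an SDP blob and reassemble it, keeping only understood attributes."""
    kept = []
    for line in sdp.splitlines():
        if line.startswith("a="):
            name = line[2:].split(":", 1)[0]
            if name not in KNOWN_ATTRIBUTES:
                continue  # the unknown extension is silently lost here
        kept.append(line)
    return "\n".join(kept)


offer = "\n".join([
    "v=0",
    "m=audio 49170 RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
    "a=x-vendor-feature:on",  # made-up extension attribute
    "a=sendrecv",
])

rebuilt = rebuild(offer)
dropped = [line for line in offer.splitlines() if line not in rebuilt.splitlines()]
print(dropped)
```

The sending side still believes the feature was offered, while the receiving side never sees it, which is exactly the kind of mismatch an explicit API contract would surface.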
With an API, it’s a contract. You don’t change the contract arbitrarily because it has implications. An API can be extended, but with the current API you know exactly what you get thus you can predict behavior. SDP is not such. It can be changed arbitrarily and there’s no guarantee the two will match. Intermediates can and will mess with the SDP (as demonstrated in the SIP world). If you allow 3rd party people like me to modify it, we’ll lose the additional “features” in the name of transforming SDP into something palatable/compatible. But we really shouldn’t, as it’s way too dangerous; that means SDP will be imposed at the signaling protocol layer on us even though we don’t want it.
If you allow modifications, SDP is a compatibility nightmare. I was the original author of the popular softphone client (X-Lite) and I understand the compatibility issues that happen with SDP/SIP. It was modified many different ways and extended over and over, and in crazy ways. People couldn’t even get basic things like ICE right, let alone all those crazy things they did to SDP across vendors. It was a mess. Everyone extended it in every which way imaginable and it created a nightmare of issues; the formats just weren’t compatible.
As a side note, SBC (Session Border Controllers) vendors probably love SDP. That means job security. They would constantly rewrite SDP between end points to ensure ‘compatibility’. This is what RTCWEB has to look forward to if we adopt SDP permanently. Yes, be very afraid.
If browsers use SDP, they inherit this mess and explode it. And compatibility issues won’t be limited to browser SDPs. There is a huge swath of legacy systems that will deliver a mess of SDP to the browsers and SBCs that will manipulate the SDP in untold ways. This will reintroduce the problems with SIP into the world of RTCWEB. The browsers will be locked to those legacy systems and they will drag the browser back like a ball and chain and it will limit the browser’s ability to innovate.
I can’t stress this enough: SDP is not required for compatibility. In fact, it’s a hindrance to it. I think the SIP vendors think if the browsers use SDP they will gain lots more compatibility. They won’t. They are better off writing a JS library that talks the SDP they understand if they want SDP, rather than trying to mix/match browser SDP into the mix. SDP that is likely offered by RTCWEB will not be compatible anyway with many (perhaps most) legacy systems out there. Many of these systems still don’t support the latest ICE specifications and it’s been a long time standard. Imagine RTCWEB being tied down by these systems later and breaking millions of end points during a casual update of the browser.
SDP binds things that don’t need to be bound. We can negotiate all the streams independently. Maybe we want 6 video streams then suddenly want to change that to zero. Why do we need to preserve six dead video media SDP lines? With SDP, we force them all bound to this SDP bundle that has to be negotiated together. It’s wholly unneeded and doesn’t allow flexibility of the streams.
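A small Python sketch of the difference follows. Under the offer/answer rules in RFC 3264, a removed stream keeps its m= line with the port set to zero, so the session description never shrinks. The helper names, port numbers and payload types here are illustrative assumptions, and the m= lines are simplified.

```python
def remove_all_video_sdp(m_lines: list) -> list:
    """Offer/answer style: removed media keeps its slot, with the port set to 0."""
    updated = []
    for line in m_lines:
        kind, _port, rest = line.split(" ", 2)
        if kind == "m=video":
            updated.append(f"{kind} 0 {rest}")
        else:
            updated.append(line)
    return updated


def remove_all_video_independent(streams: list) -> list:
    """Independent streams: a closed stream simply disappears from the list."""
    return [s for s in streams if s["kind"] != "video"]


sdp_media = ["m=audio 5000 RTP/AVP 0"] + [
    f"m=video {5002 + 2 * i} RTP/AVP 96" for i in range(6)
]
streams = [{"kind": "audio"}] + [{"kind": "video"} for _ in range(6)]

after_sdp = remove_all_video_sdp(sdp_media)
after_independent = remove_all_video_independent(streams)
print(len(after_sdp), len(after_independent))
```

The SDP list still carries seven entries, six of them dead, while the independently managed list is down to the single audio stream.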
I’m going to take a draft example of what I mean:
Nothing wrong with the draft in an SDP/SIP mindset but I’m going to take it from a totally different non-SDP angle. I have to say, the ideas presented are very good. I appreciate FEC, and synchronizing streams is cool. But SDP isn’t needed to do it. Let me as the programmer worry about how to manage streams and the features on the streams and associations between the streams via an API only.
Points 4, 5 and 6 in the specification all have to do with the complexities of having to describe the intentions of mixing in SDP. So no comment beyond “don’t use SDP”.
As for 7.1 – “this is because the sender chooses the SSRC” – only true because we are forced to use SDP and the assumption is that it’s SIP. We could have the receiver dictate what the sender should use in advance of any media. In our case, we establish in advance what we want from all parties before even “ringing” the other party. We do not have SSRC collisions as we reversed the scenario, allowing the receiver to pick the expected SSRC. Coordinating the streams is a problem with SIP because of how they do forking/conferencing but not for Open Peer. We do not fork like they do. We negotiate each location independently and statelessly. This specification forces this issue on Open Peer. If SIP has problems with streams arriving early to their stateful offer/answer then let them worry about “how” they intend to match the streams at a higher SDP layer and move this draft out of the RTCWEB track and onto the SIP track. To be clear, the proposal seems entirely reasonable and intelligent for SIP/SDP. But it’s way too SIP-centric for general purpose.
On that note, what I do need in the API is an ability to dictate the SSRC when I create an RTP stream for sending (should I care to do that).
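The receiver-picks-the-SSRC idea can be sketched in a few lines of Python. This is an illustration of the principle only, not Open Peer code; `ReceiverSsrcAllocator` and the sender names are hypothetical. The receiver hands each expected sender a distinct 32-bit SSRC before any media flows, so two senders can never collide on arrival.

```python
import random


class ReceiverSsrcAllocator:
    """Receiver-side registry that assigns a distinct SSRC to each expected sender."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._by_sender = {}

    def assign(self, sender_id: str) -> int:
        # Asking again for the same sender returns the SSRC already chosen.
        if sender_id in self._by_sender:
            return self._by_sender[sender_id]
        used = set(self._by_sender.values())
        while True:
            ssrc = self._rng.getrandbits(32)  # SSRC is a 32-bit RTP field
            if ssrc not in used:
                self._by_sender[sender_id] = ssrc
                return ssrc


allocator = ReceiverSsrcAllocator(seed=1)
expected = {peer: allocator.assign(peer) for peer in ("alice", "bob", "carol")}
print(expected)
```

Each sender would then need an API hook to create its outgoing RTP stream with the SSRC it was given, which is the capability asked for above.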
7.2 Multiple render
Again this is an issue of SIP/SDP. We can control the SSRCs to split them out to allow multiplexing easily on the same RTP ports with multiple parties/sources. If we were just given the primitives to control the streams, this specification could be used to dictate how to negotiate issues in their space.
7.2.1 I’m feeling the pain. How about just giving me an API where I can indicate what streams are FEC associated.
7.3 Give me API to give crypto keys to RTP layer. Let me handle the fingerprint and security myself beyond that.
8. Let’s just say politely that I would not want to be the developer assigned to programming around all this stuff.
Again, a perfect illustration why I don’t want SDP.
Media is complicated for good reason as there are many untold use cases. The entire IETF/W3C discussion around video constraints illustrates some of the complexities and competing desires for just one single media type. If we tie ourselves to SDP we are limiting ourselves big time, and some of the cool future stuff will be horribly hampered by it.
To conclude, can I work around SDP? Sure, just like browsers can patch around IE 6.0. But I believe that having a more media stream centric API without the SDP offer/answer will simplify your release and increase compatibility, allow for newer and stronger protocols and more importantly allow many future capabilities that others can’t imagine if you don’t attach the boat anchor that is SDP.
Microsoft feels so strongly against the current RTCWEB path they decided to go the way of counter proposal with CU-RTC Web. I can’t comment on their specification (yet) but I can understand their strong sentiment.
For those who don’t know me, I’m the Chief Architect at Hookflash and author of the new P2P protocol, Open Peer. I used to be the CTO / Chief Scientist at Xten, now CounterPath, and I’m the original author of the X-Lite/X-PRO/eyeBeam SIP softphone clients. I wish I had given my feedback earlier on this subject. To be honest, following and participating in the standards tracks requires a huge time commitment, which I’ve not been able to give, and perhaps I thought (naively so) the smart people in the IETF would naturally make the wisest choices, in the best interests of all! As you can tell, I am not at all happy with where we sit today. RTCWEB / WebRTC & SDP… brutal.
NOTE: Will be cross posted to RTCWEB IETF mailing list.
Update: IETF + W3C Interim Feb 5-7: WebRTC, RTCWEB and SDP
Although we could not participate (conflict in schedule), it would seem as though there is some progress being made:
– SDP (decision made to try Plan A, if that fails try Plan B, at least we got that far)
– dataChannel (createDataChannel before createOffer or createAnswer, some talk around supporting defined protocols, decent progress here)
– trickle ICE (more definition to the protocol, great progress here)
Here are the meeting recordings from Feb 6th, and Feb 7th.
(thanks to Cullen Jennings for posting this to the mail list)
Participants in the WebRTC & RTCWEB working groups will be in Boston Feb 5-7, hosted by Acme Packet.
Topics for discussion will revolve mainly around SDP & NAT traversal – namely trickle ICE. What’s trickle ICE? Basically, with traditional “ICE” we have to gather all the candidates before we start negotiation, whereas with “trickle ICE” we would gather candidates while we negotiate, which means setup time for calls could be markedly reduced.
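A rough back-of-the-envelope model in Python shows where the saving comes from. The gathering and signaling times below are invented for illustration, and the model is a simplification: it only asks when the remote peer can first start connectivity checks, ignoring the checks themselves.

```python
def first_check_time_traditional(gather_ms: dict, signaling_ms: int) -> int:
    """All candidates are gathered before anything is sent to the remote peer."""
    return max(gather_ms.values()) + signaling_ms


def first_check_time_trickle(gather_ms: dict, signaling_ms: int) -> int:
    """Each candidate is sent as soon as it is gathered."""
    return min(gather_ms.values()) + signaling_ms


# Hypothetical gathering times in milliseconds, for illustration only.
gather_ms = {"host": 5, "server-reflexive": 120, "relay": 900}
signaling_ms = 80

traditional = first_check_time_traditional(gather_ms, signaling_ms)
trickle = first_check_time_trickle(gather_ms, signaling_ms)
print(traditional, trickle)
```

With these made-up numbers the traditional flow waits on the slowest (relay) candidate before signaling anything, while the trickle flow can begin as soon as the first host candidate is available.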
What we will likely not see any forward movement on at this interim meeting is any decision around MTI video codec(s).
From the IETF mail list…
Ted Hardie offered this up as a reading list…
You may also want to look at the proceedings from the Atlanta meeting;
in particular, this was suggested as good background reading:
Cullen Jennings added to that…
Some more background reading that is useful context
and Emil Ivov added more re: trickle ICE…
Just a quick note to let you know that the trickle ICE draft has been
updated as per the discussions in Atlanta’s MMUSIC session. A new draft
describing trickle ICE’s usage with SIP has also been submitted:
Hookflash posts now on Tumblr
I have decided to stop posting Hookflash content here; all of my new Hookflash posts will appear over at Tumblr.
Hookflash invites going out today
Just a quick update… We are launching the Hookflash invite preview this week. Yay! First group of users have already begun receiving their invites. There will be 60 invites going out this week so keep an eye out for it.
Remember, if you don’t make it into the first round of invites, don’t be dismayed; there will be more coming as Hookflash for iPad makes its way into the app store. Down the road we’ll pick another batch of users from the invite list, so don’t give up just yet 🙂
If you’d like to lend us a hand, please take this quick 3 question survey here: Hookflash Invite Preview Survey. Your answers will give us more information on how we can help you best, and we’ll use the feedback to plan our feature roadmap.
There were a few temporary issues that arose and delayed some of our planned features in this early release. Here is the revised initial set of features at launch:
Social Sign-in – Why would you want another user ID to manage?
Group Text – Across your entire social address book
HD Audio Calls – Call anyone across your entire social network
HD Video Calls – Face the name you see every day
Social Context – Recent activity per contact
Social Caller ID – Context for your inbound call
Social Feed – What’s happening in your world
Looking forward to seeing you in Hookflash!
The Hookflash Team