Filter by topic and date
WebRTC technologies prove to be essential during pandemic
- Grant Gross IETF Blog Reporter
8 Dec 2020
WebRTC may arguably be the most important set of technologies used during the COVID-19 pandemic. All web-based videoconferencing services make use of WebRTC, a large set of technologies that allow web browsers to make voice, video, and real-time data calls. WebRTC protocols were developed at the IETF.
The IETF Blog recently interviewed Adam Roach, who edited several of the WebRTC specifications in the IETF, was part of the team that implemented WebRTC in Firefox, and served as area director for the Applications and Real-Time area in the IETF from 2017 to 2020. This is an edited version of that email conversation.
IETF Blog: Can you explain what WebRTC is and what it does?
Roach: There are a few different answers to that question. Technically, WebRTC is a set of technologies that allow web browsers to make voice, video, and real-time data calls, both to other browsers, and to non-browser endpoints. It includes a lot of wire protocols that were developed in the IETF, along with web browser APIs (developed at the World Wide Web Consortium) that allow web pages to make use of those protocols.
Less formally, people often use “WebRTC” to also refer to Google's WebRTC library, which is the actual implementation of those IETF wire protocols, and which has been incorporated into uncountable apps—mostly mobile apps—to enable real-time voice and video communication. For example, WhatsApp, Google Duo, SnapChat, and Discord's mobile apps all include the WebRTC.org stack. But the key commonality is that these apps, if they chose to, could stream voice and video to and from all modern web browsers without having to perform any expensive transformation of the underlying network streams.
IETF Blog: How widely are the WebRTC standards used?
Roach: WebRTC's adoption is huge. Most modern services that use voice or video are either based on the WebRTC protocols, or have the ability to use them in addition to the native protocols they originally deployed with. Webex, for example, has a WebRTC client that lets people participate in conferences directly from their browser without downloading any additional software. And newer services, like whereby.com and Jitsi, have been natively based on WebRTC from their outset. Even when no web browser is involved, major services are using WebRTC for video transmission; for example, the service that runs on Amazon's devices that lets you view webcams and doorbell cameras uses WebRTC to receive video from them. And, increasingly, new Internet of Things products that want to stream voice and/or video are basing their network stack on the WebRTC protocols.
IETF Blog: What Web-based conferencing tools use WebRTC-based tools?
Roach: Today, all web-based conferencing tools use WebRTC. There used to be a time where web browsers had an API called “NPAPI” that would allow third-party software to run native code as browser plugins, and this is how the earliest web-based conferencing worked. This is, for example, how the first version of Google Hangouts was implemented. But over time—as web browsers became more capable and gradually replaced all of the functionality that third-party plugins needed with native web APIs—the NPAPI plugin has been increasingly deprecated, and gradually removed. The last vestiges of NPAPI will be removed from all major modern browsers in the next couple of months, coinciding with Adobe's end-of-life plan for Flash. And with that removal, the only way to run voice and video conferences in browsers will be WebRTC.
IETF Blog: What’s the importance of WebRTC?
Roach: The development of WebRTC did several things that moved the state of the art for voice and video communications forward. The most obvious, user-facing benefit is the ability to participate in voice and video sessions directly from a web browser without having to download any additional software.
But more importantly, WebRTC mandated that all voice, video, and real-time data sent across the network must be encrypted, and that the encryption key negotiation must take place directly between endpoints rather than sending the media encryption key to every server involved in setting up the call. This is a gigantic leap forward for voice and video communications over the Internet, as all applications that use WebRTC now ensure confidentiality as they pass through local networks (like coffee shop Wi-Fi), Internet service providers, and core Internet backbone services.
And, unlike previous attempts to require encryption, the WebRTC protocol didn't just make it mandatory to implement; it made it actually mandatory to use.
IETF Blog: Does the COVID-19 pandemic highlight the importance of these standards?
Roach: Definitely. With huge segments of the population—from office workers to schoolchildren—using online video conferences literally on a daily basis, the importance of ensuring that you and your children can't be spied by casual network eavesdroppers has been thrown into stark relief. Literally the safety of your kids and the trade secrets of your company are at stake; and, if implemented in the way that they were before WebRTC, there's a very real risk that the protections that WebRTC requires wouldn't be part of these services.
Also, by picking a high-quality winner in the voice codec arena, WebRTC has paved the way for more productive videoconferencing. Multiple studies have concluded that the quality of audio codec used in communications can have a significant impact on cognitive processing, with lower-quality codecs causing non-trivial problems for the listener to perform thinking-related tasks, even when they are unrelated to the audio itself.
IETF Blog: WebRTC is a large cluster of related RFCs that define standards for different technologies and protocols. Has that made the management of the cluster difficult? What would you change if you had to do it all over again?
Roach: WebRTC is part of a group of documents known as “Cluster 238.” When the RFC Editor receives a set of documents that are mutually interdependent on each other, they will group them into numbered clusters that are processed by the RFC Production Center as a single unit. Typically, these clusters consist of two to five documents, with really large ones ranging up to 20 documents or so. Cluster 238 consists of 75 inter-related documents …
The size of the cluster has posed a unique problem for the RFC Production Center (RPC). Typically, one of the major reasons that such RFCs are published at the same time is that doing so allows the document editors working for the RPC to check for inconsistencies in terminology (and sometimes even in technical specification) between the various documents in a cluster. Of course, in a cluster as large as 238, this becomes a Herculean task; and as amazing as our RPC staff are, getting all of this right is an incredibly heavy lift for them.
I think one of the biggest things that would have helped would have been a very intentional curation, both by working group chairs and by area directors, of which documents normatively reference which other documents. A push towards minimizing such references, and a ruthless elimination of cyclical dependencies—even when this required moving large bodies of text from one document to another or even between working groups—would have led to a set of documents that could have been broken down into sub-clusters that were published incrementally.