This is the second WebRTC Standards Update coming to you from Dan Burnett and Amir Zmora.
Topics covered in this update are:
- Is there a security vulnerability in WebRTC with regard to detecting your IP address?
- The New Public Working Drafts of Media Capture and WebRTC
- Enhancing media functionality of WebRTC
I know your public and private IP, so what?
Highlights
There have recently been several discussions and blog posts about a security vulnerability in WebRTC. The concern expressed was that websites could detect the computer’s local and public addresses using WebRTC APIs. A demonstration of this can be found on GitHub and here.
A few examples of these concerns, and instructions on how to “address” this by blocking or disabling WebRTC can be found here, and here, and on many other websites (Google will help you find them).
On the other hand, a pragmatic blog post dated September 2013 puts things in perspective. At the end of the day, the concern people have with IP address exposure is that it reveals private information to the websites collecting it, but other, more user-specific information is already collected and exposed today.
Impact on my application
Some users may be blocking WebRTC, either through browser settings or through extensions. This may cause your WebRTC service to fail. Detecting this in your application and alerting the user could be a good way to reduce frustration and improve the user experience.
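As a minimal sketch of such a check, the snippet below looks for a peer connection constructor on a global object and returns a message for the user. The helper names (isWebRtcAvailable, webRtcStatusMessage) and the message text are hypothetical; the prefixed constructor names are the ones older Chrome and Firefox builds used. Note that some blockers leave the constructor in place and break it in other ways, so this only catches the simplest case.

```typescript
// Minimal shape of the global object we inspect; in a browser this would be `window`.
interface WebRtcGlobals {
  RTCPeerConnection?: unknown;
  webkitRTCPeerConnection?: unknown; // prefixed name used by older Chrome builds
  mozRTCPeerConnection?: unknown;    // prefixed name used by older Firefox builds
}

// Returns true when a peer connection constructor is exposed under any known name.
function isWebRtcAvailable(globals: WebRtcGlobals): boolean {
  const ctor =
    globals.RTCPeerConnection ??
    globals.webkitRTCPeerConnection ??
    globals.mozRTCPeerConnection;
  return typeof ctor === "function";
}

// Hypothetical helper: the text an application might show to the user.
function webRtcStatusMessage(globals: WebRtcGlobals): string {
  return isWebRtcAvailable(globals)
    ? "WebRTC is available."
    : "WebRTC appears to be blocked or unsupported in this browser; calls will not work.";
}
```

In a page you would pass the browser's global object, for example webRtcStatusMessage(window as WebRtcGlobals), and show the result before the user tries to start a call.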
Standards status
N/A. IETF is currently not making any changes due to this.
Details
A few weeks ago several publications reported privacy concerns with WebRTC. Specifically, the concern was that a website could gain access to the user’s public and private IP addresses without having asked the user for any sort of permission. This was particularly a concern when the user was using a split-tunnel VPN* where the WebRTC Peer Connection APIs could be used to determine the public IP address of the machine itself rather than just the address of the VPN.
* Split tunnel VPN means that the user is sending some of the traffic directly to the network while sending other selective traffic of specific applications through a VPN.
This is not new information to those working on the standard. It was understood that ICE would provide host, server reflexive, and relay addresses. This was documented in the WebRTC security architecture draft in section 5.4.
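To make the three address kinds concrete, here is a small sketch that reads the type and address out of an ICE candidate line as it appears in SDP. The function and type names are hypothetical, the sample addresses come from documentation ranges, and the field layout assumed is the one in the ICE specification (foundation, component, transport, priority, address, port, then "typ" and the type).

```typescript
type IceCandidateType = "host" | "srflx" | "prflx" | "relay";

interface ParsedCandidate {
  address: string;
  port: number;
  type: IceCandidateType;
}

function isIceCandidateType(value: string): value is IceCandidateType {
  return value === "host" || value === "srflx" || value === "prflx" || value === "relay";
}

// Parses "candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...".
// Returns null when the line does not follow that layout.
function parseIceCandidate(line: string): ParsedCandidate | null {
  const fields = line.trim().replace(/^a=/, "").replace(/^candidate:/, "").split(/\s+/);
  const address = fields[4];
  const portText = fields[5];
  const type = fields[7];
  if (address === undefined || portText === undefined || type === undefined) return null;
  if (fields[6] !== "typ" || !isIceCandidateType(type)) return null;
  const port = Number(portText);
  if (!Number.isInteger(port)) return null;
  return { address, port, type };
}
```

A host candidate carries the machine's local address, a server reflexive (srflx) candidate carries the public address seen by the STUN server, and a relay candidate carries the address of the TURN server. A page that collects these lines from a peer connection therefore sees both the private and public addresses discussed above.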
Actually, any application we install on PCs or mobile devices can do this as well. The general trend of HTML5 is to power the browser with capabilities only native applications had before; this is what WebRTC is all about. However, users of web browsers may not realize that any website they visit can do this.
Discussion will likely continue on the standards lists, but for now the likely outcome is no change to the standard but rather additional protections in the browsers. Chrome, for example, may address this via a configuration setting.
New Public Working Drafts of Media Capture and WebRTC
Highlights
There has been a lot of excitement lately about the publication of two WebRTC-related Working Drafts by the W3C. These are:
- Media Capture and Streams
- WebRTC 1.0: Real-time Communication Between Browsers
What does this mean?
Actually not that much. The day-to-day work of the W3C is done in Editor’s Drafts, but once in a while these are published as Public Working Drafts. This is the case here.
However, publication of these official documents is a requirement for progression towards a W3C Recommendation; this is simply how the process works. Significantly, the next such Working Draft for the Media Capture and Streams specification is likely to be a Last Call Working Draft, meaning that the specification is believed to be technically complete. This is very close now, since only a handful of smaller issues remain to be addressed.
Impact on my application
It is good news that one of the core WebRTC specifications is getting closer to a finalized W3C Recommendation, but we are not there just yet.
Standards status
The hope is that the next publication of the Media Capture document will be a Last Call Working Draft, which would be one step closer to a W3C Recommendation if all goes well.
Enhancing media functionality of WebRTC
Highlights
The group at the W3C working on the Media Capture and Streams specification (the core WebRTC specification for media APIs at the W3C) has just published First Public Working Drafts on media-related topics of interest to many WebRTC fans. The new specifications are:
- Audio Output Devices API
- Screen Capture
- Media Capture from DOM Elements – these DOM elements are the JavaScript objects associated with the <audio>, <video> and <canvas> HTML document elements
An important point to note is that this functionality could have been part of the core WebRTC specification documents but was instead added as extensions, via external documents.
This allows the core functionality of WebRTC to be finalized without delaying it each time new functionality needs to be added. This is somewhat similar to the way of working in SIP, where extensions were made to the core SIP RFC 3261.
Impact on my application
These specifications add long-desired media capabilities to WebRTC.
Standards status
These are new documents at W3C that have just begun the standards process. It is hard to predict at this time how long it will take to complete the process.
Details
Audio Output Devices API
The Audio Output Devices API allows selecting which output device should be used to play audio. Selecting an input device was already possible (e.g. which mic or camera to use), but selection of a specific output device was missing.
Technically speaking, this extension adds a new sinkId to <audio> and <video> elements to indicate which output device should get the audio (in case of video, the audio of the video). The identifier is a deviceId as used in the Media Capture APIs.
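A rough sketch of how an application might use this is shown below. It assumes the element exposes a setSinkId(deviceId) method that returns a promise, alongside the sinkId attribute; that method name is our reading of the draft rather than something stated above, and the helper name requestAudioOutput and the minimal interface are hypothetical. The helper returns null when the browser does not support output selection, so the caller can fall back to the default device.

```typescript
// Minimal shape of an <audio> or <video> element for this sketch.
// Assumption: the element exposes setSinkId(deviceId) next to the sinkId attribute.
interface SinkSelectable {
  sinkId?: string;
  setSinkId?: (deviceId: string) => Promise<void>;
}

// Asks the element to play its audio on the given output device.
// Returns the pending promise, or null when output selection is not supported.
function requestAudioOutput(element: SinkSelectable, deviceId: string): Promise<void> | null {
  if (typeof element.setSinkId !== "function") {
    return null; // Older browser: audio keeps playing on the default device.
  }
  return element.setSinkId(deviceId);
}
```

The deviceId passed in would be one of the identifiers the Media Capture APIs report for output devices. The returned promise may reject, for example if the user has not granted access to that device, so real code should handle that case.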
Screen Capture
Screen capture is implemented today in some WebRTC services, but this functionality is not done in a standard way; rather, various “hacks” are used that are not always efficient and may break as browser versions are updated.
In Chrome, screen capture currently requires the user to install an extension. The extension is verified by Google, and the web application uses it to perform the capture.
The standard is heading towards a whitelist of web applications, which is how this is implemented today in Firefox. Google supports the standard moving in this direction.
In Firefox’s current implementation of this feature there are flags you need to configure to enable screen sharing.
The specification creates a new media capture method, getOutputMedia(), that can capture the monitor, a window, an application, or the browser. The capture can cover only the visible portion of what was requested or the whole frame.
Media Capture from DOM Elements
Media Capture from DOM Elements allows you to capture HTML document media (media on a web page), such as audio being played to the speaker, video from a recorded file, or things drawn on a canvas, and do something with it: send it to a peer connection or change it.
Being a bit more technical, it defines a new captureStream() method on <audio>, <video>, and <canvas> HTML document elements that captures the output from the elements as a MediaStream suitable for use with other WebRTC APIs.
Similar to media received through getUserMedia(), it can be sent to a peer connection or manipulated.
As an example use case, one could add funny elements over an image displayed on a web page. This was possible before through sampling of the canvas, but with these new APIs the task becomes easier.
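The capture step can be sketched roughly as follows. The helper name captureElementStream and the minimal interface are hypothetical, and the optional frame rate argument is an assumption that applies to <canvas> capture only. The stream type is left generic so the sketch does not depend on browser type definitions.

```typescript
// Minimal shape of an <audio>, <video>, or <canvas> element for this sketch.
interface CapturableElement<TStream> {
  captureStream?: (frameRate?: number) => TStream;
}

// Captures the element's output as a stream, or returns null when the
// browser does not implement captureStream() on this element.
function captureElementStream<TStream>(
  element: CapturableElement<TStream>,
  frameRate?: number
): TStream | null {
  if (typeof element.captureStream !== "function") {
    return null;
  }
  // Only pass a frame rate when the caller asked for one (assumed canvas-only).
  return frameRate === undefined
    ? element.captureStream()
    : element.captureStream(frameRate);
}
```

In a browser, the returned stream would be a MediaStream whose tracks can be added to a peer connection in the same way as tracks obtained from getUserMedia().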
Aswath Rao says
Would the following proposal address the concerns of people concerned with web app getting access to host and server/peer reflexive addresses? Has this been discussed among standards representatives?
The proposal is users can instruct their browsers to inhibit STUN interaction and not include these addresses in ICE candidates. Additionally the standards can allow users to provision an “RTP Proxy” much like one can use SOCKS. These restrictions can be applied based on the domain where the app is running. So an internal application can benefit from using host or one of the reflexive addresses, without the safety concerns.
Amir Zmora says
These are good ideas for the browsers to implement, and they are likely to implement controls along the lines of what you are suggesting.
The standards groups have indeed been considering ideas such as this and are largely in agreement with you that these concerns need to be addressed by the users’ agents, the browsers, rather than in the standards protocols or APIs themselves.
Dan & Amir
Philipp Hancke says
It’s worth pointing out that the “webrtcblock” extension turned out to be not working… and is unmaintained so it was not even updated to address an attack it could have mitigated.
Amir Zmora says
Thanks Philipp.