What is WebRTC?

WebRTC (Web Real-Time Communication) is an open-source project that provides real-time communication (RTC) to web browsers, mobile devices, and other applications via application programming interfaces (APIs). It enables audio and video communication inside web pages or programs by allowing direct peer-to-peer communication without the need for plugins or native applications. The WebRTC specification has been published by the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF) and is supported by all major web browsers.

How does WebRTC work?

WebRTC implements two main technologies: media capture devices for accessing video cameras, microphones, and screen capture, and peer-to-peer connectivity. WebRTC is a group of JavaScript APIs that enable:

  • access to local audio and video streams, such as camera and microphone
  • audio and video communication between peers by handling the signal processing, encoding, communication, security and bandwidth management
  • bidirectional transmission of arbitrary data between peers
  • gathering statistics about the WebRTC connections.
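
As a rough illustration of the last item, the sketch below totals the bytes received across inbound media streams. The helper name summarizeInbound is hypothetical, and the sketch assumes the report passed in is the Map-like RTCStatsReport that peerConnection.getStats() resolves to.

```javascript
// Hypothetical helper: sums bytesReceived per media kind from a stats report.
// Assumes `report` is Map-like (has forEach), as RTCStatsReport is.
function summarizeInbound(report) {
    const totals = {audio: 0, video: 0};
    report.forEach(stat => {
        // 'inbound-rtp' entries describe media received from the remote peer.
        if (stat.type === 'inbound-rtp' && stat.kind in totals) {
            totals[stat.kind] += stat.bytesReceived || 0;
        }
    });
    return totals;
}

// Usage in a browser (peerConnection is assumed to exist already):
//   const report = await peerConnection.getStats();
//   console.log(summarizeInbound(report));
```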

WebRTC does not, however, include functionality for signaling, that is, for discovering remote peers and coordinating the connection between them. Applications that use WebRTC rely on Interactive Connectivity Establishment (ICE) to find a workable network path between peers with the help of public ICE servers, and use various protocols to manage WebRTC sessions, such as:

  • Session Initiation Protocol (SIP)
  • Extensible Messaging and Presence Protocol (XMPP)
  • Message Queuing Telemetry Transport (MQTT)
  • Matrix.
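
Whichever protocol carries them, signaling messages are ultimately small structured payloads. A minimal sketch, assuming a JSON envelope with hypothetical type and payload fields (WebRTC itself prescribes no such format):

```javascript
// Message kinds an application typically needs to exchange while signaling.
const SIGNAL_TYPES = ['offer', 'answer', 'candidate'];

// Serializes a signaling message to text for transport (hypothetical format).
function encodeSignal(type, payload) {
    if (!SIGNAL_TYPES.includes(type)) {
        throw new Error(`Unknown signal type: ${type}`);
    }
    return JSON.stringify({type, payload});
}

// Parses and validates a message received from the signaling transport.
function decodeSignal(text) {
    const message = JSON.parse(text);
    if (!SIGNAL_TYPES.includes(message.type)) {
        throw new Error(`Unknown signal type: ${message.type}`);
    }
    return message;
}
```

Validating the type on both ends keeps malformed or unexpected messages from reaching the code that drives the peer connection.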

Media devices

Media devices can be accessed with JavaScript through the navigator.mediaDevices object, which implements the MediaDevices interface. This object lets us enumerate all connected devices, listen for device changes, and open a device to retrieve a media stream from it.
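
For instance, enumerating the devices of one kind might look like the sketch below. The helper name getConnectedDevices and its mediaDevices parameter are illustrative; in a browser one would pass navigator.mediaDevices.

```javascript
// Returns the connected devices of the given kind:
// 'audioinput', 'audiooutput' or 'videoinput'.
async function getConnectedDevices(mediaDevices, kind) {
    const devices = await mediaDevices.enumerateDevices();
    return devices.filter(device => device.kind === kind);
}

// Usage in a browser:
//   const cameras = await getConnectedDevices(navigator.mediaDevices, 'videoinput');
//   console.log('Cameras found:', cameras);
```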

The most common way of accessing media devices' streams is through the getUserMedia() function, which returns a promise that resolves to the MediaStream for the matching devices. The MediaStreamConstraints object passed to it specifies the media devices and their requirements. Changes to the set of available media devices can be handled by listening for the devicechange event.

const constraints = {
    'audio': {'echoCancellation': true},
    'video': {
        'width': {'min': 800},
        'height': {'min': 600}
    }
};

const videoElement = document.querySelector('video#localVideo');

navigator.mediaDevices.getUserMedia(constraints)
    .then(stream => {
        console.log('Got MediaStream:', stream);
        // Attach the stream here: it only exists once the promise resolves.
        videoElement.srcObject = stream;
    })
    .catch(error => {
        console.error('Error accessing media devices.', error);
    });

navigator.mediaDevices.addEventListener('devicechange', event => {
    // Re-acquire the stream whenever a device is connected or disconnected.
    navigator.mediaDevices.getUserMedia(constraints)
        .then(stream => {
            console.log('Got MediaStream:', stream);
            videoElement.srcObject = stream;
        })
        .catch(error => {
            console.error('Error accessing media devices.', error);
        });
});

The MediaStream object represents a stream of media content that is split into multiple audio and video tracks (MediaStreamTrack). Each track can be individually muted and has its own source and properties.
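
From script, a track is silenced by toggling its enabled property. A minimal sketch, assuming a hypothetical helper named setAudioEnabled and a stream obtained from getUserMedia():

```javascript
// Enables or disables every audio track of a MediaStream.
// A disabled audio track produces silence, but the device stays open.
function setAudioEnabled(stream, enabled) {
    stream.getAudioTracks().forEach(track => {
        track.enabled = enabled;
    });
}

// Usage in a browser:
//   setAudioEnabled(stream, false); // mute the microphone
//   setAudioEnabled(stream, true);  // unmute it again
```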

Peer connections

WebRTC manages connections between two users through a peer-to-peer protocol that can transmit audio, video, or arbitrary binary data. To discover how two peers can connect, both clients need to provide an ICE server configuration. The server can be either a Session Traversal Utilities for NAT (STUN) or a Traversal Using Relays around NAT (TURN) server, and its role is to help each peer gather ICE candidates, the network addresses at which it may be reachable. The candidates are then exchanged between the two peers when they want to connect. This exchange takes place over a separate channel chosen by the application, such as HTTP or WebSocket, and is called signaling.

Each connection is handled by an RTCPeerConnection object and the constructor takes a single RTCConfiguration object as its parameter. The object defines how the connection is set up and contains information about ICE servers to use.

To establish a connection we create the RTCPeerConnection object and call the createOffer() method to create the RTCSessionDescription object that is sent over the signaling channel to the receiving side. We also create a listener to our signaling channel so that we can receive the answer to our offered session description.

async function makeCall() {
    // signalingChannel is supplied by the application; WebRTC does not define it.
    const configuration = {'iceServers': [{'urls': 'stun:stun.l.google.com:19302'}]};
    const peerConnection = new RTCPeerConnection(configuration);
    // Apply the remote peer's answer when it arrives.
    signalingChannel.addEventListener('message', async message => {
        if (message.answer) {
            const remoteDesc = new RTCSessionDescription(message.answer);
            await peerConnection.setRemoteDescription(remoteDesc);
        }
    });
    // Create the offer, store it locally, then send it to the remote peer.
    const offer = await peerConnection.createOffer();
    await peerConnection.setLocalDescription(offer);
    signalingChannel.send({'offer': offer});
}

On the receiving side, we wait for an incoming offer and extract the RTCSessionDescription object to set up the connection and send the answer back through the signaling channel.

const peerConnection = new RTCPeerConnection(configuration);
signalingChannel.addEventListener('message', async message => {
    if (message.offer) {
        await peerConnection.setRemoteDescription(new RTCSessionDescription(message.offer));
        const answer = await peerConnection.createAnswer();
        await peerConnection.setLocalDescription(answer);
        signalingChannel.send({'answer': answer});
    }
});

Once the two peers have set both the local and remote descriptions, they know the capabilities of the remote peer. For the connection to be established, they also need to exchange ICE candidates with each other.

Once an RTCPeerConnection object is created, the underlying framework uses the provided ICE servers to gather ICE candidates for connectivity establishment. Rather than waiting for gathering to finish, each peer can send its candidates to the other as soon as they are discovered, a technique called "trickle ICE" that shortens connection setup. Once ICE candidates are successfully exchanged between the two peers, the connection is finally established.
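
The candidate exchange could be wired up roughly as follows. This is a sketch under assumptions: the helper name wireTrickleIce and the iceCandidate message field are hypothetical, and signalingChannel is assumed to behave like the one in the earlier examples (a send() method plus message events).

```javascript
// Wires trickle ICE between a peer connection and a signaling channel.
// Assumptions: signalingChannel has send() and delivers 'message' events whose
// argument carries an iceCandidate field (the field name is hypothetical).
function wireTrickleIce(peerConnection, signalingChannel) {
    // Forward each local candidate as soon as it is discovered.
    peerConnection.addEventListener('icecandidate', event => {
        // A null candidate signals that gathering has finished.
        if (event.candidate) {
            signalingChannel.send({'iceCandidate': event.candidate});
        }
    });

    // Add each remote candidate as it arrives.
    signalingChannel.addEventListener('message', async message => {
        if (message.iceCandidate) {
            try {
                await peerConnection.addIceCandidate(message.iceCandidate);
            } catch (error) {
                console.error('Error adding received ICE candidate.', error);
            }
        }
    });
}
```

Both peers would call this helper once, right after creating their RTCPeerConnection, so that no early candidates are missed.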

Remote streams

Once the connection is set up, we can add each media track to it individually. Locally, we obtain the stream with the getUserMedia() method and then add its tracks to the RTCPeerConnection.

const localStream = await navigator.mediaDevices.getUserMedia({video: true, audio: true});
const peerConnection = new RTCPeerConnection(iceConfig);
localStream.getTracks().forEach(track => {
    peerConnection.addTrack(track, localStream);
});

We can receive the remote streams by registering a listener to the RTCPeerConnection.

const remoteVideo = document.querySelector('#remoteVideo');

peerConnection.addEventListener('track', async (event) => {
    const [remoteStream] = event.streams;
    remoteVideo.srcObject = remoteStream;
});

WebRTC can also transmit arbitrary data over an RTCPeerConnection. Calling the createDataChannel() function returns an RTCDataChannel object, which sends data with its send() method and delivers incoming data through message events, much as a media stream delivers tracks through track events.
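
A minimal sketch of the sending side, assuming a hypothetical helper named openChatChannel; the channel label 'chat' and the greeting text are arbitrary choices:

```javascript
// Opens a data channel and reports incoming messages to onMessage.
function openChatChannel(peerConnection, onMessage) {
    const channel = peerConnection.createDataChannel('chat');
    // send() is only valid once the channel has opened.
    channel.addEventListener('open', () => channel.send('hello'));
    channel.addEventListener('message', event => onMessage(event.data));
    return channel;
}

// On the remote side the channel arrives through the 'datachannel' event:
//   peerConnection.addEventListener('datachannel', event => {
//       event.channel.addEventListener('message', e => console.log(e.data));
//   });
```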

TURN server

Traversal Using Relays around NAT (TURN) is a protocol for relaying network traffic between peers, which is needed because a direct connection is often not possible. TURN servers are available online, so peers can use them by supplying the correct RTCConfiguration.

const iceConfiguration = {
    iceServers: [
        {
            urls: 'turn:turn-server.bunny.net:12345',
            username: 'bunny',
            credential: 'auth-token'
        }
    ]
}

const peerConnection = new RTCPeerConnection(iceConfiguration);
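
In practice a configuration often lists a STUN server alongside the TURN server, so that ICE can prefer a direct path and fall back to relaying. A sketch, with a hypothetical helper name and placeholder server details:

```javascript
// Builds an RTCConfiguration with a STUN entry and, optionally, a TURN entry.
function buildIceConfiguration(stunUrl, turn) {
    const iceServers = [{urls: stunUrl}];
    if (turn) {
        // TURN servers require a username and a credential.
        iceServers.push({
            urls: turn.url,
            username: turn.username,
            credential: turn.credential
        });
    }
    return {iceServers};
}

// Usage in a browser (addresses and credentials are placeholders):
//   const config = buildIceConfiguration('stun:stun.l.google.com:19302', {
//       url: 'turn:turn-server.bunny.net:12345',
//       username: 'bunny',
//       credential: 'auth-token'
//   });
//   const peerConnection = new RTCPeerConnection(config);
```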

WebRTC support and use cases

WebRTC is supported by all the major browsers on desktop and mobile operating systems, as follows:

  • Desktop:
    • Microsoft Edge (12+)
    • Google Chrome (28+)
    • Mozilla Firefox (22+)
    • Safari (11+)
    • Opera (18+)
    • Vivaldi (1.9+)
    • Brave
  • Android:
    • Google Chrome (28+)
    • Mozilla Firefox (24+)
    • Opera Mobile (12+)
  • Chrome OS
  • iOS (11+)
  • Tizen (3.0+)
  • GStreamer provides a free WebRTC implementation.

WebRTC supports the following video and audio codecs:

  • Video:
    • AVC/H.264
    • VP8
    • VP9
  • Audio:
    • Opus
    • G.711 PCM (A-law and µ-law)
    • G.722
    • iLBC
    • iSAC

WebRTC allows browsers to stream files directly from one browser to another, without the need for server-side hosting. WebRTC is also used by WebTorrent to enable peer-to-peer sharing using the BitTorrent protocol in the browser. Similarly, other file-sharing services use it to send files directly between two peers. However, both peers have to be online at the time of the transmission. Content Delivery Networks also use WebRTC to transmit data between peers, so that each peer can act as an edge server, reducing the load on the network and their servers. WebRTC can also be used by applications besides browsers: for example, on IoT and mobile devices.
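
Sending a file over a data channel usually means splitting it into pieces first, since very large messages are not handled uniformly across implementations. A sketch, assuming hypothetical helper names and an arbitrary default chunk size of 16 KiB:

```javascript
// Splits an ArrayBuffer into consecutive chunks of at most chunkSize bytes.
function chunkBuffer(buffer, chunkSize = 16 * 1024) {
    const chunks = [];
    for (let offset = 0; offset < buffer.byteLength; offset += chunkSize) {
        chunks.push(buffer.slice(offset, offset + chunkSize));
    }
    return chunks;
}

// Sends the chunks in order over an open RTCDataChannel-like object and
// returns how many were sent. Flow control (for example, watching
// bufferedAmount) is omitted from this sketch.
function sendInChunks(channel, buffer, chunkSize) {
    const chunks = chunkBuffer(buffer, chunkSize);
    chunks.forEach(chunk => channel.send(chunk));
    return chunks.length;
}
```

The receiving peer would collect the chunks from its message events and reassemble them in arrival order, for example with new Blob(chunks).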

Glossary

Streaming

A method of serving content by breaking it into smaller pieces and sending those pieces in order.

API

Application Programming Interface. A set of coding rules that can be used to let different programs talk to each other.

WebRTC

Web Real-Time Communication. A group of APIs that enable peer-to-peer streams.

P2P

Peer-to-Peer. A type of network connection where user devices act as servers for other devices on the network.