What people do to break WebRTC call quality
If you are reading this blog I’m pretty sure you have already experienced some WebRTC calls.
And how was the quality of the audio? And the video?
Typically the answer to these questions is somewhere between pretty good and great.
When you dig further into the exact call scenario you discover that the call was within an island – Hangouts, Facebook Messenger…
Things get more complex when a call needs to exit the protected WebRTC island.
Time to discover why.
Voice quality is determined by…
I’m talking about voice quality because even in a video call, voice quality is priority #1. Video can freeze a bit or get pixelated for a short while; as long as you hear the other side well, a few glitches in the video image are still acceptable.
There are several factors that determine voice quality. First and foremost is the network itself – delay, burstiness, congestion, and packet loss.
There are some things an enterprise or hosted UC provider can do about the network. It boils down to things like SD-WAN, DIA (Dedicated Internet Access) and MPLS. But in the scenarios covered below, one can’t assume all users are calling from within a managed network, so these options are irrelevant for this discussion.
Moreover, the access itself, e.g. WiFi, has a great impact on these network impairments, and none of the technologies mentioned above handle the device’s access to the network.
We are left to live with the network as it is.
The place where bad network conditions can be mitigated, so that they have less impact on call quality, is in the codecs themselves and the algorithms on top of them.
WebRTC does a great job in both. It uses Opus, which is the best option one can ask for when sending voice over the Internet, and it has good algorithms on top: echo cancellation, packet loss concealment, FEC (Forward Error Correction), adaptive jitter buffer…
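As a rough illustration of one of these mechanisms: Opus in-band FEC is signaled in SDP through the `useinbandfec=1` format parameter on the Opus `a=fmtp` line. The TypeScript sketch below makes sure that parameter is present in an SDP blob. The function name `enableOpusInbandFec` is made up for this example, and the code assumes a single Opus `a=rtpmap` entry. Chrome, for one, already offers this parameter by default, so treat this as a way to see where the setting lives rather than something every application needs to do.

```typescript
// Sketch: make sure the Opus fmtp line in an SDP blob requests in-band FEC.
// Assumes one Opus rtpmap entry; the function name is hypothetical.
function enableOpusInbandFec(sdp: string): string {
  const eol = sdp.indexOf("\r\n") !== -1 ? "\r\n" : "\n";
  const lines = sdp.split(eol);

  // Find the dynamic payload type mapped to Opus, e.g. "a=rtpmap:111 opus/48000/2".
  let opusPt: string | null = null;
  for (const line of lines) {
    const m = /^a=rtpmap:(\d+) opus\/48000/i.exec(line);
    if (m) {
      opusPt = m[1] ?? null;
      break;
    }
  }
  if (opusPt === null) return sdp; // Opus is not offered, nothing to change

  const rtpmapPrefix = `a=rtpmap:${opusPt} `;
  const fmtpPrefix = `a=fmtp:${opusPt} `;
  const out: string[] = [];
  let sawFmtp = false;
  for (const line of lines) {
    if (!line.startsWith(fmtpPrefix)) {
      out.push(line);
      continue;
    }
    sawFmtp = true;
    // Keep every existing parameter except a previous useinbandfec value.
    const params = line
      .slice(fmtpPrefix.length)
      .split(";")
      .map((p) => p.trim())
      .filter((p) => p.length > 0 && !p.startsWith("useinbandfec="));
    out.push(fmtpPrefix + [...params, "useinbandfec=1"].join(";"));
  }
  if (!sawFmtp) {
    // No fmtp line for Opus yet: add one right after its rtpmap line.
    const idx = out.findIndex((l) => l.startsWith(rtpmapPrefix));
    out.splice(idx + 1, 0, fmtpPrefix + "useinbandfec=1");
  }
  return out.join(eol);
}
```

Running the function on its own output leaves the SDP unchanged, and an SDP that does not offer Opus is returned as-is.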
Given all this, typically WebRTC call quality is really great.
Problems start when exiting the safe haven of WebRTC.
There are many voice codecs out there. WebRTC uses Opus and G.711.
G.711 is there for interoperability. When nothing else works you are left with G.711, but that’s not a good place to be in.
There needs to be a really good reason not to use Opus (I can’t think of one). Here is why.
Opus gives the best quality for any given bitrate. The diagram below demonstrates that.
You can read more about codec comparison and actual listening tests done on the Opus official site.
Opus has a few important benefits:
- Support for a wide bitrate range and varying sample rates all the way up to fullband (48 kHz)
- Support for both constant and variable bitrate
- Support for a wide frame size range (how much “audio time” is contained in a frame)
- On-the-fly decisions about bitrate, bandwidth used and frame size
- Good resiliency for packet loss and packet loss concealment
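To put rough numbers on the bitrate point, here is a small TypeScript sketch that compares what G.711 and Opus put on the wire once per-packet headers are counted. The 24 kbit/s Opus rate and the 20 ms frame size are illustrative assumptions, not recommendations, and the header count covers only IPv4, UDP and RTP (no SRTP authentication tag, no TURN relay overhead).

```typescript
// Per-packet header overhead in bytes: IPv4 (20) + UDP (8) + RTP (12).
// SRTP's authentication tag, IPv6 and TURN overhead are ignored in this sketch.
const HEADER_BYTES = 20 + 8 + 12;

// On-the-wire bitrate (bits per second) for a codec payload bitrate and frame size.
function wireBitrate(codecBps: number, frameMs: number): number {
  const packetsPerSecond = 1000 / frameMs;
  const payloadBytes = codecBps / 8 / packetsPerSecond;
  return (payloadBytes + HEADER_BYTES) * 8 * packetsPerSecond;
}

const g711Wire = wireBitrate(64000, 20); // G.711 is fixed at 64 kbit/s
const opusWire = wireBitrate(24000, 20); // assumed Opus rate, for illustration only
console.log(g711Wire, opusWire); // 80000 40000
```

Under these assumptions G.711 needs twice the bandwidth of Opus, and unlike Opus it cannot lower its rate when the network gets congested.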
So why on earth would one choose to use G.711 instead of Opus?
Let’s look at a deployment example to understand why.
Contact Center use case
The common use case of a contact center is that a user calls into a contact center where the agent is located in the contact center premises or at home.
In the WebRTC case, the user calls from the enterprise website.
The contact center system itself is a heavy investment and it is mission critical. Updating its software doesn’t happen that often, not to mention replacing it altogether.
So in many cases we are left with a traditional contact center and agents using IP Phones that run codecs such as G.729 or G.711.
These contact centers were built in the days when people made calls from a phone, not from the Web. These calls are PSTN, so there is no real point in using something other than G.711 anyway.
When a call to the contact center originates from the Web, and given the way the contact center currently works, there are the following options:
Make changes in the contact center to support Opus all the way
This means moving the agent to a Web browser instead of a phone, or changing the agent’s phone to one that supports Opus. Additionally, since the contact center system handles the voice for recording, announcements and monitoring, changes are required in the contact center system itself.
Hard work, cost and risk of change.
Opus from browser to the contact center, then transcode
In this case, the critical part of the communication, the leg over the Internet, is done using Opus. Then, when voice enters the contact center, it is transcoded to G.711/G.729. With this, no real change is needed on the contact center side.
Transcoding Opus is CPU intensive, so it has cost implications and adds delay.
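On the browser side, keeping the Internet leg on Opus is largely a matter of codec ordering during negotiation. The sketch below is a minimal TypeScript helper, with the hypothetical name `preferOpus`, that moves Opus to the front of a codec list while leaving everything else in place. `CodecLike` is a stand-in for the codec objects a browser returns, reduced to the one field the helper needs.

```typescript
// Minimal shape of a codec entry. Browser codec capability objects carry
// more fields; this sketch only looks at mimeType.
interface CodecLike {
  mimeType: string;
}

// Return a copy of the list with Opus first, keeping the relative order
// of all other codecs (G.711 stays available as a fallback).
function preferOpus<T extends CodecLike>(codecs: readonly T[]): T[] {
  const isOpus = (c: T): boolean => c.mimeType.toLowerCase() === "audio/opus";
  return [...codecs.filter(isOpus), ...codecs.filter((c) => !isOpus(c))];
}

// In a browser, with `transceiver` being an audio RTCRtpTransceiver
// (a hypothetical variable in this sketch), usage could look like:
//
//   const caps = RTCRtpReceiver.getCapabilities("audio");
//   if (caps) transceiver.setCodecPreferences(preferOpus(caps.codecs));
```

The gateway still has to accept Opus in its answer for this to have any effect; the ordering only expresses what the browser would like to use.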
G.711 all the way, no transcoding
In this option the change on the contact center side is minimal. All that is required is a WebRTC gateway (GW) that terminates WebSockets and signaling, and handles the few things that are unique to WebRTC.
The hard part would be terminating media encryption, but there is no real way around it, assuming it can’t be terminated on the agent client side.
In this option there are still a few hard, CPU-hungry tasks, but this is the bare minimum.
Unfortunately, there are many who default to this option. Hence, voice quality is not at its best in such cases.
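One way to notice that a deployment has quietly ended up in this option is to look at which codec comes first in the SDP answer from the gateway. The TypeScript sketch below does that with plain string parsing. The function names are made up for this example, and it assumes payload type numbers are not reused across media sections of the same SDP.

```typescript
// Static RTP payload types for G.711 (RFC 3551): 0 = PCMU (mu-law), 8 = PCMA (A-law).
const STATIC_AUDIO_CODECS: Record<string, string> = { "0": "PCMU", "8": "PCMA" };

// Return the name of the first (most preferred) audio codec in an SDP, or null.
function firstAudioCodec(sdp: string): string | null {
  const lines = sdp.split(/\r?\n/);
  const mLine = lines.find((l) => l.startsWith("m=audio "));
  if (mLine === undefined) return null;

  // Format: m=audio <port> <proto> <pt> <pt> ...
  const firstPt = mLine.split(" ")[3];
  if (firstPt === undefined) return null;

  // Prefer an explicit rtpmap entry; fall back to the static G.711 numbers.
  for (const l of lines) {
    const m = /^a=rtpmap:(\d+) ([^/]+)\//.exec(l);
    if (m && m[1] === firstPt) return m[2] ?? null;
  }
  return STATIC_AUDIO_CODECS[firstPt] ?? null;
}

// True when the negotiated first choice is one of the two G.711 variants.
function isG711(codec: string | null): boolean {
  if (codec === null) return false;
  const name = codec.toUpperCase();
  return name === "PCMU" || name === "PCMA";
}
```

An application could log or alert when `isG711(firstAudioCodec(answerSdp))` is true for calls that cross the public Internet, where `answerSdp` is whatever answer string the signaling layer received.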
You can go ahead and plug these conditions and reasons into other use cases, such as enterprise conferencing with guests joining from the Web.
Same excuses, same issues.
The result is that in such cases, when the network is not at its best, WebRTC call quality isn’t either.
WebRTC is more than just Opus and call quality
This post focused on voice quality, but clearly Opus is not the only advantage of WebRTC. Anyone who looks at WebRTC as just some kind of Web Phone misses the opportunity to leverage several great capabilities WebRTC can bring. Examples:
- Contextual communication – Understand where on your website the user was when they initiated the call, what they did before that, what interested them and what problems they had
- Video – Allow the user to see the agent or have a two-way video call to improve the communication experience and make the call more personal. Customers don’t get as angry at people (agents) they can see as they do at an anonymous person at the other end of the line
- Better service – Share your screen to improve sales or support, see the user’s screen to quickly solve their problem, send them a link to a Web page for more information… Endless ways to improve customer service
What to do and what not to do when adopting WebRTC
The best approach would be to do everything possible in order to enjoy the benefits of WebRTC. Don’t look at it as if you are plugging a new Web Phone interface into your contact center. Start with the endless opportunities WebRTC brings to enhance the service/application you offer today.
There are always limitations and phases when adopting new technology. It is OK not to be able to do a complete forklift and rebuild your service just because WebRTC showed up.
Even if this is the case, there are several options for adopting WebRTC. Make sure you see the full picture and make the right choices. If you have questions or want to further discuss your specific needs, I’ll be happy to help with that.
In any case, if you end up thinking of doing G.711 over the Internet, think again.
Kris Hopkins says
Amir: Great points. Hard to believe, but we’ve actually seen cases in the field where G.711 performs better over the Internet than Opus when going into the call center.
I agree Opus would be preferred, but latency from transcoding can be very noticeable to agents who spend their whole day on the phone. Secondly, on some tablets, mobile devices and older laptops, Opus is so CPU intensive that other application processes will cause audio disruption as they compete for CPU. G.711 is so computationally simple that it can be advantageous, especially if calls stay within a geographic region (e.g. North America). More contact centers need Opus-ready phones, but change in the contact center is a slow process.
Testing is a must and not every customer environment is the same, nor is any application. Real world experience pays big dividends here.
Amir Zmora says
Thanks for your comment. A few points to note in this regard:
It would be better not to do transcoding and go Opus all the way. When this is not possible, the delay introduced by transcoding will have less impact on call quality than running G.711 over the Internet leg. Additionally, the delay will be much more manageable and predictable, whereas G.711 will cause significant audio issues in some cases and be OK in others. You can of course decide in your application to switch from G.711 to Opus once call quality or the network becomes an issue.
Transcoding will happen on the server side and not on the PC. It can be done in software or in hardware.
Again, best will be to do no transcoding and go Opus all the way.