Porting a C++ multiplayer game to the Web with Cheerp, WebRTC and Firebase

Published in leaningtech, Sep 17, 2019

Introduction

At Leaning Technologies, we provide solutions for porting traditional desktop applications to the Web. Our C++ compiler Cheerp generates a combination of WebAssembly and JavaScript, allowing both easy interoperability with the browser and performance.

As a showcase project we decided to port a multiplayer game to the Web, and we opted for Teeworlds. Teeworlds is a retro multiplayer 2D shooting game, with a small but passionate community of players (including me!). It is lightweight in terms of both assets to download and CPU and GPU requirements. It was a perfect candidate.

We decided to use this project to experiment with general solutions for porting networking code to the Web. Usually, the main routes to do so are:

XMLHttpRequest/fetch, if networking consists only of HTTP requests, or
WebSockets.

Both solutions would require hosting the server component server-side, and neither of them allows UDP as the transport protocol. This matters for real-time applications like video conferencing and gaming, because the delivery and ordering guarantees of TCP can get in the way of low latency.

There is a third way of using the network from the browser: WebRTC.

RTCDataChannel supports both reliable and unreliable transmission (the latter tries to use UDP as the transport protocol if possible) and can be used for communicating both with a remote server and between browsers. This means that we can port the whole application to the browser, server component included!

There is some additional complexity involved though: before two WebRTC peers can exchange data, they need to perform a relatively complex connection handshake, which involves multiple third party entities (a signalling server and one or more STUN/TURN servers).

Ideally, we would want to design a network API that uses WebRTC internally but is as close as possible to the connection-less interface of UDP Sockets.

This would allow us to reap the benefits of WebRTC, without exposing the complicated details to the application code (that we want to modify as little as possible with our porting effort).

WebRTC primer

WebRTC is a set of APIs available in browsers for peer-to-peer communication of audio, video and arbitrary data.

The connection between peers is established even in the presence of NAT on either or both sides with the help of STUN and/or TURN servers, through a mechanism called ICE. The peers exchange the ICE information and channel parameters through an SDP offer and answer.

Wow! A lot of acronyms all in one go. Let’s have a very quick review of those terms:

Session Traversal Utilities for NAT (STUN) is a protocol for traversing NATs and obtaining an (IP, port) pair to communicate directly with a host. If it succeeds, the peers can then communicate with each other by themselves.
Traversal Using Relays around NAT (TURN) is also used for traversing NATs, but it does so by relaying the data through a proxy visible by both peers. It adds latency and it is more expensive to run compared to STUN (because it is used for the whole duration of the connection), but sometimes it is the only option.
Interactive Connectivity Establishment (ICE) is used to decide the best possible method to connect two peers, given the information obtained through direct communication between the peers, and the one obtained by any number of STUN and TURN servers.
Session Description Protocol (SDP) is a format for describing the parameters of the communication channel, like the ICE candidates, the media codecs (in case of an audio/video channel), etc… One of the peers will send an SDP Offer, and the other will reply with an SDP Answer. After that, the channel is established.

In order to achieve the connection, the peers need to gather the ICE information they receive from the STUN and TURN servers and exchange it with each other.

The problem is that they have no way of communicating directly yet, so there needs to be an out-of-band mechanism for exchanging this data: a signalling server.

The signalling server can be very simple since its only job is to relay data between the peers in the handshake phase (as you can see in the following diagram).

Simplified sequence diagram of a WebRTC handshake
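
To make the "very simple" relay role concrete, here is a minimal in-memory sketch of a signalling mailbox. It is not the mechanism used in this project (we use Firebase, as described later); the names SignallingRelay, SignalMessage, post and poll are hypothetical, and the payload is treated as an opaque string standing in for an SDP offer, an SDP answer or an ICE candidate.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <utility>

// Hypothetical message type: the relay never looks inside the payload.
struct SignalMessage {
    std::string from;     // id of the sending peer
    std::string payload;  // opaque blob (SDP offer/answer, ICE candidate, ...)
};

// In-memory stand-in for a signalling server: one mailbox per peer id.
class SignallingRelay {
public:
    // Store a message for `to` until that peer asks for it.
    void post(const std::string& to, SignalMessage msg) {
        assert(!to.empty());
        mailboxes_[to].push_back(std::move(msg));
    }

    // Pop the oldest pending message for `peer`.
    // Returns false when the mailbox is empty.
    bool poll(const std::string& peer, SignalMessage& out) {
        auto it = mailboxes_.find(peer);
        if (it == mailboxes_.end() || it->second.empty())
            return false;
        out = std::move(it->second.front());
        it->second.pop_front();
        return true;
    }

private:
    std::map<std::string, std::deque<SignalMessage>> mailboxes_;
};
```

In this sketch a client would post its offer to the server's mailbox, the server would poll it and post its answer back to the client's mailbox, and from then on the relay is no longer needed.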

Teeworlds networking overview

The networking architecture of Teeworlds is straightforward:

The server and client components are two separate programs.
Clients join a game by connecting to one of several servers, each hosting a single game at a time.
All communication in a game passes through the server.
A special master server is used to collect a list of all public servers, which are shown in the game client.

By using WebRTC for communication, we can move the server component of the game to the browser, just like the client. This gives us a nice opportunity...

Going Serverless

Having no server-side logic has a nice advantage: we can deploy the whole application as static content on GitHub Pages, or on our own hardware behind Cloudflare, to ensure fast downloads and high uptime for free. We can basically forget about it, and if by any chance it becomes popular, we don’t need to upgrade our infrastructure.

We still need to use some external infrastructure in order to make it work though:

One or more STUN servers: There are several free options here.
At least one TURN server: There are no free options available, so we can either run our own or pay for the service. Hopefully most of the time the connection can be established via the STUN servers (and be truly p2p), but this is needed as a fallback.
A signalling server: Signalling is not standardized, as opposed to the other two. What a signalling server actually needs to do depends somewhat on the application. In our case, we really just need a way to exchange a small amount of data between two peers before the actual connection.
A Teeworlds master server: this is used by the other servers to advertise themselves, and by the clients to find public servers. While not strictly needed (clients can always manually connect to a known server), it is nice to have so that players can enter games with random people.

We decided to use the free Google STUN servers, and we deployed one TURN server ourselves.

For the last two points, we used Firebase:

The Teeworlds master server is implemented very simply as a list of objects containing the info (name, IP, map, mode, …) of each active server. The servers publish and update their own object, and clients pull the entire list to display it to the player. We also show the list as HTML on the homepage, so that players can just click on a server and jump directly into a game.
The signalling is tightly coupled to our socket implementation, explained in the following section.

The server list, in-game and in the homepage
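
As an illustration of the "list of objects" idea, the following sketch models the master server as an in-memory map keyed by each server's unique key. It only stands in for the Firebase data; the names ServerInfo, ServerList, publish, remove and list are hypothetical, and only some of the fields mentioned above are included.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A subset of the fields named in the text; real entries may carry more.
struct ServerInfo {
    std::string name;
    std::string map;
    std::string mode;
};

// In-memory stand-in for the list of active servers.
class ServerList {
public:
    // Create or overwrite the entry owned by the server with this key.
    void publish(const std::string& key, const ServerInfo& info) {
        assert(!key.empty());
        servers_[key] = info;
    }

    // Drop the entry, for example when a server shuts down.
    void remove(const std::string& key) { servers_.erase(key); }

    // Clients pull the entire list, ordered by key.
    std::vector<std::pair<std::string, ServerInfo>> list() const {
        std::vector<std::pair<std::string, ServerInfo>> out(servers_.begin(),
                                                            servers_.end());
        return out;
    }

private:
    std::map<std::string, ServerInfo> servers_;
};
```

Because each server only ever writes under its own key, servers never need to coordinate with each other, and a client needs nothing more than a read of the whole list.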

Socket Implementation

We want to provide an API as close as possible to POSIX UDP sockets, in order to minimize the code changes that we need to apply.

We also want to avoid implementing any underlying details that are not necessary for basic network communication.

We don’t need actual routing for example: all peers are in the same “virtual LAN” associated with a particular Firebase database instance.

It follows that we don’t really need unique IP addresses: we can simply use unique Firebase key values to identify peers (akin to domain names), and every peer locally assigns a “fake” IP address to each key it needs to resolve. This completely removes the need to allocate IP addresses globally, which is a non-trivial task.

This is the minimum API that we need to implement:

// Create and destroy a socket
int socket();
int close(int fd);

// Bind a socket to a port, and publish it on Firebase
int bind(int fd, AddrInfo* addr);

// Send a packet. This lazily creates a WebRTC connection to the
// peer when necessary
int sendto(int fd, uint8_t* buf, int len, const AddrInfo* addr);

// Receive the packets destined to this socket
int recvfrom(int fd, uint8_t* buf, int len, AddrInfo* addr);

// Be notified when new packets arrive
int recvCallback(Callback cb);

// Obtain a local IP address for this peer key
uint32_t resolve(client::String* key);

// Get the peer key for this IP
String* reverseResolve(uint32_t addr);

// Get the local peer key
String* local_key();

// Initialize the library with the given Firebase database and
// WebRTC connection options
void init(client::FirebaseConfig* fb, client::RTCConfiguration* ice);

The API is simple and similar to that of POSIX sockets, but with a few key differences: callback registration, local IP allocation, and lazy connection.

Callback Registration

Even if the original program uses non-blocking I/O, the code needs to be refactored in order to run in a Web browser.

The reason for this is that the event loop of the browser is implicit and hidden from the program (be it in JavaScript or WebAssembly).

While in a native environment one would write code like this:

while(running) {
  select(...); // wait for I/O events
  while(true) {
    int r = recvfrom(...); // try to read
    if (r < 0 && errno == EWOULDBLOCK) // no more data available
      break;
    ...
  }
  ...
}

With an implicit event loop, we must turn it into something like this:

auto cb = []() { // this will be called when new data is available
  while(true) {
    int r = recvfrom(...); // try to read
    if (r < 0 && errno == EWOULDBLOCK) // no more data available
      break;
    ...
  }
  ...
};
recvCallback(cb); // register the callback
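
The callback pattern above can be tried outside the browser with a toy socket. In this sketch, which is not the library's implementation, the hypothetical deliver() method plays the role of the browser event that fires when a datagram arrives; ToySocket and Packet are made-up names, and recvfrom is simplified to take only a buffer and a length.

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

using Packet = std::vector<std::uint8_t>;

// Toy socket: deliver() simulates the hidden event loop handing us data.
class ToySocket {
public:
    // Register the function to call whenever a new packet is queued.
    void recvCallback(std::function<void()> cb) { callback_ = std::move(cb); }

    // Non-blocking read: returns the number of bytes copied, or -1 with
    // errno set to EWOULDBLOCK when no packet is waiting.
    int recvfrom(std::uint8_t* buf, int len) {
        assert(buf != nullptr && len >= 0);
        if (incoming_.empty()) {
            errno = EWOULDBLOCK;
            return -1;
        }
        Packet p = std::move(incoming_.front());
        incoming_.pop_front();
        int n = std::min(static_cast<int>(p.size()), len);
        std::copy(p.begin(), p.begin() + n, buf);
        return n;
    }

    // Queue a packet, then notify the application through its callback.
    void deliver(Packet p) {
        incoming_.push_back(std::move(p));
        if (callback_) callback_();
    }

private:
    std::deque<Packet> incoming_;
    std::function<void()> callback_;
};
```

The registered callback is expected to drain the queue in a loop until recvfrom reports EWOULDBLOCK, exactly as the inner loop of the native version does.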

Local IP Allocation

The identifiers of the nodes in our “network” are not IP addresses, but Firebase keys (they are strings that look like -LmEC50PYZLCiCP-vqde).

This is convenient because we don’t need a mechanism to allocate IPs and make sure that they are unique (and possibly recycled when clients disconnect), but there is often still the need to identify the peers with a numeric value.

The resolve and reverseResolve functions are used exactly for that: the application somehow obtains the key string value (via a user input, or via the master server), and can convert it to an IP address for internal use. The rest of the API also takes this value as parameter for simplicity, instead of the string.

It is similar to a DNS lookup, but just local to a client.

This means that the IP addresses should not be shared between different clients, and if some sort of global ID is needed, it has to be generated by some other means.
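
A purely local key-to-address table of this kind can be sketched with two maps. This is an illustration under stated assumptions rather than the library's code: it uses std::string instead of Cheerp's client::String, the class name LocalResolver is hypothetical, and handing out addresses sequentially from 10.0.0.1 is an arbitrary choice made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Per-client table mapping peer keys to locally allocated "fake" IPs.
class LocalResolver {
public:
    // Return the address for `key`, allocating a new one on first use.
    std::uint32_t resolve(const std::string& key) {
        assert(!key.empty());
        auto it = key_to_ip_.find(key);
        if (it != key_to_ip_.end())
            return it->second;
        std::uint32_t ip = next_ip_++;
        key_to_ip_.emplace(key, ip);
        ip_to_key_.emplace(ip, key);
        return ip;
    }

    // Return the key for `ip`, or nullptr if this client never allocated it.
    const std::string* reverseResolve(std::uint32_t ip) const {
        auto it = ip_to_key_.find(ip);
        return it == ip_to_key_.end() ? nullptr : &it->second;
    }

private:
    // Assumption for this sketch: start at 10.0.0.1 and count upwards.
    std::uint32_t next_ip_ = (10u << 24) | 1u;
    std::unordered_map<std::string, std::uint32_t> key_to_ip_;
    std::unordered_map<std::uint32_t, std::string> ip_to_key_;
};
```

Two clients running this code will generally assign different addresses to the same key, which is why the numeric value is only meaningful inside the client that produced it.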

Lazy Connection

UDP is connection-less, but as we saw WebRTC requires a lengthy connection process before data can be sent between two peers.

If we want to offer the same abstraction (sendto/recvfrom with arbitrary peers without prior connection), we need to lazily perform the connection under the hood.

This is what happens for a typical data exchange between a “server” and a “client” using UDP, and what our library must do instead:

The server calls bind() to tell the operating system that it wishes to receive packets on a given port.

Instead, we publish the open port on Firebase under the server key and listen for events on its subtree.

The server calls recvfrom(), accepting packets on that port arriving from any host.

In our case, we just check the incoming queue of packets destined to this port.

Each port has its own queue, and we prepend the source and destination ports to the WebRTC datagrams, so we know which queue to forward a new packet to when it arrives.

The call is non-blocking, so if no packets are there, we simply return -1 and set errno=EWOULDBLOCK.

The client obtains the server IP and port by some external means, and calls sendto(). This will also do a bind() behind the scenes, so a subsequent recvfrom() will receive the response, without an explicit bind.

In our case, the client will externally obtain a string key, and will use the resolve() function to get an IP address.

Also, this is the point at which we start the WebRTC handshake if the two peers are not already connected to each other. Connections to different ports on the same peer use the same underlying WebRTC DataChannel.

We also do an implicit bind(), so the server can re-establish the connection on its next sendto() in case it is shut down for any reason.

The server is notified of the client connection when the client writes its SDP offer under the server port info on Firebase, and the server replies with its answer there as well.
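
The per-port queues described above can be sketched as follows. The wire format here is an assumption made for the example (a 4-byte header holding the source and destination ports in big-endian order); the actual header layout used by the library is not described in this post, and the names frame, Datagram and PortDemux are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <deque>
#include <map>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Assumed wire format: [src hi, src lo, dst hi, dst lo, payload...].
Bytes frame(std::uint16_t src, std::uint16_t dst, const Bytes& payload) {
    Bytes out;
    out.reserve(4 + payload.size());
    out.push_back(static_cast<std::uint8_t>(src >> 8));
    out.push_back(static_cast<std::uint8_t>(src & 0xFF));
    out.push_back(static_cast<std::uint8_t>(dst >> 8));
    out.push_back(static_cast<std::uint8_t>(dst & 0xFF));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

struct Datagram {
    std::uint16_t src_port = 0;
    Bytes payload;
};

// One incoming queue per bound port.
class PortDemux {
public:
    // Open a queue for `port` (port 0 is treated as invalid in this sketch).
    void bind(std::uint16_t port) {
        assert(port != 0);
        queues_.emplace(port, std::deque<Datagram>{});
    }

    // Called for every message arriving on the data channel.
    // Returns false when the message is malformed or nobody is bound.
    bool onMessage(const Bytes& wire) {
        if (wire.size() < 4) return false;
        auto src = static_cast<std::uint16_t>((wire[0] << 8) | wire[1]);
        auto dst = static_cast<std::uint16_t>((wire[2] << 8) | wire[3]);
        auto it = queues_.find(dst);
        if (it == queues_.end()) return false;
        it->second.push_back(Datagram{src, Bytes(wire.begin() + 4, wire.end())});
        return true;
    }

    // Non-blocking receive: payload size, or -1 with errno = EWOULDBLOCK.
    int recvfrom(std::uint16_t port, Datagram& out) {
        auto it = queues_.find(port);
        if (it == queues_.end() || it->second.empty()) {
            errno = EWOULDBLOCK;
            return -1;
        }
        out = std::move(it->second.front());
        it->second.pop_front();
        return static_cast<int>(out.payload.size());
    }

private:
    std::map<std::uint16_t, std::deque<Datagram>> queues_;
};
```

Carrying the ports inside each datagram is what lets a single data channel between two peers serve any number of sockets on either side.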

The following diagram represents an example of the message flow for the socket setup and the first message sent from client to server:

Full diagram of the connection phase between a client and server
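
The lazy connection logic on the sending side can be sketched as a small state machine per peer. This is only one possible design: the hooks startHandshake and transmit are hypothetical stand-ins for the real Firebase and WebRTC machinery, and buffering packets until the channel opens is a choice made for this sketch (a UDP-like API could just as legitimately drop them).

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <map>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

class LazyConnections {
public:
    // Hypothetical hooks, to be supplied by the surrounding code.
    std::function<void(std::uint32_t peer)> startHandshake;
    std::function<void(std::uint32_t peer, const Bytes& packet)> transmit;

    // UDP-like send: never blocks, starts a handshake on first use.
    int sendto(std::uint32_t peer, const Bytes& packet) {
        assert(startHandshake && transmit);
        Peer& p = peers_[peer];
        if (p.state == State::Connected) {
            transmit(peer, packet);
        } else {
            // Buffer first, so the packet is flushed even if the
            // handshake hook completes synchronously.
            p.pending.push_back(packet);
            if (p.state == State::Idle) {
                p.state = State::Connecting;
                startHandshake(peer);
            }
        }
        return static_cast<int>(packet.size());
    }

    // To be called once the data channel to `peer` is open.
    void onConnected(std::uint32_t peer) {
        Peer& p = peers_[peer];
        p.state = State::Connected;
        for (const Bytes& b : p.pending) transmit(peer, b);
        p.pending.clear();
    }

private:
    enum class State { Idle, Connecting, Connected };
    struct Peer {
        State state = State::Idle;
        std::deque<Bytes> pending;
    };
    std::map<std::uint32_t, Peer> peers_;
};
```

From the caller's point of view, sendto behaves the same whether or not a connection exists; only the latency of the first packets differs.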

Conclusion

If you made it this far, you are probably interested to see the theory in action. The game is playable at teeworlds.leaningtech.com, so give it a try!

A friendly match between colleagues

The code of the networking library is freely available on GitHub. Feel free to jump on our Gitter channel for a chat!