Hosting ParadiseSS13
With the old host stepping down around April 2021, I offered to take up the mantle. Hosting a game server system for thousands of players around the world is not easy, but I took it on anyway, because no one else was able to.
To summarise the infrastructure, we have:
- 18 servers, a mix of physical and virtual machines, spanning 10 datacentres across 3 continents (North America, Europe and Australia) and 4 providers (ReliableSite, Oracle Cloud, Hetzner, OVH)
- A mix of custom-written and industry-standard software, including but not limited to:
- Proxmox VE Hypervisor
- pfSense Router
- MariaDB
- Redis
- Apache webserver
- MediaWiki
- Invision Community
- tgstation-server
- The full Elasticsearch/Logstash/Kibana stack
- GitLab
- PRTG Network Monitor (network map)
- HAproxy
The main hypervisor is located in New York to aid connectivity for North America, and runs most of the core systems across several VMs. These are:
- The router VM (pfSense)
  - Provides a firewall, NAT for the rest of the VMs, and an OpenVPN server for management tasks to remove the need to have management ports exposed.
- The core VM (Ubuntu Server)
  - Runs the database, Redis server, and several pieces of custom-written internal tooling, and is the centralised data hub for the rest of the infrastructure.
- The webserver VM (Ubuntu Server)
  - Runs the webserver, and that's it.
  - This is its own VM for security reasons, as webapps are a large attack vector.
- The game VM (Windows Server Core)
  - Runs the gameserver, and that's it.
  - This is in its own VM due to a Windows requirement, and to keep it separated for maximum performance.
- The analytics VM (Ubuntu Server)
  - Runs the Elastic stack for analysis of logs and metrics.
  - This is in its own VM so I can delegate control to someone with Elastic certifications, and put a hard limit on Elasticsearch storage.
- The GitLab VM (Ubuntu Server)
  - Runs GitLab, and that's it.
  - This is its own VM for security reasons relating to the GitLab runner, and for resource confinement, as GitLab likes to consume a lot of resources.
- The monitoring VM (Windows Server Core)
  - Runs the PRTG web service and PRTG probe.
  - This is in its own VM due to a Windows requirement, and to avoid bogging down the game VM.
- The stats frontend VM (Ubuntu Server)
  - Runs a custom stats page for game stats.
  - This is in its own VM so I can delegate it to the stats page developer, making Python module management and CD easier.
The other main server here is the offsite backup, located in Germany for maximum geo-resilience. It takes daily snapshots of core directories (SQL backups, game logs, webserver files), allowing for point-in-time history. The backups are done on a pull basis rather than a push basis: the backup server has read access to the servers it backs up from, rather than those servers having write access to the backup server. This way, if a primary server is compromised, whether over the network or via an attack on the provider itself (the backup server is with a separate provider), there is no way to wipe out the backup data, as the compromised machine cannot even see it, let alone perform write or delete operations on it.
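As a rough illustration of the pull model, here is a minimal sketch of a daily snapshot job that could run from the backup server's cron, assuming rsync over SSH with hardlinked snapshots for history. The hostnames, paths, and the read-only backup user are hypothetical, not the production setup:

```python
#!/usr/bin/env python3
"""Pull-based daily snapshots, run from the backup server (sketch only)."""
import datetime
import pathlib
import subprocess

# Hypothetical sources; the backup server holds a read-only SSH key for
# each, so the source machines never gain any access to the backup host.
SOURCES = {
    "sql":  "backup-ro@core.example.org:/srv/backups/sql/",
    "logs": "backup-ro@game.example.org:/srv/game/logs/",
    "web":  "backup-ro@web.example.org:/var/www/",
}
BACKUP_ROOT = pathlib.Path("/srv/offsite-backups")

def snapshot(name: str, remote: str) -> None:
    dest = BACKUP_ROOT / name / datetime.date.today().isoformat()
    latest = BACKUP_ROOT / name / "latest"
    dest.parent.mkdir(parents=True, exist_ok=True)
    cmd = ["rsync", "-a", "--delete"]
    # Hardlink unchanged files against the previous snapshot, so every
    # daily directory is a full tree but only changed files use new space.
    if latest.exists():
        cmd.append(f"--link-dest={latest.resolve()}")
    cmd += [remote, str(dest)]
    subprocess.run(cmd, check=True)
    # Repoint "latest" at the snapshot we just took.
    if latest.is_symlink():
        latest.unlink()
    latest.symlink_to(dest)

if __name__ == "__main__":
    for name, remote in SOURCES.items():
        snapshot(name, remote)
```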
The remaining servers in each region are proxy nodes. As well as the main server in New York taking incoming connections directly, there are relays situated in the following regions:
- US-West (California)
- US-East (Virginia)
- UK (London)
- EU-West (France)
- EU-Central (Poland)
- Australia (Sydney)
This relay system could be accomplished with existing cloud PaaS offerings such as AWS Global Accelerator; however, that would cost ~$400 USD a month with the traffic we shift, while this VM solution costs <$30 USD a month.
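Conceptually, each relay is just a dumb TCP forwarder pointed at the main server in New York (presumably the role the HAProxy instances fill here). As a minimal sketch of the idea, not the production configuration, here is an asyncio-based forwarder; the upstream host and ports are hypothetical:

```python
import asyncio

UPSTREAM_HOST = "game.example.org"  # hypothetical main server address
UPSTREAM_PORT = 6666                # hypothetical game port
LISTEN_PORT = 6666

async def pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    # Copy bytes in one direction until EOF, then close the write side
    # so the peer sees the disconnect too.
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()

async def handle_client(client_reader, client_writer):
    # Splice the client's connection through to the main game server.
    upstream_reader, upstream_writer = await asyncio.open_connection(
        UPSTREAM_HOST, UPSTREAM_PORT)
    await asyncio.gather(
        pipe(client_reader, upstream_writer),
        pipe(upstream_reader, client_writer),
        return_exceptions=True)

async def main():
    server = await asyncio.start_server(handle_client, "0.0.0.0", LISTEN_PORT)
    async with server:
        await server.serve_forever()

if __name__ == "__main__":
    asyncio.run(main())
```

Players connect to whichever relay is closest to them, and the relay carries their traffic on to the main server in New York.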