Now that things are more or less back to normal, here is a little answer to “What the heck happened?!”
It was a combination of unchecked growth of the site and a very sharp spike in traffic. The short version: at about 1:30 AM my time (PST), the server was getting hammered with traffic, and some local backups and log files, plus all the site content, filled the disk to capacity. With no free space left to write to, the database went nuts!
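As a rough illustration of the kind of safety net that would have caught this, here is a minimal disk-usage check one could run from cron. The 90% threshold and the idea of alerting (rather than auto-deleting backups) are my assumptions, not what Wago actually runs:

```shell
#!/bin/sh
# Hypothetical disk-space watchdog -- threshold and filesystem are illustrative.
THRESHOLD=90

# df --output=pcent prints e.g. " 42%"; strip everything but the digits.
usage=$(df --output=pcent / | tail -1 | tr -dc '0-9')

if [ "$usage" -ge "$THRESHOLD" ]; then
    # In a real setup this would page someone or trigger log/backup pruning.
    echo "WARNING: root filesystem at ${usage}% full" >&2
fi
```

Something this small, running every few minutes, turns “the disk silently filled up at 1:30 AM” into an alert with hours of lead time.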
Wago was running on a single server: all processing, database lookups, and file downloads ran through one machine. This was all fine until ToS opened.
Illidan was right. Wago was not at all prepared for this!
It took some hours of mulling about, restoring, repairing, and doing everything I could to get Wago back up and running again. There was so much traffic trying to get in that the site just crashed again after my initial fixes, because the database couldn’t handle the number of requests.
I had to take more drastic steps and fully embrace the cloud. After rewriting what I needed to, Wago now runs on seven servers: five database servers and two web servers, and I can add more relatively easily as needed (when the next raid is released).
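The post doesn’t say how traffic gets split across the two web servers, but a typical setup puts a load balancer in front of the web tier. A minimal nginx-style sketch, with placeholder addresses and ports that are purely my assumption:

```nginx
# Hypothetical load-balancer config -- addresses/ports are placeholders.
upstream wago_web {
    least_conn;              # route each new request to the least busy server
    server 10.0.0.11:8080;   # web server 1
    server 10.0.0.12:8080;   # web server 2
}

server {
    listen 80;
    location / {
        proxy_pass http://wago_web;
    }
}
```

The nice part of this shape is that “add more as necessary” becomes a one-line change: drop another `server` entry into the `upstream` block and reload.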
Let’s finish by taking a look at some graphs; everyone likes those. This is the spike that caused Wago to crash: