Monthly Archives: January 2020

Moving Away from AWS

When Amazon began their project to act as the world’s online all-in-one shop, they knew they’d need to build one hell of a data center operation to cope with the demand. And they did it. Quite brilliantly, in fact.

Then they realized that by building worldwide data centers sufficient to cope with the worst of the peak demand (think: Christmas), they’d inevitably be overbuilding, leaving 95% of the capacity free most of the time. Why not do something with all that excess data center capacity…like, rent it out to other folks?

That, so the legend goes, was the genesis for what became known as Amazon Web Services (AWS), which has now grown to encompass countless services and computers spread across numerous connected data centers around the world. Their services now power everything from e-commerce to Dropbox to the Department of Defense. Indeed, if AWS ever does suffer one of their very-rare outages (the last I recall was a brief outage affecting their Virginia data center a year or so ago), it brings down significant parts of the internet.

We became a customer of AWS almost a decade ago, to help us serve up the installer images and picture disks in ComicBase over their “S3” (“Simple Storage Service”) platform. Then, when I made the decision to move my family to Nashville and we had to split the IT operations in our California office, we decided to move our rack of web, database, and email servers up to Amazon’s cloud. AWS promised to let us spin up virtual servers and databases–essentially renting time on their hardware–and assign as many or as few resources as it took to get the job done.

It took us about a month to get the move done, and it was terrifying when we turned off the power to our local server rack (it felt like we were shutting down the business). But to our great relief, we were able to walk over to a computer in our office outside our now-silent server room, fire up a web browser, go to, and see everything working just the way it should, hosted by Amazon’s extraordinary EC2 (“Elastic Compute Cloud”) and RDS (“Relational Database Service”). After a few weeks of making sure all was well, my family and I packed ourselves into a car and drove to Nashville, and the business carried on the entire time. We were living in the future.

So why, 3 years later, did I just spend the better part of a month moving all our infrastructure back down to our own servers again? Basically, it came down to cost, speed, and the ability to grow.

Bandwidth and Storage Costs
S3 (storing files up on Amazon’s virtual drives) is pretty cheap; what isn’t cheap is the bandwidth required to serve them up. If you download a full set of Archive Edition installers, for instance, it costs us a couple of bucks in bandwidth alone. Multiply by thousands, and things start adding up. The real killer, however, was the massive amount of web traffic caused by the combination of cover downloading and serving up image requests to image-heavy websites like and In a typical month, our data transfer is measured in terabytes–and the bandwidth portion of our Amazon bill had definitely moved into “ouch!” territory.

We were also paying the price for the promise we’d made to give each of our customers 2GB of allocated cloud storage to store database backups. When we were buying the hard drives ourselves, this wasn’t a super expensive proposition. But when we were now renting the space on a monthly basis from Amazon, we wound up effectively paying the price of the physical hardware many times over during the course of a year.
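To make the arithmetic behind these two costs concrete, here is a rough sketch. Every number in it is a hypothetical placeholder rather than a figure from our actual bill: the per-GB rates, the installer size, the download count, and the drive price are all assumptions chosen only to show how quickly the totals grow.

```python
def egress_cost(gigabytes: float, price_per_gb: float) -> float:
    """Dollars spent serving `gigabytes` of downloads at a flat per-GB rate."""
    return gigabytes * price_per_gb


def months_to_match_purchase(purchase_price: float,
                             rented_gb: float,
                             price_per_gb_month: float) -> float:
    """Months of rent after which cumulative rent equals buying the drive outright."""
    monthly_rent = rented_gb * price_per_gb_month
    return purchase_price / monthly_rent


# Hypothetical inputs, for illustration only.
EGRESS_PRICE_PER_GB = 0.09     # assumed flat bandwidth rate, USD per GB
INSTALLER_SET_GB = 25          # assumed size of one full installer set
DOWNLOADS_PER_MONTH = 3000     # assumed number of full-set downloads

STORAGE_PRICE_PER_GB_MONTH = 0.023  # assumed rental rate, USD per GB per month
DRIVE_PRICE = 100.0                 # assumed price of a 4000 GB hard drive
DRIVE_GB = 4000

per_download = egress_cost(INSTALLER_SET_GB, EGRESS_PRICE_PER_GB)
monthly_bandwidth = per_download * DOWNLOADS_PER_MONTH
break_even = months_to_match_purchase(DRIVE_PRICE, DRIVE_GB, STORAGE_PRICE_PER_GB_MONTH)

print(f"bandwidth per download: ${per_download:.2f}")
print(f"bandwidth per month:    ${monthly_bandwidth:,.2f}")
print(f"rent matches the drive price after {break_even:.1f} months")
```

With inputs anywhere in this neighborhood, rented storage overtakes the purchase price of an equivalent drive within a few months, which is the "many times over during the course of a year" effect described above.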

The Need for Speed
Our situation got tougher when we decided to add the ability to have ComicBase Pro and Archive Edition automatically generate reports for mobile use each time users saved a backup to the cloud. This let us give customers the ability to always have their data ready when they viewed their collection on their mobile devices, without needing to remember to save their reports ahead of time. It’s a cool feature–one which I use all the time to view my own collection–but it required a whole new set of constantly-running infrastructure to pull off.

Specifically, we had to create a back-end reporting process (“Jimmy”, after Jimmy Olsen, the intrepid reporter of Superman fame). Jimmy’s job is to watch for new databases that have been backed up, look through them, and generate any requested reports–many for users with tens of thousands of comics in their collections. Just getting all the picture references together to embed into one of these massive reports could take 20 minutes on the virtualized Amazon systems.
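For readers curious what a process like this looks like in outline, here is a minimal sketch of a polling worker. It is not Jimmy's actual code: the `.bak` extension, the directory layout, the function names, and the `generate_report` callback are all hypothetical stand-ins, and a real version would need error handling and persistent tracking of what has been processed.

```python
import time
from pathlib import Path
from typing import Callable, List, Set


def find_new_backups(backup_dir: Path, seen: Set[str]) -> List[Path]:
    """Return backup files that have not been processed yet, oldest first."""
    candidates = [p for p in backup_dir.glob("*.bak") if p.name not in seen]
    return sorted(candidates, key=lambda p: p.stat().st_mtime)


def process_pending(backup_dir: Path,
                    seen: Set[str],
                    generate_report: Callable[[Path], None]) -> int:
    """Generate a report for each new backup and return how many were handled."""
    handled = 0
    for backup in find_new_backups(backup_dir, seen):
        generate_report(backup)   # the slow part: building the report itself
        seen.add(backup.name)     # remember it so it is not processed twice
        handled += 1
    return handled


def run_forever(backup_dir: Path,
                generate_report: Callable[[Path], None],
                poll_seconds: float = 30.0) -> None:
    """Poll the backup directory in a loop, processing whatever has arrived."""
    seen: Set[str] = set()
    while True:
        process_pending(backup_dir, seen, generate_report)
        time.sleep(poll_seconds)
```

The point of the sketch is the shape of the problem: if each call to `generate_report` takes 20 minutes and backups arrive faster than that, the list returned by `find_new_backups` simply keeps growing.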

Even with the “c4.large” compute-oriented server instances we wound up upgrading our Amazon account to, this was a terribly long time, and often left us with dozens of reports backed up awaiting processing. We could of course upgrade to more powerful computing instances, faster IO throughput allocations, etc., but only at an alarming increase in our already considerable monthly spend.

With terabytes of stored data, an escalating bandwidth bill, and all our plans for the future requiring far more resources than we were already using, it was time to start looking for alternatives.

Do it Yourself
When we launched ComicBase 2020 just before this past Halloween, we tried a very brief experiment in at least moving the new download images off Amazon and hosting them on a Dropbox share to save on the bandwidth bill.

The first attempt at this ended less than a day after it was begun, when I awakened to numerous complaints that our download site was offline, and a note from Dropbox letting us know that we’d (very quickly) exceeded a 200 GB/day bandwidth limit we hadn’t ever realized was part of the Dropbox service rules. (I could definitely see their point: they were also paying for S3 storage and AWS bandwidth to power their service–albeit at much lower rates than us, thanks to bulk discounts they get on the astonishing amount of data they move on a daily basis.) Unfortunately, there was no way to buy more bandwidth from Dropbox, so after one more day of “maybe it’s just a fluke since we just launched” thinking–followed a day later by getting cut off by Dropbox again–we abandoned that experiment.

After a couple of days of moving the download images back up to S3 (and gulping as we contemplated the bandwidth bill implications), we wound up installing a new dedicated internet connection without any data caps, and quickly moved a web server to it whose sole purpose was to distribute disk image downloads.

Very quickly, however, we started the work to build custom data servers, based on the fastest hardware on the market, and stuffed full of ultra-fast NVMe SSDs (in a RAID configuration, no less), as well as redundant deep storage, on-premises storage arrays, and off-premises emergency backup storage. All the money for this hardware wound up going on my Amazon Visa card, and ironically, I wound up with a ton of Amazon Rewards points to spend at Christmas time, courtesy of the huge hardware spend.

After that began the work of moving first the database, then the email, web, and FTP servers down to the new hardware. I’ll spare you the horrific details here, but if anyone’s undergoing a similar move and wants tips and/or war stories, feel free to reach out. The whole thing from start to end took about 3 solid weeks, including a set of all-nighters and late-nighters over this past long weekend to do the final switch-over.

As of 2AM this morning, we’ve moved the last of the servers off of Amazon’s cloud, and are once again doing all our business on our own hardware. Just before sitting down to write this, I scared myself silly once more as I shut down the remote computer which had been hosting and on Amazon’s cloud. And once again, I started to breathe normally when I was able to fire up a web browser in the office and see that the sites–and the business–were still running, this time on our own hardware.

So far, things seem like they’re going pretty well. The new hardware is tearing through the reporting tasks in a fraction of the time it used to take; sites are loading dramatically faster; and the only real technical issues we encountered were a few minor permission and site configuration glitches that so far have been quickly resolved.

Unless it all goes horribly pear-shaped in the next few days, I’ll be deleting our Amazon server instances entirely. While I’m definitely appreciating the new speed and flexibility the new servers are giving us (and I’m looking forward to not writing what had become our business’s biggest single check of each month), I still have to hand it to the folks at AWS: you guys do a heck of a job, and you provided a world-class service when we needed you most. I also love that a little Mom-n-Pop shop like ourselves could access a data center operation that would be the envy of the largest corporate environments I’ve ever worked in. With the incredible array of services you now provide, it wouldn’t surprise me in the least if we wound up doing business again in the future.

Attack of the Script Kiddies

For the past few weeks, we’ve been engaged in a big move of our servers back down from the Amazon cloud to on-premises servers. While Amazon runs an amazing service, the bandwidth bill for ComicBase is a killer, and we can afford to throw way more processing power and disk storage at it if we simply buy the hardware than if we rent it from Amazon. By using on-premises hardware, we get to go way faster, way cheaper, and keep more control of our data.

Although I’m quite looking forward to not writing my largest single check of each month to Amazon, running your own gear means running your own data center–with all that entails. Namely, you’re completely responsible for everything from backups to firewalls to even power. (I used to keep a generator and set of power cords at the ready back in California for when our infamous “rolling blackouts” would hit, in order to minimize server downtime.)

On the backup front, we’re actually improving our position, using multiple layers of RAID, traditional disk backups, and off-site cloud storage. Basically, even if the place burns to the ground, we should be able to pick up the pieces and carry on pretty quickly.

What really gets old, however, is dealing with the network security foo. Unless you’ve run a site yourself, it’s hard to believe how fast and how frequently the attacks come on every part of your system, courtesy of our friend the internet.

Mind you, these are not, for the most part, targeted attacks by the sort of ace hackers you see on TV and in movies. Instead, it’s a constant barrage of “script kiddies”: drones and bored teens using automated “hacking” tools to assault virtually every surface of a publicly facing server using the computer equivalent of auto-dialers and brute-force guessing.

Whether it’s the front-facing firewall, web sites, email servers, or what have you, looking at the logs shows that mere hours after the servers went live, they were being perpetually pounded with password-guessing attacks, attempts to relay spam, port scans, etc. None of these stood a chance in hell of succeeding (sorry, kiddies, the password to our admin account is not “password”) but it was amazing to see how quickly “virgin” servers, on new IP addresses, started getting pounded on. In one case, we started seeing automated probes of a server before it had even gone live to our own production team!

All this is to say that it’s a jungle out there, folks. For heaven’s sake, use decent passwords (a good start: don’t let your password be any word that’s in a dictionary), change the default account passwords and user names for all your various networking hardware, don’t re-use passwords from system to system, and look for a good password manager to keep them all straight for yourself (I’m personally partial to 1Password, although I got hip to that program before they switched to a monthly billing model).

And yeah, watch those server logs. Most of the script-kiddie attacks are about as effective as the robocalls which start with a synthesized voice claiming, “HELLO, THIS IS IRS CALLING. YOU ARE LATE IN MAKING PAYMENT.” But we’ve also seen some more sophisticated attacks employing publicly known email addresses, names of company officers and more. Bottom line: watch yourself when you’re on the internet, and realize the scumbags are always looking for targets. Don’t make it easy on them.
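As one way of watching those logs, here is a small sketch that counts failed login attempts per source address. The log line format is an assumption (it follows the "Failed password for ... from ..." wording that OpenSSH writes), and the sample addresses come from ranges reserved for documentation; your own servers' logs will likely need a different pattern.

```python
import re
from collections import Counter
from typing import Iterable, List

# Assumed log format: OpenSSH-style "Failed password for [invalid user] NAME from ADDRESS ..."
FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")


def failed_logins_by_ip(lines: Iterable[str]) -> Counter:
    """Count failed password attempts per source address."""
    counts: Counter = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


def noisy_sources(lines: Iterable[str], threshold: int = 3) -> List[str]:
    """Return addresses with at least `threshold` failed attempts, sorted."""
    counts = failed_logins_by_ip(lines)
    return sorted(ip for ip, n in counts.items() if n >= threshold)


# Made-up sample lines for illustration.
sample = [
    "Failed password for invalid user admin from 203.0.113.5 port 4242 ssh2",
    "Failed password for root from 203.0.113.5 port 4243 ssh2",
    "Failed password for invalid user test from 203.0.113.5 port 4244 ssh2",
    "Failed password for root from 198.51.100.7 port 22 ssh2",
    "Accepted password for pete from 192.0.2.10 port 22 ssh2",
]
print(noisy_sources(sample))
```

Even a crude tally like this makes the pattern obvious: a handful of addresses account for the bulk of the guessing, and those are the ones worth blocking at the firewall.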