File hosting for indie developers
February 14th, 2009
This question comes up every once in a while on various indie forums: How should I host my video game? What are some good mirror services?
The cool thing about this question is that now there is pretty much one canonical answer: Amazon CloudFront.
There are a few things to take into account for a file host:
- Latency
- Throughput
- Reliability
- Cost
A local server is faster than a distant one. The farther a packet has to travel, the more routers it passes through and the more likely it is to hit a congested link. It is not uncommon for packets to get delayed or dropped completely, and due to the way TCP works, only a certain number of packets will be sent out at a time before the sender waits for acknowledgments, so a longer round trip directly limits download speed.
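To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a single TCP connection with a fixed send window and no packet loss, so the best it can do is one window per round trip. The 64 KiB window and the two round-trip times are made-up illustrative values, not measurements.

```python
def max_tcp_throughput(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound, in bytes per second, for one TCP connection.

    The sender can have at most one window of data in flight per
    round trip, so throughput cannot exceed window / RTT.
    """
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


# Assumed values for illustration only.
WINDOW = 64 * 1024  # 64 KiB send window

nearby = max_tcp_throughput(WINDOW, 0.020)   # 20 ms round trip
distant = max_tcp_throughput(WINDOW, 0.200)  # 200 ms round trip

print(f"nearby server:  {nearby / 1e6:.2f} MB/s at best")
print(f"distant server: {distant / 1e6:.2f} MB/s at best")
```

Under these assumptions, ten times the round-trip time means one tenth of the best-case speed, before any congestion or packet loss is even counted.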
Here's an image that illustrates centralized hosting. Instead of arrows, it might be more useful to illustrate the number of routers a TCP request might have to go through.
So, a possible solution would be to have many servers situated as close to the downloaders as possible. That way, if someone from San Francisco downloads your file, it is just as fast as if it was someone from Japan. Sometimes you will notice companies listing a ridiculous number of mirrors for their large software downloads, organized by location.
This is exactly what Amazon CloudFront does -- but a lot slicker. Amazon has built a number of data centers around the world which connect to various internet backbones. All you have to do is upload your file to Amazon S3 and enable Amazon CloudFront on your S3 bucket (tutorial here).
When you upload your file to Amazon's file server, Amazon will cache it at various edge locations closer to users. That means that Amazon has a number of servers around the world and will transparently mirror your file to the most efficient locations in response to demand. The way it does this is very simple. When a user's DNS server (the server that translates a domain name into a raw IP address) looks up your download's domain, Amazon responds with a different address depending on where that DNS server is located, pointing the user at a nearby edge location. If that edge location has your file, it will serve it up locally. If not, it will get it from the central location, and next time, that local server will be ready.
The end result is that when a user goes to download your file, they are seamlessly downloading it from a local server instead of being routed to a single server however inefficient that might be.
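The "serve it locally if you have it, otherwise fetch it from the central location and keep a copy" behavior can be sketched as a toy simulation. This is not Amazon's actual implementation: the class names, the region names, and the `pick_edge` lookup that stands in for the DNS step are all hypothetical.

```python
class Origin:
    """Stands in for the central location (the S3 bucket in this post)."""

    def __init__(self, files):
        self.files = dict(files)
        self.fetch_count = 0  # how many times an edge had to come back here

    def fetch(self, filename):
        self.fetch_count += 1
        return self.files[filename]


class EdgeLocation:
    """Stands in for one edge server that caches files on demand."""

    def __init__(self, region, origin):
        self.region = region
        self.origin = origin
        self.cache = {}

    def serve(self, filename):
        if filename in self.cache:
            return self.cache[filename], "hit"
        # Cache miss: pull from the central location and remember it.
        data = self.origin.fetch(filename)
        self.cache[filename] = data
        return data, "miss"


def pick_edge(resolver_region, edges, default_region):
    """Hypothetical stand-in for the DNS step: choose an edge by region."""
    return edges.get(resolver_region, edges[default_region])


origin = Origin({"lugaru.zip": b"game data"})
edges = {r: EdgeLocation(r, origin) for r in ("us-west", "japan")}

edge = pick_edge("japan", edges, default_region="us-west")
first = edge.serve("lugaru.zip")   # first request in Japan goes to the origin
second = edge.serve("lugaru.zip")  # later requests are served from the edge
print(edge.region, first[1], second[1], origin.fetch_count)
```

Only the first download in each region pays the cost of the long trip to the central server; everyone after that gets the local copy.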
Here's an image that illustrates a content delivery network. Each request is routed much more efficiently than if there is a single, centralized server.
But what about the cost? Traditionally, fancy content delivery networks were prohibitively expensive. These days, "software as a service" is the new fad, and with Amazon Web Services, you pay for exactly what you use and nothing more. Given the rise in popularity of streaming video sites like Vimeo (but don't use them), bandwidth is dirt cheap now. One HD video view is about the size of 10 downloads of Lugaru, for instance.
I believe our bandwidth bill comes out to about one penny per download of Lugaru. It's so cheap, it might as well be free. So feel free to download Lugaru or one of David's other games and test the download speed for yourself. ;)
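If you want to estimate this for your own game, the arithmetic is just file size times the price per gigabyte transferred. The numbers below are placeholders I picked for illustration, not Lugaru's actual size or Amazon's actual rates, so plug in your own file size and the current price from Amazon's pricing page.

```python
def cost_per_download(file_size_mb: float, price_per_gb: float) -> float:
    """Bandwidth cost in dollars for one download of a file.

    Treats 1 GB as 1000 MB and ignores per-request fees and storage.
    """
    return (file_size_mb / 1000.0) * price_per_gb


# Placeholder inputs: a 50 MB download at $0.20 per GB transferred.
per_download = cost_per_download(file_size_mb=50, price_per_gb=0.20)
per_thousand = per_download * 1000

print(f"${per_download:.3f} per download")
print(f"${per_thousand:.2f} per 1,000 downloads")
```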
What do you use for file hosting? If there is any interest, next week I will talk about Google App Engine, and how to give your website the same treatment that CloudFront gives to static files for free.