Serving a photo website with Google App Engine

Google App Engine is Google’s ‘platform as a service’ offering, which lets you run your own software on Google’s infrastructure. It has serious horsepower and more-or-less perfect reliability. It's been running this site for a year.

It's been about a year since I redesigned this site and rewrote the entire thing to operate on Google's App Engine. At the time I mentioned that I would report back soon about the experience of using App Engine to host a gallery/blog site. I didn't intend to wait a year, but the hindsight is valuable and I'm in a better position to evaluate the process now that I've lived with it for a while.

What is App Engine?

Have you ever wondered if your internet connection was broken? What did you do? If you are like a lot of people you opened up a web browser and went to Google.com because if Google doesn't load, it couldn't possibly be a problem with Google; it's much more likely that your internet service is down. That's how reliable Google's infrastructure is. It's the web's equivalent of a train that you can set your clock by.

Google has put an unfathomable amount of engineering and money into building this hyper-reliable, globe-spanning infrastructure that can withstand any amount of traffic the world can deliver. The system is also redundant, so even if entire data centers go down, Google keeps ticking. Wouldn't it be nice if you could borrow a few cycles on their servers for your own app or website? Your project is probably so small that they wouldn't even notice.

That's basically what App Engine is. It is a service that lets you run your own projects on Google's infrastructure using a suite of software tools they provide for building web services.

How Does It Work?

Unlike traditional web development, where you FTP HTML files and scripts to a server, App Engine development is done entirely on your local machine, running an app that mimics the server environment. When your code is ready, you deploy it to Google as a package all at once. The development kit is available on Windows, OS X, and Linux. You can choose to develop apps in Python, Go, Java, or PHP. The development kit offers local versions of the various APIs like Google Datastore, so you can test all of your code before deploying it live. This is the way we should be doing web development anyway, but we all fall to the temptation to make live edits when working on our sites (don't we?). You can deploy your app in versions, allowing you to test a new version on the live site before directing real traffic to it. After a year of poking at my site, it has worked well and I haven't had any downtime associated with making live edits and screwing things up.

Why Use App Engine?

Basically because shared hosting sucks and running your own dedicated server is too much work and too expensive. My previous host was well-regarded, had good customer support and was generally reliable as far as hosting goes. But the entire site was still sitting on a single hard drive of a single computer in a single location serving dozens of other sites all sharing the same resources and bandwidth, so the performance ebbed and flowed in ways completely outside my control. The tools offered by App Engine allow you to set up a blazing-fast, never-down website very affordably that can scale to any amount of traffic. And unlike a dedicated server, you don't need to worry about administering the low-level software or about securing the infrastructure. You just write code that performs the functions specific to your site and all the lower level server functions required to run a high performance service are handled by Google, who (I presume) is very good at doing this.

App Engine's tools offer more or less anything you need to run a website — storage, logging, caching, user authentication, etc. For a site like this I only need a simple database, a decent template engine and a way to serve binary files as fast as possible. App Engine offers a few different ways to store data. I am using Cloud Datastore, which is a NoSQL document database. The Python API lets you treat datastore entities like simple objects whose attributes are mapped to the database attributes. This is very easy, flexible, and fast. But it's not a relational database, so it can take a little while to get used to if you are coming to it with an SQL mindset. If you need a relational DB, Cloud SQL (basically a version of MySQL) is available to App Engine apps.

Binary data (primarily JPEGs) is stored and served using Google Cloud Storage (GCS). For a photo-heavy site, GCS offers a killer feature: once an image is in Cloud Storage you can request a serving URL for the image through Google's Images API, which allows you to serve images via Google's high-performance content delivery network. By adding parameters to the URL you can serve the images in different sizes on the fly without worrying about caching or the performance issues normally associated with on-the-fly image manipulation. That means no more wrestling with PIL or ImageMagick: you upload one large image and pull various sizes for thumbnails and mobile devices without writing any additional code.

For example, the images below are all created from one image by changing a parameter at the end of the URL. This lets you do simple image manipulations like resizing in CSS media queries without preparing multiple versions of images. This is wonderful for responsive design.

[Images: four renderings of the same photo, each produced from one serving URL]
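To make the URL trick concrete, here is a minimal sketch of a helper that builds sized URLs. It assumes the serving URL accepts an "=sNN" suffix that sets the longest side in pixels, plus an optional "-c" that crops the image to fit that size, which is how the Images API documents it. The helper name sized_url and the example URL are made up for illustration, and the service caps the maximum size, so check the Images API documentation for the current limit.

```python
def sized_url(serving_url, size, crop=False):
    """Append a size suffix such as '=s400' or '=s400-c' to a serving URL.

    serving_url is assumed to be the value returned by the Images API.
    """
    if size < 0:
        raise ValueError("size must be a non-negative number of pixels")
    suffix = "=s{0}".format(size)
    if crop:
        # '-c' asks the service to crop the image to fit the requested size.
        suffix += "-c"
    return serving_url + suffix


# Hypothetical serving URL, used only for illustration.
base = "https://example.com/photo-serving-url"
thumbnail = sized_url(base, 100, crop=True)
mobile = sized_url(base, 640)
desktop = sized_url(base, 1200)
```

Each of those URLs could go into a different media query or srcset entry, with only the one original upload behind them.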

Additionally, the images are served from edge servers around the world, so it's likely your users will be close to the server sending the largest files on your site. For a photographer's website, serving big images quickly has always been a struggle. Google Cloud Storage takes a lot of that pain away and so far it has been flawless.

Side note: Getting the serving URL from GCS

It's not immediately obvious how to get the serving URL for binary files stored in GCS, so for anyone who ended up here from a search, here's a quick tip. The API is a little convoluted: the URL comes from the Images API, but it requires a key from the Blobstore API rather than the Cloud Storage API. To get it in Python you need to do something like this:

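from google.appengine.ext import blobstore
from google.appengine.api import images
import cloudstorage as gcs  # GCS client library; only needed to read or write the files themselves

# Your app's GCS bucket and any folder structure you have used.
filename = "bucket/folder/filename"
# create_gs_key expects a path of the form /gs/<bucket>/<object>.
blobkey = blobstore.create_gs_key('/gs/' + filename)
# Ask the Images API for the serving URL that belongs to that key.
serving_url = images.get_serving_url(blobkey, secure_url=False)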

App Engine also offers a very nice logging service. If you are used to firing up Perl and parsing the text files produced by Apache, App Engine's logging will take a little adjusting, but you'll learn to love it. App Engine's control console lets you look at and filter the raw logs, but you can also write your own code using the logging API to analyze traffic. Along with the standard access and error information, the logs also track the performance of your app, so it is easy to see how quickly requests are served and what resources your app is consuming.

Why Wouldn't You Want To Use App Engine?

While App Engine provides some really nice tools and great infrastructure, it doesn't really do anything out of the box. You want to serve a static HTML document? Well, then you need to configure the system to do that in a YAML document. You want to do something a little more involved? Then you need to write an HTTP handler to deal with an incoming GET request at a specific URL which you've configured in the above-mentioned YAML document. It's not rocket science, but if you aren't comfortable in a language like Python, Go, or Java, you won't even be able to set up a basic site without a substantial learning curve. Also, code written for App Engine is not very portable. You write code to their specific API and if you decide to move to a new provider, you will likely need to completely rewrite the backend. And since the API is very specific to App Engine, there isn't much software that is going to work off the shelf. This is a service for people who want to write their own backend code in a high-performance, scalable environment, but don't want to deal with installing MySQL and making sure their version of PHP is secure and up-to-date. Even if you are comfortable with Python, you will still probably spend some quality time with the nerds on Stack Overflow trying to figure out the nuances of the platform.
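To give a feel for what "write an HTTP handler" means, here is a minimal sketch using nothing but a plain WSGI callable, which is the interface Python web frameworks on App Engine are built on. This is not the code behind this site, and a real app would more likely use a framework. The /hello path, the response text, and the name app are all assumptions for illustration; the YAML file would also need an entry routing that URL to this callable.

```python
def app(environ, start_response):
    """Answer GET /hello with plain text; everything else gets a 404."""
    method = environ.get("REQUEST_METHOD")
    path = environ.get("PATH_INFO")

    if method == "GET" and path == "/hello":
        status, body = "200 OK", b"Hello from App Engine"
    else:
        status, body = "404 Not Found", b"Not found"

    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    # WSGI expects an iterable of byte strings as the response body.
    return [body]
```

Because the handler is just a function, you can call it directly with a fake request dictionary to test it long before anything is deployed.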

What Does It Cost?

Estimating costs for App Engine seems like black magic, and when I started experimenting with it I really couldn't tell if it was going to be affordable. Seriously, read their pricing docs and try to figure out what you'll spend. It was a total unknown, but since it's easy to set budget caps and monitor costs there's no real risk of accidentally breaking the bank. Google gives apps a free quota and then charges a few cents for every little thing above that quota. This includes bandwidth, datastore space, datastore calls, and the actual runtime of your app. For the first two months I was spending about $1/day to run the site, which isn't bad, but is quite expensive compared to standard hosting. After spending a little time figuring out how to make my code more efficient and effectively cache almost every operation, I was able to whittle that down to $0. So, at least at my current usage and traffic, the site has been operating under the free quota and has cost nothing to run.
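The caching that brought the bill down follows a common pattern: look in the cache first, and only do the expensive work (datastore calls, template rendering) on a miss. Here is a rough sketch of that pattern, not the actual code from this site. SimpleCache is a hypothetical in-memory stand-in so the example runs anywhere; on App Engine a shared cache service would take its place. The names cached and render_gallery and the key format are also invented for illustration.

```python
import time


class SimpleCache(object):
    """In-memory stand-in for a shared cache service (hypothetical)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and time.time() >= expires:
            # The entry is stale, so drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value, ttl=None):
        expires = time.time() + ttl if ttl else None
        self._items[key] = (value, expires)


def cached(cache, key, compute, ttl=3600):
    """Return the cached value for key, calling compute() only on a miss.

    A result of None is treated as a miss and is computed again next time.
    """
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl)
    return value


def render_gallery():
    # Placeholder for the billable work: queries plus template rendering.
    return "<html>gallery</html>"


cache = SimpleCache()
page = cached(cache, "page:/gallery", render_gallery)
```

The first request for a page pays for the queries and rendering; later requests within the time limit are answered from the cache without touching the datastore.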

It doesn't cost anything to download the development kit and start a project on your local machine. If that sounds like fun, you can download it here: https://cloud.google.com/appengine/downloads.