AWS SDK for Java modified to run on Google App Engine

17 Nov 2010

I've been building a small experimental app on Google App Engine. At some point in the summer the app outgrew the sensible but onerous restrictions of the GAE sandbox. Specifically, I wanted to generate images larger than 1MB, and there is simply no way to do this from inside a GAE app. This was quite a major speed bump.

Many months later, and I have the app managing a pool of AWS EC2 instances to do its image generation work. The GAE app uses the AWS SDK for Java to start and stop EC2 instances, and to publish to an SQS queue. It's working well.

However, getting to this point wasn't easy, because the official AWS SDK doesn't work in GAE. That's because another of the GAE restrictions prevents applications from opening their own sockets. All network activity should be channelled through Google's UrlFetchService. If you use Java's lame URL API then this chanelling is transparent. But if, as in the AWS case, you are using full featured HTTP Client, some work is required. There are various approaches for supporting Apache HttpClient, the most popular one being to provide it with fake socket-level connections, which parse the protocol emitted by the HttpClient and reconstitute enough information to be able to call UrlFetchService. Noteably, Adrian Petrescu used this approach and forked the official AWS SDK.

I'm very grateful to Adrian for making this fork, bacause it gave me faith hat it was possible to do what I wanted to do. And early on it worked well. But I couldn't get it to call any of the SQS methods; I kept getting invalid signature errors, but using a client outside of GAE I was generating an identical signature and the requests were succeeding - it was baffling. After a mammoth debugging session, including an outing for WireShark, I tracked down the problem: the fake HTTP Connection was throwing away the whole of the path for my HTTP requests so they didn't stand a chance of succeeding; the signature is just the first thing that AWS happens to check. Looking at the source code I found an ominous comment "Other information are just ignored safely". I hacked in a line of code to stop ignoring other information, and miraculously everything started working.

I thought it would be helpful for me to fix this bug properly (with an accompanying test) and submit a patch, but it gradually dawned on me how many similar bugs could lurk in the code. Afterall, this class should really understand the whole of the HTTP protocol. Put off by the scale of the challenge, I decided to try an alternative approach. I looked at the original AWS code, and it seemed pretty well isoalted already from HttpClient.

A couple of hours later and I had my own fork, aws-sdk-for-java-on-gae. This fork completely removes Apache HttpClient and hard-wires UrlFetchService in its place. It turns out that UrlFetchService has a much better API than Apache HttpClient (at least a lot better than the 3.x version that Amazon used) so conversion was pretty straightforward. I removed a fair bit code that was irrelavent because actual connections are opaque when you use UrlFetchService, so it's simpler as well!

It seems to be working fine for me so far, but I'm sure somebody will find a similar bug somewhere, give up fixing it and make their own fork. But in the meantime, give it a try!