Impressions of Lambda + DynamoDB for Indie Game Level Storage
I recently helped build the backend level storage for an indie game called “Editarrr” (check it out here).
The game lets players build and submit levels for other players to play. Player scores and ratings are tracked per level and stored in the backend as well.
I used this project as an opportunity to try out some AWS technologies (API Gateway, Lambda, DynamoDB) as well as Terraform. Here are my impressions:
API Gateway + Lambda
API Gateway and Lambda are simple, work great and were easy to set up. Best of all: they’re nice & cheap. Our backend was relatively low traffic: at peak we were getting 8k requests / day, but usually more like 1–2k requests / day. Lambda has remained in the free tier and cost $0.00, and API Gateway has cost no more than $0.06 / month. Comparatively, I estimate the smallest EC2 instance would cost us $2.62 / month.
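To put those traffic numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes a 30-day month and only looks at the request-count part of the Lambda free tier (1 million requests / month); the free tier also has a separate compute allowance that this ignores. The function names are just for illustration.

```python
# Rough check of monthly request volume against Lambda's free request allowance.
LAMBDA_FREE_REQUESTS_PER_MONTH = 1_000_000  # request-count part of the free tier
DAYS_PER_MONTH = 30  # assumption: a 30-day month


def monthly_requests(requests_per_day: int, days: int = DAYS_PER_MONTH) -> int:
    """Scale a daily request count up to a month."""
    return requests_per_day * days


def free_tier_share(requests_per_day: int) -> float:
    """Fraction of the free request allowance used at this daily rate."""
    return monthly_requests(requests_per_day) / LAMBDA_FREE_REQUESTS_PER_MONTH


peak = monthly_requests(8_000)     # every day at the peak rate from the post
typical = monthly_requests(2_000)  # every day at the upper "usual" rate

print(f"peak: {peak} requests/month ({free_tier_share(8_000):.0%} of allowance)")
print(f"typical: {typical} requests/month ({free_tier_share(2_000):.0%} of allowance)")
```

Even if every day were a peak day, the request count would stay well under the free allowance, which matches the $0.00 Lambda bill.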
Terraform vs CloudFormation
I had mixed feelings on Terraform. It worked well enough, but I would prefer using CloudFormation in the future. I found it challenging (and ultimately decided to not bother) to get a local dev environment scripted. Specifically, getting API Gateway + Lambda to talk with a local DynamoDB instance was the hardest problem. I attempted a script to cobble it all together. It runs DynamoDB from a Docker container, then manually finds the port to pass to API Gateway & Lambda. It didn’t work very well because I then had to manually keep the DynamoDB setup in sync with the Terraform script. I think if we were using CloudFormation, a local startup script would have been doable.
The other reason I prefer CloudFormation over Terraform is that AWS actually shows you which services are part of your CloudFormation deployment. I don’t think there’s a way to do that with Terraform.
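For the "talk to a local DynamoDB" part specifically, one common approach is to let an environment variable override the endpoint. The sketch below is not the script from this project; `DYNAMODB_ENDPOINT` is a made-up variable name and the default region is an assumption. It only builds the keyword arguments (`region_name` and `endpoint_url`) that you would then pass to `boto3.client("dynamodb", **kwargs)`.

```python
import os
from typing import Mapping, Optional


def dynamodb_client_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for a DynamoDB client.

    If DYNAMODB_ENDPOINT (hypothetical name) is set, point the client at it,
    e.g. a DynamoDB Local container. Otherwise use the normal AWS endpoint.
    """
    env = os.environ if env is None else env
    kwargs = {"region_name": env.get("AWS_REGION", "us-east-1")}  # assumed default
    endpoint = env.get("DYNAMODB_ENDPOINT")
    if endpoint:
        kwargs["endpoint_url"] = endpoint
    return kwargs


# Local dev: DynamoDB Local listens on port 8000 by default.
local = dynamodb_client_kwargs({"DYNAMODB_ENDPOINT": "http://localhost:8000"})
# Deployed: no override, so no endpoint_url key at all.
deployed = dynamodb_client_kwargs({"AWS_REGION": "eu-west-1"})
print(local)
print(deployed)
```

This does nothing for the harder problem described above, which is keeping the local table definitions in sync with what Terraform creates.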
DynamoDB
Warning! Incoming hot-take: I wouldn’t use DynamoDB for most long-term projects with a team.
First off, let’s get this out of the way: I come from a Postgres background. This was my first foray into “NoSQL”. I’m certain my pain was at least partially a result of my SQL brain getting in the way.
A little background about DynamoDB: DynamoDB is a key-value store where each Item (stored object) has a primary “partition key” (PK) and, optionally, a secondary “sort key” (SK). There’s a single-table design philosophy where you keep all of your data in a single table by prefixing the key values with their type (e.g. “LEVEL#123”, “SCORE#456”). The last thing I’ll mention is that you can later add “global secondary indices” (GSIs) that let you query Items by attributes other than the main PK and SK.
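To make the single-table idea concrete, here is a minimal sketch of type-prefixed keys. The item layout (scores stored under their level's partition) and the attribute names are assumptions for illustration, not the schema we actually shipped.

```python
def make_key(entity_type: str, entity_id: str) -> str:
    """Prefix an ID with its entity type, e.g. ("level", "123") -> "LEVEL#123"."""
    return f"{entity_type.upper()}#{entity_id}"


def level_item(level_id: str, name: str) -> dict:
    """A level lives in its own partition; PK and SK are both the level key."""
    key = make_key("level", level_id)
    return {"PK": key, "SK": key, "name": name}


def score_item(level_id: str, score_id: str, value: int) -> dict:
    """A score shares its level's partition and is told apart by the SK prefix."""
    return {
        "PK": make_key("level", level_id),
        "SK": make_key("score", score_id),
        "score": value,
    }


# Both kinds of Item can sit in the same table.
table = [level_item("123", "Pirate Cove"), score_item("123", "456", 9001)]
for item in table:
    print(item["PK"], item["SK"])
```

With this layout, one query on PK = "LEVEL#123" returns the level and all of its scores, and the SK prefix tells you which is which.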
A few DynamoDB design principles I gathered:
- Use as few tables as possible; keep related data together
- Effective and efficient use of DynamoDB means avoiding Scan() and avoiding the use of FilterExpressions that you know will throw away lots of the data read
- Avoid using IDs as keys where you can, since they support only a limited set of queries. But keep in mind that attributes whose values change won’t serve as keys either
- Use composite keys of attributes to guarantee uniqueness (e.g. PK: customerID#productID#countryCode, SK: orderDate)
- Partition Keys should ideally be high cardinality and evenly distributed to avoid hot spots (DynamoDB actually partitions the data based on PKs, so a hot partition doesn’t scale)
- Sort Keys: consider how they can be used for range queries
- Consider if there’s anything hierarchical you could represent with SKs (e.g. country#state#city).
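The composite-key and hierarchical-SK ideas above can be sketched as follows. The helper builds the parameters for a low-level DynamoDB `Query` that uses `begins_with` on the sort key; the table name, key attribute names (`PK`, `SK`) and example values are all hypothetical. Nothing is sent to AWS here; you would pass the result to `boto3.client("dynamodb").query(**params)`.

```python
def hierarchical_sk(*parts: str) -> str:
    """Join hierarchy levels into one sort key, e.g. "US#CA#San Diego"."""
    return "#".join(parts)


def build_prefix_query(table_name: str, pk: str, sk_prefix: str) -> dict:
    """Parameters for a Query that matches one partition and an SK prefix."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk AND begins_with(#sk, :prefix)",
        # Aliases keep the expression safe even if a key name is a reserved word.
        "ExpressionAttributeNames": {"#pk": "PK", "#sk": "SK"},
        "ExpressionAttributeValues": {
            ":pk": {"S": pk},
            ":prefix": {"S": sk_prefix},
        },
    }


# Every store location in California for one (hypothetical) customer.
# The trailing "#" stops "US#CA" from also matching something like "US#CAL...".
prefix = hierarchical_sk("US", "CA") + "#"
params = build_prefix_query("Locations", "CUSTOMER#42", prefix)
print(params["KeyConditionExpression"])
print(params["ExpressionAttributeValues"])
```

The catch is that the hierarchy order is baked into the key: this design answers "everything in a state" cheaply, but not "every city named X across all states".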
So there’s a lot of planning you have to do for these keys, or else DynamoDB won’t work well for you. If you don’t know what your query patterns are going to be ahead of time, you’re going to have a bad time.
A few things that were a problem to me:
- Avoiding FilterExpressions was frustrating, but using them means you basically lose the ability to naturally paginate data. I don’t see how you won’t eventually run into this problem (how can you design keys for something like a string search?)
- You have to be clever to figure out how to have effective PKs & SKs. Our Items (levels, scores, ratings) did not have any attribute that guaranteed uniqueness. Sure, you could try making a composite key. But then that attribute is literally a concatenated string, so you have to plan it ahead of time (you can NOT use the individual attributes). We ultimately used IDs as keys anyway and it wasn’t really a problem for our small scale data. It was just weird that the primary PK/SK design had an ID attribute as both the primary and secondary keys.
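Here is why filters and pagination don't mix well. In a DynamoDB Query, `Limit` caps how many Items are read, and the FilterExpression is applied after that, so a page can come back with fewer matches than you asked for (or none) while still returning a `LastEvaluatedKey`. The sketch below imitates that behaviour with an in-memory list instead of a real table; `make_fake_query` and `fetch_filtered_page` are made-up helpers, and the integer cursor stands in for a real key.

```python
from typing import Callable, List, Optional, Tuple


def make_fake_query(rows: list, limit: int, keep: Callable) -> Callable:
    """Stand-in for Query with Limit + filter: limit the rows read, then filter."""
    def query_page(start: Optional[int]) -> dict:
        begin = 0 if start is None else start
        chunk = rows[begin:begin + limit]          # "Limit" applies to rows read
        page = {"Items": [r for r in chunk if keep(r)]}  # filter applied afterwards
        end = begin + len(chunk)
        if end < len(rows):
            page["LastEvaluatedKey"] = end         # more data left to read
        return page
    return query_page


def fetch_filtered_page(query_page: Callable, min_items: int,
                        start: Optional[int] = None) -> Tuple[List, Optional[int]]:
    """Keep reading pages until min_items survive the filter or data runs out.

    May return more than min_items, because a page is never split.
    """
    items: List = []
    cursor = start
    while True:
        page = query_page(cursor)
        items.extend(page["Items"])
        cursor = page.get("LastEvaluatedKey")
        if len(items) >= min_items or cursor is None:
            return items, cursor


# 20 rows, read 4 at a time, but only multiples of 5 pass the filter.
query = make_fake_query(list(range(20)), limit=4, keep=lambda r: r % 5 == 0)
first, cursor = fetch_filtered_page(query, min_items=2)
second, cursor2 = fetch_filtered_page(query, min_items=2, start=cursor)
print(first, cursor)
print(second, cursor2)
```

Getting two results took two reads each time, and with a more selective filter it could take many more. That extra looping (and the read capacity it burns) is the cost of not having a key that answers the query directly.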
In summary, with DynamoDB, it felt like you had to know exactly what query patterns you were going to need upfront and shouldn’t plan on adding new ones. On top of that, it felt like you had to be clever to work around basic database functionality like having IDs for your Items, paginating lists, and adding any new queries after the initial table design. I missed my Postgres world, where you can just add a new index, you have the flexible SQL interface, and Postgres hides all the database design details under the hood.
You can read more about the DynamoDB design decisions here (please, tell me what I did wrong!).
With all that complaining out of the way, DynamoDB wasn’t all bad:
- It was extremely easy to get started, and a key-value object storage makes it super easy to get things up and running very fast
- It’s got a nice UI for manually looking at and managing your data
- And last but not least — it was cheap!
At present we store barely over 32 MB of data in DynamoDB. At our peak season (Nov/Dec), we were getting about 1.5k queries / day. DynamoDB incurred $0.61 that month.
Comparatively, a Postgres RDS instance would be an estimated $1.17 / month.
I think DynamoDB is appropriate for small projects. At small scale and if you don’t care too much about latency, you can brute force queries to get just about anything you need done. Otherwise, I wouldn’t use it due to the inflexibility of introducing new query patterns and the amount of creativity required to put together a schema that works well.
The other case where I hear it’s useful is if you’re at an enormous scale of data and requests and you need low latency responses. My understanding is that DynamoDB scales up well (but I didn’t get to experience this firsthand).
Conclusions:
To sum up my experience:
- API Gateway + Lambda is simple and cheap — it’s been great for our low traffic project!
- Terraform worked well enough, but I think I’d stick to CloudFormation in the future for easier local dev setup & the CloudFormation UI tracking of resources
- DynamoDB wasn’t my favorite. I didn’t love the amount of planning ahead of time and the pain of supporting new types of queries. I think it was the cheapest option, so it’s fine for a small project. But I wouldn’t recommend it for larger projects unless you’re going to have huge scale problems
Here’s the pricing breakdown of the entire backend stack if you’re curious:
If you disagree with any of my takes: tell me why I’m wrong!
If you’re curious about the code for this project, it’s on GitHub: https://github.com/LPGameDevs/EditarrrPublic/tree/develop/backend
And again, a link to the game itself: https://store.steampowered.com/app/2609410/Editarrr/
Thanks for reading!