Meetup Recap - AWS Aurora · David F. Severski

The local Seattle area has an active Meetup scene. As I live and work near downtown, I’m fortunate to be able to participate in meetups on a variety of topics. I’ve been remiss in blogging about them and will attempt to do mini-posts of summaries going forward, starting with last night’s AWS Meetup covering Amazon’s Aurora database service.

We were fortunate to have both a customer and an AWS perspective on Amazon’s re-imagined MySQL-based database. John Thull from Double Down Interactive kicked things off for a large crowd (40-50+ attendees) with an impromptu discussion on how his organization moved from self-hosted MySQL 5.1 on EC2-Classic to RDS Aurora. John’s team used ClassicLink for connectivity between the legacy EC2-Classic and new VPC environments, then stood up a temporary MySQL 5.5 replica to bridge the 5.1 MySQL environment (too old a version for Aurora to talk to directly) and the new Aurora instances. John’s account of going from 80%+ CPU utilization to less than 25%, while gaining latency improvements of several orders of magnitude (20 ms being the norm), was very impressive.

Next up was Sailesh Krishnamurthy from the AWS Aurora team proper, giving a fascinating background on how the team approached database design from a cloud perspective. The challenge the Aurora team set for themselves was rebuilding databases for what Sailesh called a resource abundance model instead of a resource scarcity model. In a cloud environment, resources are much more flexible than in a traditional on-premises deployment. The Aurora team takes advantage of this by pairing a modified version of MySQL’s InnoDB engine with a microservices storage engine based upon AWS S3.

Sailesh himself said that when he joined AWS, he thought microservices were oversold (his exact words were a bit more colorful!). Even so, a storage service which separates the SQL, transaction, and caching layers from page persistence makes huge benefits possible. The Aurora storage engine uses optimized (SSD) S3 storage to provide scalable data persistence (Aurora automatically grows with your use up to 64 TB per database). In addition to multi-AZ replication and sharding baked in for free with the storage layer, this storage engine operates on a redo log model, where only the differences between pages of data are streamed to/from Aurora.
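To make the redo log idea a little more concrete, here is a toy Python sketch of the general technique as I understood it from the talk. It is not Aurora's actual implementation: the `RedoRecord` and `StorageNode` names, the 16-byte page size, and the replay logic are all my own illustrative assumptions. The point is only that the database side ships small change records, and the storage side rebuilds a page by replaying them.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 16  # tiny, illustrative page size (an assumption, not Aurora's)


@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical redo record: a small change to one page."""
    lsn: int       # log sequence number, increasing with each change
    page_id: int   # page the change applies to
    offset: int    # byte offset within the page
    data: bytes    # new bytes to write at that offset


@dataclass
class StorageNode:
    """Hypothetical storage node that keeps a log and builds pages on demand."""
    log: list = field(default_factory=list)

    def append(self, record: RedoRecord) -> None:
        # Only the small redo record is shipped, never the whole page.
        self.log.append(record)

    def read_page(self, page_id: int) -> bytes:
        # Rebuild the page by replaying its records in LSN order.
        page = bytearray(PAGE_SIZE)
        for rec in sorted(self.log, key=lambda r: r.lsn):
            if rec.page_id == page_id:
                page[rec.offset:rec.offset + len(rec.data)] = rec.data
        return bytes(page)


node = StorageNode()
node.append(RedoRecord(lsn=1, page_id=7, offset=0, data=b"hello"))
node.append(RedoRecord(lsn=2, page_id=7, offset=5, data=b" aurora"))
node.append(RedoRecord(lsn=3, page_id=7, offset=0, data=b"H"))

page = node.read_page(7)
bytes_shipped = sum(len(r.data) for r in node.log)
print(page, bytes_shipped)
```

In this toy run, three changes ship 13 bytes of payload in total, where writing the full page each time would have shipped 48. A real system would also checkpoint and garbage-collect old records rather than replaying the whole log on every read.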

I’m personally hoping to see Aurora join its RDS MySQL cousin as a HIPAA-eligible service in the near future. As a new offering, we should expect to see some rapid feature releases as the engineering team gets feedback from customers. I’ve already heard that multi-region and cross-account replication/restores are active efforts. Big thanks to the AWS Meetup organizers, sponsors, and speakers for this packed session!