Aurora MySQL 8 Upgrade: Using the AWS Blue/Green Style

Here’s an anecdote from a series of upgrades I performed on AWS RDS, going from Aurora MySQL 5.7 to Aurora MySQL 8.0 with no downtime, using AWS Blue/Green database deployments for the first time.

You might be nervous to pull the trigger on trusting this tech with a database. Should you find yourself in a similar predicament, this post shares my experience.

Why upgrade?

Any database server modification makes my palms sweaty. I tend to leave a working database server engine running as-is unless an unusual event happens. Then I make an informed decision with the customer about whether the tradeoffs make sense.

For me, it was this notification in the RDS console that signaled the need for further investigation.

Upgrade required for your database notification. AWS RDS Aurora MySQL 8

To my surprise, this new treatment called “RDS Extended Support”, which you get auto-enrolled in, comes with a cost. Sure, the old Rails apps were running great, but now at a steep 40% cost increase per month.

My frugal nature is not ok with this. I don’t know what this support would’ve done, and auto-enrolling feels like dirty business (not saying it is, just my gut instinct). Technically and financially, I give AWS the benefit of the doubt that there’s a good reason that I have yet to learn about.

Here’s how we made the decision.

My philosophy: I like to set up our customers such that AWS handles the infra for them. “You just pay the monthly bill and you need not worry about the app going down,” is what I tell them. But this unexpected price increase was sickening. Could I have paid more attention to emails about this? Yes, definitely, and I’ll work on being better at that in the future.

My mind has not changed: I think the AWS platform is incredibly reliable, and I wouldn’t recommend any other service provider. Now that you see where I’m coming from, let’s keep rolling.

I decided to go for it.

Having worked with Elastic Container Service (ECS) Blue/Green deployments for application code, I’ve seen them deliver on the promise of no-downtime deployments.

But trusting this for a database cutover is scary for obvious reasons. Still, here are some tidbits of information if you want to give this Blue/Green database deployment tech a run.

How does it work?

This is my high-level understanding. Please read the docs. Let’s start with what these Blue/Green terms are about.

Blue vs Green: Blue means the current DB cluster; Green means the freshly provisioned target DB cluster to upgrade to.

In the rest of this post, I’m using Blue and Green as keywords to cut down on word salad.

Blue and Green clusters start by running in tandem. Your web app stays connected to Blue, the primary writer, while Green replicates every write from the Blue binlog to keep the data in both clusters in parity. Replication is one-way: a write to Green does not flow upstream to Blue. Copy-on-write is the strategy employed here (I believe).

So writing to Green will not affect Blue. This is a safety mechanism that allows you to test Green before committing to the switchover. If the odds are not in your favor, you can abort the Blue/Green deployment, fix whatever needs fixing and start the process over, all while your web app continues to run unaffected, connected to Blue.

Once you confirm your web app works with the Green cluster and you have the confidence to make the leap, you click the “Switch over” action.

Then, AWS performs a DNS update for you. The connection string in your application needs no change despite Green being a completely new compute instance with a new IP address. Magic!

Blue/Green deployment switch over button

You may run into some snags before getting to this point. If so, click “Delete” to cancel the deployment and shut down the Green cluster, then fix up and prep your Blue database for the upgrade based on the errors you encountered.

Prep your database for an upgrade.

For more context, these were old containerized Rails apps on AWS infrastructure, but the technique should be the same for other stacks.

Familiarizing yourself with (1) upgrade-prechecks.log, (2) latin1 vs utf8mb4 encoding, and (3) binlogs is a good place to start. There are sections about each of these below.

First, it’s wise to make a database backup before starting any Blue/Green database deployment, just in case.

I hope things go smoothly on your first go around. If not, remember you can cancel safely.

Use upgrade-prechecks.log.

It’s imperative to know how to find this log file on your Green cluster and how to read it.

You’ll find upgrade-prechecks.log on the last page of the log list under the “Logs & events” tab.

RDS log & events tab on the database cluster

The contents of this log are in JSON format and look like this (cut short):


{
    "serverAddress": "/tmp%2Fmysql.sock",
    "serverVersion": "5.7.12-log - MySQL Community Server (GPL)",
    "targetVersion": "8.0.39",
    "auroraServerVersion": "2.11.1",
    "auroraTargetVersion": "3.08.0",
    "outfilePath": "/rdsdbdata/tmp/PreChecker.log",
    "checksPerformed": [
        {
            "id": "oldTemporalCheck",
            "title": "Usage of old temporal type",
            "status": "OK",
            "detectedProblems": []
        },
        …
    ],
    "errorCount": 0,
    "warningCount": 14,
    "noticeCount": 0,
    "Summary": "No fatal errors were found that would prevent an upgrade, but some potential issues were detected. Please ensure that the reported issues are not significant before upgrading. For more information on prechecks, refer to https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraMySQL.upgrade-prechecks.descriptions.html"
}

An error count of 0 will allow you to continue with the upgrade. If there are errors, you’ll have to address each one. Then look at each warning and use your best judgment based on how your application works. This is a good opportunity to write unit tests to help ensure your app doesn’t panic on the new database engine’s changes.
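Reading a long precheck log by eye gets tedious, so here is a minimal sketch of how one might summarize it with Python’s standard library. It assumes only the top-level keys visible in the excerpt above (checksPerformed, status, detectedProblems, errorCount). The sample log, the second check and the fields inside its detectedProblems entry are hypothetical and only there to make the example runnable.

```python
import json


def summarize_prechecks(log_text):
    """Return (error_count, problem_checks) from upgrade-prechecks.log JSON text.

    problem_checks lists every check that is not "OK" or that reports
    at least one detected problem, as (id, title, number_of_findings).
    """
    report = json.loads(log_text)
    problems = []
    for check in report.get("checksPerformed", []):
        detected = check.get("detectedProblems", [])
        if check.get("status") != "OK" or detected:
            problems.append((check.get("id"), check.get("title"), len(detected)))
    return report.get("errorCount", 0), problems


# Hypothetical, trimmed-down sample in the same shape as the real log.
sample_log = """
{
  "checksPerformed": [
    {"id": "oldTemporalCheck", "title": "Usage of old temporal type",
     "status": "OK", "detectedProblems": []},
    {"id": "exampleCheck", "title": "Hypothetical check with a finding",
     "status": "OK",
     "detectedProblems": [{"level": "Warning", "description": "example"}]}
  ],
  "errorCount": 0,
  "warningCount": 1,
  "noticeCount": 0
}
"""

error_count, problems = summarize_prechecks(sample_log)
print("errors:", error_count)
for check_id, title, count in problems:
    print(f"{check_id}: {title} ({count} finding(s))")
```

To use it on a real log, download the file from the console and pass its contents to summarize_prechecks.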

Check latin1 vs utf8mb4 encoding and column types.

The databases had latin1 encoding on several columns. Perhaps this was because these Rails apps were developed on MySQL 5.6, and I believe that was the default at the time. These apps choked on emojis in that era, which is one reason the newer utf8mb4 default was put in place.

Then, after upgrading MySQL to 5.7, specifying utf8 in your Rails app’s database.yml may leave you with the utf8mb3 character set, since utf8 is an alias for utf8mb3 in MySQL. In my case, this errored out the Blue/Green deployment.

I had a fulltext index on a utf8mb3 column, which I needed to recreate: first drop the index, then change the encoding of the text column to utf8mb4, and then recreate the index using the exact same name (important!).
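The drop, convert and recreate sequence can be sketched as three SQL statements. The helper below only builds the statements as strings so you can review them before running anything. The table, column and index names (posts, body, index_posts_on_body) and the utf8mb4_unicode_ci collation are assumptions for illustration, not the names from my databases.

```python
import re

# Only plain identifiers are accepted, since they are interpolated into SQL.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def fulltext_utf8mb4_statements(table, column, index_name, column_type="TEXT"):
    """Build the statements: drop index, convert column, recreate index."""
    for name in (table, column, index_name, column_type):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    return [
        # 1. Drop the fulltext index that sits on the utf8mb3 column.
        f"ALTER TABLE `{table}` DROP INDEX `{index_name}`;",
        # 2. Change the column encoding. MODIFY replaces the whole column
        #    definition, so restate NOT NULL / DEFAULT here if you use them.
        f"ALTER TABLE `{table}` MODIFY `{column}` {column_type} "
        "CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;",
        # 3. Recreate the index under the exact same name.
        f"ALTER TABLE `{table}` ADD FULLTEXT INDEX `{index_name}` (`{column}`);",
    ]


statements = fulltext_utf8mb4_statements("posts", "body", "index_posts_on_body")
for statement in statements:
    print(statement)
```

In a Rails app you would likely wrap the same three steps in a migration. Either way, try them on a copy of the data first.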

Also, in my case the upgrade would not accept the column while its data type was mediumtext, so switching it to text was necessary. Be sure you know the max length of the contents in your columns to ensure no data is truncated: a text column holds at most 65,535 bytes.
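Before shrinking a column from mediumtext to text, it helps to check sizes in bytes rather than characters, since a text column tops out at 65,535 bytes and an emoji takes 4 bytes in utf8mb4. Here is a small sketch of that check. The posts and body names are hypothetical, and the query relies on MySQL’s LENGTH() returning a byte count.

```python
TEXT_MAX_BYTES = 65_535  # 2**16 - 1, the limit of a MySQL TEXT column


def fits_in_text(value):
    """True if the string fits in a TEXT column once encoded as utf8mb4."""
    # Python's "utf-8" codec uses up to 4 bytes per character, like utf8mb4.
    return len(value.encode("utf-8")) <= TEXT_MAX_BYTES


def max_bytes_query(table, column):
    """Query for the largest stored value, in bytes, of one column."""
    return f"SELECT MAX(LENGTH(`{column}`)) AS max_bytes FROM `{table}`;"


print(max_bytes_query("posts", "body"))
print(fits_in_text("a" * 65_535))   # exactly at the limit
print(fits_in_text("😀" * 16_384))  # 16,384 emojis * 4 bytes = 65,536 bytes
```

If the query returns a value above 65,535 for any column, converting it to text would truncate data, so that column needs a different plan.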

You need binlog_format = ROW.

For Green to replicate the writes from Blue, the Blue database cluster parameter group needs the binlog_format set to ROW. It was unset by default for me. If you already have follower databases, then you’ll likely be all set.

If not, you have to set the value and reboot your Blue database cluster. This will technically cause some short downtime. It was about a 30-second procedure for me and was barely noticeable; pages just took longer to load.
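I set the parameter through the console, but the same change can be scripted. The sketch below only builds the request for the RDS ModifyDBClusterParameterGroup API and leaves the actual call commented out, so it runs without AWS credentials. The parameter group name is hypothetical, and it assumes the cluster uses a custom parameter group, since default groups cannot be modified.

```python
def binlog_row_params(parameter_group_name):
    """Build the request that sets binlog_format to ROW on a cluster parameter group."""
    return {
        "DBClusterParameterGroupName": parameter_group_name,
        "Parameters": [
            {
                "ParameterName": "binlog_format",
                "ParameterValue": "ROW",
                # Applied on the next reboot of the cluster's writer instance.
                "ApplyMethod": "pending-reboot",
            }
        ],
    }


# Hypothetical parameter group name.
params = binlog_row_params("my-aurora-mysql57-cluster-params")
print(params)

# With boto3 installed and credentials configured (not run here):
#   import boto3
#   boto3.client("rds").modify_db_cluster_parameter_group(**params)
```

The reboot is still a separate, deliberate step, so you can pick a quiet moment for it.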

Test your Green cluster locally with exploratory testing.

Once you get to the point where Green is successfully replicating Blue, run your web app locally using the Green database connection string and use the app while inspecting the logs to confirm it works.

Switch over.

Once you’ve gone through all the steps above, your RDS console should look similar to this. At this point, AWS will allow you to perform the switchover.

RDS console of a healthy blue green deployment

Luckily, when I got to this point, the app ran fine on the new Green cluster. So I did the switchover, and it did what it said it would do. The DNS updated, pointing the original Blue connection string to the Green cluster, and the app continued to work without even needing to redeploy the application code.

It’s wise to keep Blue running for a bit in case unexpected application errors start appearing. Once you shut it down, you’ll be unenrolled from “RDS Extended Support” and start saving money.

From this point forward, the Green cluster becomes the Blue cluster, and the Blue/Green deployment is complete.

Is it worth it? Yes!

I hope these little tidbits help if you’re thinking about performing this upgrade. Of course, this is far from an exhaustive list of what could go wrong; I’m just saying it can work nicely.

For those who braved the switchover, leave a comment below about the snags you ran into to help others troubleshoot and make an informed decision. Or share what your experience was like, good or bad. I’m curious :)
