AWS Remote Database Management Without SSH

Some AWS services have no need to be exposed to the public Internet, but you may still need to manage them from your own computer. Remote access to a private database is usually handled through either a VPN or a bastion host acting as a proxy. VPNs tend to be complicated and potentially expensive, which makes bastion hosts a popular alternative.

The way this usually goes is that a bastion is set up to run a secure shell (SSH) service that is reachable from the public Internet. When you need a connection to a private service (for example, a database), you SSH into the bastion with a local port forwarded to the database. You can then connect to the database through the SSH tunnel as if it were running locally.
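
For comparison, the traditional tunnel boils down to a single ssh invocation. The sketch below only prints the command so it can be inspected before running it; the bastion address, login user, and database endpoint are hypothetical placeholders, not values from a real deployment.

```shell
#!/usr/bin/env bash
# Sketch of the traditional SSH tunnel. All hosts and users are hypothetical.

BASTION_HOST="bastion.example.com"   # hypothetical public bastion address
BASTION_USER="ec2-user"              # assumed login user on the bastion
DB_HOST="db.example.internal"        # hypothetical private database endpoint
LOCAL_PORT=5433                      # arbitrary local port
REMOTE_PORT=5432                     # postgres default port

# Print the tunnel command rather than executing it.
# -N: do not run a remote command, just forward ports
# -L: forward LOCAL_PORT on this machine to DB_HOST:REMOTE_PORT via the bastion
tunnel_command() {
  echo "ssh -N -L ${LOCAL_PORT}:${DB_HOST}:${REMOTE_PORT} ${BASTION_USER}@${BASTION_HOST}"
}

tunnel_command
```

Running the printed command would keep the tunnel open until interrupted, with the database reachable on localhost at the local port.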

Drawbacks to SSH

This is a feasible solution, but it has some drawbacks. One is that the SSH host must be exposed to the Internet. You can mitigate this by tightly controlling the SSH host’s security group, limiting connections to known clients (i.e. yourself). It’s not a big deal, but it does add some overhead.

Another drawback is SSH access control. Somebody has to maintain a list of authorized users, and/or control access to a shared key. Just a little more overhead.

Finally, the SSH host needs a public IP address. Because the IPv4 address space is limited, this means either allocating an Elastic IP or employing some other shenanigans to share one with other services. This would be a moot point if the whole world were on IPv6 already, but we’re not there yet.

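AWS Systems Manager (SSM), on the other hand, eliminates all of these drawbacks by cutting out the need for SSH entirely. SSM works through an agent installed on the host you’re connecting to. The agent makes outbound connections to the SSM service, so the host needs no inbound ports and no public IP address (it does still need an outbound route to the SSM endpoints), and access is controlled through IAM rather than SSH keys. You can even potentially cut out the need for the bastion host! But if you can’t (or don’t want to) install the agent directly on the private host, as is the case with a managed RDS database, you can still set up a bastion running the agent.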
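
One way to confirm that a host’s agent has registered with SSM is to ask for its ping status. This is a minimal sketch: the function name ssm_ping_status is made up for illustration, and the instance ID and region in the example call are placeholders.

```shell
#!/usr/bin/env bash

# Print the SSM ping status (for example "Online") for one EC2 instance.
# An empty result suggests the instance has not registered with SSM.
# Usage: ssm_ping_status <instance-id> <region>
ssm_ping_status() {
  local instance_id=$1 region=$2
  aws ssm describe-instance-information \
      --region "$region" \
      --filters "Key=InstanceIds,Values=$instance_id" \
      --query 'InstanceInformationList[].PingStatus' \
      --output text
}

# Example call (placeholder instance ID and region):
# ssm_ping_status i-0123456789abcdef0 us-east-1
```

If the status is anything other than Online, a session to that instance is unlikely to start, so this is a reasonable first check when troubleshooting.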

Enter: CDK

If you’re used to deploying AWS resources by writing CloudFormation templates, stop doing that and use CDK instead. The process is pretty similar, except you get to describe the meaningful parts of your resources and how they relate to each other without all the noise. CDK even comes with some useful constructs, like BastionHostLinux. The only information required to create a bastion host is the VPC, making it potentially this easy (for context, this assumes you are inside a Stack class and have already created or looked up vpc):

new aws_ec2.BastionHostLinux(this, "MyBastionHost", { vpc })

But a little customization is useful. Giving the host a name will make it easier to look up later. And we can save a little cost by reducing the instance size (bastion hosts don’t require much). This is what it might look like to set up a VPC, RDS database, and bastion host:

import { aws_ec2, aws_rds, Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";

export class WaffleStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // A VPC using CDK's default subnet layout
    const vpc = new aws_ec2.Vpc(this, "WaffleVpc");

    // A Postgres RDS instance inside the VPC.
    // instanceIdentifier is the name the connection script looks up later.
    const db = new aws_rds.DatabaseInstance(this, "WaffleDb", {
      engine: aws_rds.DatabaseInstanceEngine.postgres({
        version: aws_rds.PostgresEngineVersion.VER_14_2,
      }),
      instanceIdentifier: "waffle-db-instance",
      vpc,
    });

    // A small bastion host. instanceName becomes the EC2 "Name" tag,
    // which the connection script uses to find the instance.
    const bastion = new aws_ec2.BastionHostLinux(this, "WaffleBastion", {
      instanceName: "o-bastion-my-bastion",
      instanceType: new aws_ec2.InstanceType("t2.nano"),
      vpc,
    });

    // Allow the bastion to reach the database on its default port
    bastion.connections.allowToDefaultPort(db);
  }
}

Because CDK includes a bunch of sensible defaults, you can just focus on the parts that matter. To connect to the bastion and forward a port to the database, we need the bastion’s instance ID and the database’s endpoint address. We could look them up in the AWS console, but a little scripting makes the process smoother:

#!/usr/bin/env bash

# Set this to the region where resources are deployed (the value below is a placeholder)
AWS_REGION=antarctica-west-1
# This is the bastion *instanceName*, not the name given to the CDK construct
BASTION_NAME="o-bastion-my-bastion"
# This is the database *instanceIdentifier*, not the name given to the CDK construct
DB_NAME="waffle-db-instance"
# An arbitrary local port to use for forwarding
LOCAL_PORT=5433
# The remote port that the database is listening on (5432 is the default for postgres)
REMOTE_PORT=5432

BASTION_INSTANCE_ID=$(aws ec2 describe-instances \
    --region "$AWS_REGION" \
    --filters "Name=tag:Name,Values=$BASTION_NAME" \
    --query "Reservations[].Instances[?State.Name == 'running'].InstanceId[]" \
    --output text)

if [[ -z "$BASTION_INSTANCE_ID" ]]; then
  echo "Unable to find a running EC2 instance named $BASTION_NAME" >&2
  exit 1
fi

DB_HOST=$(aws rds describe-db-instances \
    --region=$AWS_REGION \
    --filters "Name=db-instance-id,Values=$DB_NAME" \
    --query 'DBInstances[].Endpoint.Address' \
    --output text)

if [[ -z "$DB_HOST" ]]; then
  echo "Unable to find an RDS instance named $DB_NAME" >&2
  exit 1
fi

PARAMS=$(jq -n \
    --arg remotePort "$REMOTE_PORT" \
    --arg localPort "$LOCAL_PORT" \
    --arg host "$DB_HOST" \
    '{"portNumber":[$remotePort],"localPortNumber":[$localPort],"host":[$host]}'
)

aws ssm start-session \
    --region=$AWS_REGION \
    --target "$BASTION_INSTANCE_ID" \
    --document-name AWS-StartPortForwardingSessionToRemoteHost \
    --parameters "$PARAMS"

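The interesting part of this script is the aws ssm start-session command. It is driven by “documents” that take various parameters, usually as a JSON object; here the AWS-StartPortForwardingSessionToRemoteHost document forwards a local port through the bastion to the database host. I used jq to make this JSON a little easier to assemble, but you could just do some ugly string substitution. While the session is running, point your database client at localhost and the local port (5433 in this script). You’ll also need the Session Manager plugin for the AWS CLI installed locally, and the IAM user or role you run the script as must have the required SSM permissions.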
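
If jq isn’t available, the string-substitution route might look like the sketch below. The function name build_params and the example host are made up for illustration. It does no JSON escaping, which is tolerable only because ports and hostnames should not contain quotes or backslashes.

```shell
#!/usr/bin/env bash

# Build the start-session parameters JSON with printf instead of jq.
# No escaping is performed, so only pass plain ports and hostnames.
# Usage: build_params <remote-port> <local-port> <host>
build_params() {
  local remote_port=$1 local_port=$2 host=$3
  printf '{"portNumber":["%s"],"localPortNumber":["%s"],"host":["%s"]}' \
      "$remote_port" "$local_port" "$host"
}

# Hypothetical database endpoint, for illustration only
PARAMS=$(build_params 5432 5433 "db.example.internal")
echo "$PARAMS"
```

The result has the same shape as the jq version: every value is a single-element array of strings, so it can be passed to --parameters in the same way.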