I’ve recently started using [Ansible](https://github.com/ansible/ansible) to manage [Elastic Compute Cloud](https://aws.amazon.com/ec2/) (EC2) hosts on [Amazon Web Services](https://aws.amazon.com/) (AWS). While it is possible to have public IP addresses for EC2 instances on an AWS [Virtual Private Cloud](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Introduction.html) (VPC), I opted to place the EC2 instances on a private VPC subnet which does not allow direct access from the Internet. This makes communicating with the EC2 instances a little more complicated.
While I could create a VPN connection to the VPC, this is rather cumbersome without a compatible hardware router. Instead, I opted to create a [bastion host](https://en.wikipedia.org/wiki/Bastion_host) which allows me to connect to the VPC, and communicate securely with EC2 instances over [SSH](http://www.openssh.com/).
## VPC Architecture
I run a fairly simple VPC architecture with four [subnets](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Subnets.html), two public and two private, with one of each type paired in separate [availability zones](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html). The public subnets have direct Internet access, whereas the private subnets cannot be addressed directly, and must communicate with the Internet via a [NAT gateway](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-nat-gateway.html).
In the diagram, my computer at `70.80.50.30` wants to run Ansible against an EC2 instance at `172.31.50.5` in “Private Subnet 2.” `172.31.0.0/16` falls within the private `172.16.0.0/12` address range (historically Class B), so its addresses cannot be routed over the Internet. Furthermore, “Private Subnet 2” has no direct access to the Internet (its outbound traffic goes via the NAT gateway at `172.31.32.2`), so assigning the instance a public IP address is not an option.
On this network, in order to communicate with `172.31.50.5`, my computer must either connect to the VPC over a VPN or forward its traffic via the bastion host. In my case, a VPN connection is not feasible, so I made use of the bastion host, which has both a publicly routable IP address (`52.89.24.1`) and a private address, `172.31.2.5`, on the `172.31.0.0/16` network.
## SSH Jump Hosts
A common practice to reach hosts on an internal network which are not directly accessible is to use an [SSH jump host](https://wiki.gentoo.org/wiki/SSH_jump_host). Once an SSH connection is made to the jump host, additional connections can be made from it to hosts on the internal network.
Generally, this looks something like:
```
jk@localhost:~$ ssh [email protected]
[email protected]:~$ ssh [email protected]
[email protected]:~$
```
This could also be simplified as one command invocation:
```
jk@localhost:~$ ssh -t [email protected] 'ssh [email protected]'
[email protected]:~$
```
(Note the `-t` to force pseudo-TTY allocation.)
The connections from the jump host to other hosts do not necessarily need to be SSH connections. For example, a socket connection can be opened:
```
jk@localhost:~$ ssh [email protected] 'nc 192.168.0.20 22'
SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.4
```
## SSH ProxyCommand
Ansible makes use of SSH to connect to remote hosts. However, it does not support configuration of an explicit SSH jump host. This would make it impossible for Ansible to connect to a private IP address without other networking (e.g. VPN) magic. Fortunately, Ansible takes common [SSH configuration options](http://man.openbsd.org/ssh_config), and will respect the contents of a system SSH configuration file.
The `ProxyCommand` option for SSH allows specifying a command to execute to connect to a remote host when connecting via SSH. This allows us to abstract the specifics of connecting to the remote host to SSH; we can get SSH to provide a jump host connection transparently to Ansible.
Essentially, `ProxyCommand` works by substituting the standard SSH socket connection with what is specified in the `ProxyCommand` option.
```
ssh -o ProxyCommand="ssh [email protected] 'nc 192.168.0.20 22'" ubuntu@nothing
```
The above command will, for example, first connect to `52.50.10.5` via SSH, and then open a socket to `192.168.0.20` on port 22. The socket connection (which is connected to the remote SSH server) is then passed to the original SSH client command invocation to utilize.
The `ProxyCommand` option interpolates the original host and port of the connection through the `%h` and `%p` tokens.
Running:
```
ssh -o ProxyCommand="ssh [email protected] 'nc %h %p'" [email protected]
```
Is equivalent to running:
```
ssh -o ProxyCommand="ssh [email protected] 'nc 192.168.0.20 22'" [email protected]
```
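To make the substitution concrete, here is a rough sketch of the expansion in plain shell. It is only an illustration: `expand_proxy_command` is a hypothetical helper, `bastion` stands in for the jump host's `user@address`, and the real SSH client supports more tokens (and `%%` escaping) than this handles.

```shell
# Hypothetical helper mimicking how ssh fills in %h and %p.
# $1 = ProxyCommand template, $2 = target host, $3 = target port (default 22).
# Assumes host and port contain no characters special to sed ('/', '&', '\').
expand_proxy_command() {
  printf '%s\n' "$1" | sed -e "s/%h/$2/g" -e "s/%p/${3:-22}/g"
}

# "bastion" is a placeholder for the jump host's user@address.
expand_proxy_command "ssh bastion 'nc %h %p'" 192.168.0.20 22
# prints: ssh bastion 'nc 192.168.0.20 22'
```

Leaving the port off falls back to 22, which mirrors what SSH does when no port is given for the target host.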
## SSH Configuration File
Using the `ProxyCommand` in conjunction with an SSH configuration file, we can make SSH connections to a private IP address appear seamless to whichever application is executing SSH.
For my VPC architecture described above, I could add the following to an SSH configuration file:
```
Host 172.31.50.5
    ProxyCommand ssh [email protected] nc %h %p
```
This makes all SSH connections to the private IP address `172.31.50.5` seamless:
```
ssh -F ./my_ssh_config_file [email protected]
```
And, if using the default `~/.ssh/config` for storing your SSH configuration options, you don’t even need to specify the `-F` option:
```
ssh [email protected]
```
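One way to check such a configuration without actually connecting is to ask SSH to print the options it would apply to a host. The sketch below writes a throwaway config file for the private instance at `172.31.50.5` and inspects it with `ssh -G` (available in OpenSSH 6.8 and later). The `ec2-user` login on the bastion is an assumption for illustration; substitute whatever user your bastion expects.

```shell
# Write a throwaway config file. The bastion user (ec2-user) is an assumption.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host 172.31.50.5
    ProxyCommand ssh ec2-user@52.89.24.1 nc %h %p
EOF

# Print the options ssh would apply to this host, without connecting.
# -G requires OpenSSH 6.8 or newer.
if command -v ssh >/dev/null 2>&1; then
  ssh -F "$cfg" -G 172.31.50.5 | grep -i '^proxycommand' || true
else
  echo "ssh not installed; config written to $cfg"
fi
```

If the `proxycommand` line shows up for the private address but not for unrelated hosts, the `Host` block is being matched as intended.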
## All Together Now
Using the `ProxyCommand` option, it is simple to abstract away the details of the underlying connection to the EC2 instances on the private VPC subnet and allow Ansible to connect to those hosts normally. Any hosts on the private VPC subnet can be added explicitly to an SSH configuration file, or the pattern can be expanded. For example, we can apply the `ProxyCommand` option to all hosts on the `172.31.0.0/16` VPC network:
```
Host 172.31.*.*
    ProxyCommand ssh [email protected] nc %h %p
```
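To get a feel for which addresses a pattern like `172.31.*.*` covers, the wildcard can be approximated with a shell glob. This is only a sketch: `uses_bastion` is a hypothetical helper, and it assumes the pattern uses nothing beyond the `*` and `?` wildcards, which behave alike in both places for this purpose.

```shell
# Approximate ssh_config Host pattern matching with a shell glob.
# Assumption: only '*' and '?' wildcards are in play.
uses_bastion() {
  case "$1" in
    172.31.*.*) echo "$1: via bastion" ;;
    *)          echo "$1: direct" ;;
  esac
}

uses_bastion 172.31.50.5   # private instance
uses_bastion 52.89.24.1    # the bastion's public address
# prints:
#   172.31.50.5: via bastion
#   52.89.24.1: direct
```

The second case matters: the `ssh` invoked inside `ProxyCommand` reads the same configuration, so the bastion's public address must not match the pattern, or the proxy command would try to proxy itself.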
When running Ansible, the hosts inventory can simply specify the private IP address (such as `172.31.50.5`) as the connection hostname/address, and SSH will handle the necessary underlying connections to the bastion host.
Generally, the system or user SSH configuration file (`~/.ssh/config`) can be used, but Ansible-specific SSH configuration options can also be included in the [`ansible.cfg` file](http://docs.ansible.com/ansible/intro_configuration.html#ssh-args).
This is particularly convenient when using [dynamic host inventories](http://docs.ansible.com/ansible/intro_dynamic_inventory.html) [with EC2](https://raw.githubusercontent.com/ansible/ansible/devel/contrib/inventory/ec2.py), which can automatically return the private IP addresses of new EC2 instances from the AWS APIs.
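As a rough sketch of how the pieces might fit together through `ansible.cfg`, the script below creates a scratch project containing a dedicated SSH configuration file, an `ansible.cfg` that points SSH at it via `ssh_args`, and a static inventory. The file names (`ssh.cfg`, `hosts`), the `private` group name, and the `ec2-user` login are all assumptions chosen for illustration.

```shell
# Scratch project directory; all file names below are illustrative.
workdir=$(mktemp -d)

# SSH options for hosts in the VPC. The ec2-user login is an assumption.
cat > "$workdir/ssh.cfg" <<'EOF'
Host 172.31.*.*
    User ec2-user
    ProxyCommand ssh ec2-user@52.89.24.1 nc %h %p
EOF

# Point Ansible's SSH connections at that file. Setting ssh_args replaces
# Ansible's defaults, so the ControlMaster options are restated here.
cat > "$workdir/ansible.cfg" <<'EOF'
[ssh_connection]
ssh_args = -F ./ssh.cfg -o ControlMaster=auto -o ControlPersist=60s
EOF

# Static inventory listing the private address directly.
cat > "$workdir/hosts" <<'EOF'
[private]
172.31.50.5
EOF

echo "project written to $workdir"
```

From inside that directory, something like `ansible -i hosts private -m ping` should then pick up `./ansible.cfg` and reach the instance through the bastion. The relative `-F ./ssh.cfg` path only resolves when Ansible is run from the project directory.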
Additional SSH and `nc` flags can be added to the `ProxyCommand` option to enhance flexibility.
For example, the following adds `-A` to enable SSH agent forwarding, `-q` to suppress extra SSH messages, and `-w` to adjust the timeout for `nc`, alongside any other standard SSH configuration options (here, `User`):
```
Host 172.31.*.*
    User ec2-user
    ProxyCommand ssh -q -A [email protected] nc -w 300 %h %p
```
Nice article, but it would be great to see final example of ansible.cfg
FYI, since OpenSSH 7.3 a new directive is introduced called ProxyJump, which allows to do away with ProxyCommand thus simplifying the configuration and enabling to easily chain multiple bastion hosts:
http://man.openbsd.org/ssh_config.5
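For comparison, a `ProxyJump` version of the configuration for the VPC described in the article might look like the sketch below. It assumes OpenSSH 7.3 or later on the client, and the `ec2-user` login on the bastion is again an assumption.

```shell
# Write a ProxyJump-based config to a throwaway file and show it.
# Requires OpenSSH 7.3+ on the client; ec2-user is an assumed login.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host 172.31.*.*
    User ec2-user
    ProxyJump ec2-user@52.89.24.1
EOF
cat "$cfg"

# One-off equivalent on the command line:
#   ssh -J ec2-user@52.89.24.1 ec2-user@172.31.50.5
```

Because `ProxyJump` forwards the connection through SSH itself, `nc` no longer needs to be installed on the bastion host.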