Configure Teleport to work with sshd on short-lived EC2 instances

Teleport setup:

   /usr/local/bin/tctl version
   Teleport Enterprise v4.2.8git:v4.2.8-0-ga9015b33 go1.13.2

I am trying to configure Teleport as an SSH proxy for our AWS EC2 instances, following this document: https://gravitational.com/teleport/docs/openssh_teleport/

I am currently at the step “ask Teleport to issue valid host certificates for all OpenSSH nodes”. However, in my case the instances are all short-lived and provisioned on demand, so we don’t know their FQDNs and IP addresses a priori. A command like this one is therefore not feasible for me:

   tctl auth sign \
      --host=api.example.com,ssh.example.com,64.225.88.175,64.225.88.178 \
      --format=openssh

What should I do in this case?

There currently isn’t a great “here’s how to do it simply” story for this, I’m afraid, but I can give some ideas.

If you know ahead of time the hostname that will be used to connect to a given instance, you can use tctl to issue a certificate up front and store it somewhere, then download the appropriate host certificate onto the instance as part of its bootstrapping process. For example, the instance could download the certificate via cloud-init, or the certificate could be provisioned by some kind of external config management (Ansible, Puppet, SaltStack, etc.). You could keep the certificates in a vault somewhere, or maybe in AWS SSM Parameter Store.
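
As a rough sketch of the bootstrap side of that idea, something like the following could run from cloud-init. It assumes the host key and certificate were stored as two SSM parameters under a common prefix; the parameter names, the file paths and the helper function names are all made up for illustration, and the instance role would need permission to read those parameters.

```shell
# Sketch of an instance bootstrap step (e.g. run from cloud-init).
# Assumptions (not from the thread): the key and certificate are stored as
# SSM parameters "<prefix>/key" and "<prefix>/cert". All names are hypothetical.

# Fetch one (possibly encrypted) parameter value with the AWS CLI.
fetch_param() {
    aws ssm get-parameter --name "$1" --with-decryption \
        --query 'Parameter.Value' --output text
}

# Print the sshd_config lines that point sshd at a host key/cert pair.
sshd_cert_lines() {
    printf 'HostKey %s\nHostCertificate %s-cert.pub\n' "$1" "$1"
}

# Download the key pair, register it in sshd_config and restart sshd.
# usage: install_host_cert SSM_PREFIX KEY_FILE
install_host_cert() {
    local prefix=$1 keyfile=$2
    # Create the private key with restrictive permissions from the start.
    ( umask 077; fetch_param "$prefix/key" > "$keyfile" ) || return 1
    fetch_param "$prefix/cert" > "$keyfile-cert.pub"      || return 1
    sshd_cert_lines "$keyfile" >> /etc/ssh/sshd_config
    # The service may be called "ssh" rather than "sshd" on some distros.
    systemctl restart sshd
}
```

A bootstrap script would then call it along the lines of `install_host_cert /teleport/hostcerts/api.example.com /etc/ssh/teleport_host_key`, with both arguments adjusted to however you chose to name things.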

If you have no idea whatsoever what the hostname will be ahead of time, then I guess you must have some kind of discovery service for finding the correct hosts? My recommendation would be that you integrate the process of issuing a host certificate from tctl into the bootstrap process of a new instance - for example, when the instance starts, you could have it ping a server endpoint you’ve set up with its hostname and IP addresses, then have an automated process call tctl with the appropriate arguments and pass back the certificate. It’s also possible to call tctl remotely with an identity file, although there isn’t currently any way to ensure that

This could potentially be a pub/sub type workflow, or some kind of message queue. Another option might be to write a row into a database with the hostnames and IPs, then have a job running on the auth server which picks up new rows from the database and automatically writes certs into a vault or some kind of central storage (maybe SSM). The bootstrapping process could then try in a loop to pull down the host certificate and, once it is available, update sshd_config and restart sshd. Alternatively, you could provide an RPC service on the auth server which is allowed to request certificates.
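
To make the two halves of that a bit more concrete, here is a minimal sketch. The first part is what an automated job on the auth server might run once an instance has reported its hostname and IPs; it only uses the tctl flags from the question. The second part is a generic retry helper the instance could wrap around whatever command pulls the certificate down. The function names are hypothetical, and how the hostname and IPs reach the auth server (queue, database row, RPC) is left out entirely.

```shell
# --- On the auth server (hypothetical helper names) ---

# Join all arguments with commas, e.g. "host ip1 ip2" -> "host,ip1,ip2".
principals_csv() {
    (IFS=,; printf '%s\n' "$*")
}

# Issue a host certificate for the names/IPs an instance reported.
# Output files are written to the current directory.
# usage: sign_host_cert HOSTNAME [IP...]
sign_host_cert() {
    tctl auth sign --host="$(principals_csv "$@")" --format=openssh
}

# --- On the instance ---

# Retry a command until it succeeds or the attempt limit is reached.
# usage: retry_until MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry_until() {
    local max=$1 delay=$2 attempt=1
    shift 2
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            return 1    # give up; the caller decides what to do next
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}
```

On the instance this might be used as `retry_until 30 10 fetch_and_install_cert`, where `fetch_and_install_cert` stands in for your own function that downloads the certificate, updates sshd_config and restarts sshd. A bounded number of attempts means a broken signing pipeline shows up as a failed bootstrap rather than an instance that loops forever.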

Without knowing much about your use case, I will say that the (potentially much easier) alternative to all of this is that you run the Teleport node service on your nodes rather than using sshd - Teleport nodes can easily join a cluster automatically with a pre-assigned join token and would handle all certificate generation for you. Adding Teleport nodes is a reasonably easy step to integrate into a bootstrap process.
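
For comparison, a bootstrap step for the Teleport node route could look roughly like this. It is only a sketch: the token value, the auth server address, the config path and the function name are placeholders, and you should check the config keys against the docs for your Teleport version.

```shell
# Print a minimal node-only Teleport config that joins the cluster with a
# pre-assigned token. Token and auth server address are placeholders.
# usage: render_node_config JOIN_TOKEN AUTH_SERVER_ADDR
render_node_config() {
    cat <<EOF
teleport:
  auth_token: "$1"
  auth_servers:
    - "$2"
auth_service:
  enabled: no
proxy_service:
  enabled: no
ssh_service:
  enabled: yes
EOF
}

# Example bootstrap usage (commented out so that sourcing this file has no
# side effects):
#   render_node_config "$JOIN_TOKEN" auth.example.com:3025 > /etc/teleport.yaml
#   teleport start --config=/etc/teleport.yaml
```

The join token itself still has to reach the instance somehow (user data, a secret store and so on), but that is one static secret rather than a certificate per host.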