Kubernetes proxy architecture

Hello,

I just deployed a Teleport auth server plus one Teleport proxy living on an EC2 instance, and one Teleport proxy deployed as a pod inside a Kubernetes cluster. My question is: to get access to the Kubernetes API through Teleport, I would need to point tsh login at the proxy living in the EKS cluster, right?

I’m asking this because I would like to know if it is possible to have a proxy deployed into Kubernetes (as a pod) and access the Kubernetes API by logging in to the other proxy that I’ve deployed on EC2. This way, I don’t need to expose the pod proxy to the internet with an ingress controller.

According to the docs, one approach is to configure the proxy deployed on EC2 to connect to Kubernetes, instead of deploying a proxy as a pod. But the problem is that I would need to share the kubeconfig with it, which I would like to avoid…

You could configure the in-pod Teleport proxy instance to also run a Teleport auth server, then configure this cluster to be a trusted cluster attached to the main Teleport proxy in EC2.

This would mean that you follow this process:

  1. Log into your main/root EC2 cluster (tsh login --proxy=main.example.com)
  2. Switch to your trusted/leaf in-pod Kubernetes cluster (tsh login --proxy=main.example.com trusted.example.com)
  3. Run kubectl commands

In this setup, all traffic goes to the Teleport Kubernetes proxy on the main cluster (port 3026), which then proxies traffic in and out of the Kubernetes cluster. You’d still need to expose port 3026 on the Kubernetes pod via a Kubernetes LoadBalancer, Ingress or similar, but you’d only need to permit traffic to it from your main EC2 instance, which would no doubt be preferable to exposing it to the public internet.
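
For reference, the trusted cluster relationship is created by adding a trusted_cluster resource on the leaf (in-pod) auth server with tctl create. A rough sketch, using example addresses and a join token you’d generate on the main cluster first:

    # trusted_cluster.yaml -- created on the leaf (in-pod) auth server via `tctl create`
    kind: trusted_cluster
    version: v2
    metadata:
      # must match the cluster_name of the main/root cluster
      name: main.example.com
    spec:
      enabled: true
      # join token generated on the main cluster, e.g. with `tctl tokens add --type=trusted_cluster`
      token: <trusted-cluster-join-token>
      # reverse tunnel and web ports of the main proxy
      tunnel_addr: main.example.com:3024
      web_proxy_addr: main.example.com:3080
      # example role mapping -- adjust to your own role names
      role_map:
        - remote: admin
          local: [admin]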

Ok, thanks for the clarification @gus. We would have 4 different k8s clusters in our infrastructure, and using your approach is fine. But now, imagine that I could securely share the kubeconfig with the EC2 proxy instance. Could the shared kubeconfig contain the configuration for all 4 clusters? How would Teleport handle that on the client side? Would we be able to choose which k8s context to connect to?

Maybe this could be a simpler solution…

Teleport would just use whatever the current-context is set to within the kubeconfig - it doesn’t support multiple clusters. You would need 4 separate proxies with 4 separate kubeconfig files in that case.
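
To illustrate: each of those proxies would have its own kubernetes section under proxy_service pointing at its own file, roughly like this (hypothetical path), and whatever current-context is set to inside that file is the cluster that proxy serves:

    # proxy_service fragment for one of the four proxies (sketch, hypothetical path)
    proxy_service:
      enabled: yes
      kubernetes:
        enabled: yes
        listen_addr: 0.0.0.0:3026
        # the current-context in this file decides which cluster this proxy talks to
        kubeconfig_file: /etc/teleport/kubeconfig-cluster1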

I would suggest that a Teleport auth/proxy combination running inside each Kubernetes cluster, with each one configured as a trusted cluster back to the main cluster, would definitely be a good plan. In fact, you could probably run just one central auth server and four proxies, as long as your security groups/firewall permitted the pod-based proxies to make connections back to the auth server.
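
If you went the one-central-auth-server route, each in-pod proxy would run with a config along these lines (hypothetical token and addresses). Note that no kubeconfig_file is needed when the proxy runs inside the cluster, because it can read the service account credentials mounted into the pod:

    # teleport.yaml for an in-pod proxy joining the central auth server (sketch)
    teleport:
      auth_token: <proxy-join-token>
      auth_servers:
        - auth.example.com:3025   # central auth server on the EC2 side
    auth_service:
      enabled: no
    ssh_service:
      enabled: no
    proxy_service:
      enabled: yes
      kubernetes:
        enabled: yes
        listen_addr: 0.0.0.0:3026
        # no kubeconfig_file here: the in-cluster service account is used instead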



Hello @gus

I’ve successfully added the Teleport cluster that I spun up as a trusted cluster of my main cluster. But I’ve hit a problem that I can’t solve alone. You said that "(…) all traffic goes to the Teleport Kubernetes proxy on the main cluster (port 3026) which then proxies traffic in and out of the Kubernetes cluster (…)".

Based on what you wrote, I need to configure the proxy of the main cluster, which lives on an EC2 instance, to be able to receive connections on TCP port 3026. But how can I do that without having to share the kubeconfig from the k8s cluster? If I just add the kubernetes section within the proxy section without a kubeconfig, the main proxy dies with this error message:

User Message: auth server assumed that it is
running in a kubernetes cluster, but /var/run/secrets/kubernetes.io/serviceaccount/ca.crt mounted in pods could not be read: open /var/run/secrets/kubernetes.io/serviceaccount/ca.crt: no such file or directory,
set kubeconfig_file if auth server is running outside of the cluster

Or, instead of the main cluster, did you mean the k8s cluster?

I have other questions about this scenario as well (regarding where k8s sessions are stored and where the k8s Teleport cluster’s database lives), but let’s focus on this problem first.

I’ve shared all of my configuration files with you by e-mail (with sensitive info redacted), along with a topology diagram, to clarify our thoughts.

@galindro I understand your issue; other customers have faced this same thing. I made a forum post here (Enabling Teleport to act as a Kubernetes proxy for trusted/leaf clusters) about using a “dummy” kubeconfig file on the main cluster to achieve the result you want.
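
Roughly, the idea is to point kubeconfig_file on the main (EC2) proxy at a syntactically valid kubeconfig filled with placeholder values; in this trusted-cluster setup the main proxy only forwards Kubernetes traffic over the reverse tunnel to the leaf, so the dummy endpoint is never actually dialed. A sketch (all values are placeholders):

    # dummy kubeconfig referenced by kubeconfig_file on the main proxy -- placeholders only
    apiVersion: v1
    kind: Config
    clusters:
      - name: dummy
        cluster:
          server: https://localhost:6443   # placeholder, not a real API server
    users:
      - name: dummy
        user:
          token: dummy
    contexts:
      - name: dummy
        context:
          cluster: dummy
          user: dummy
    current-context: dummy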

We are planning to resolve this in a future version - there is a Github issue tracking it here: https://github.com/gravitational/teleport/issues/3087

Hello @gus. I could not make it work, unfortunately. The Teleport cluster running within the k8s cluster successfully joined the main cluster, I could log in to it through tsh, and my kubeconfig was updated with the main cluster’s proxy endpoint. But when I try to reach the k8s cluster through kubectl, I get this error:

$ kubectl get pods
error: the server doesn't have a resource type "pods"

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:53:57Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Error from server (Forbidden): unknown

I think these errors could be related to the way I added the Teleport cluster running in the k8s cluster to the main Teleport cluster. Maybe, if you help me with the questions below, it will work.

  1. Do I need to make the Teleport cluster running within the k8s cluster persistent? Currently, I’m running it in stateless mode, which means that if the pod is killed, the service will try to add itself to the main cluster again.
    1.1. If I need to turn it into a stateful Teleport cluster, should I use the same DynamoDB tables as the main cluster, or would they need to be completely separate?

  2. When my Teleport cluster running in the k8s cluster joins the main cluster as a trusted cluster, it is named after its pod’s hostname, not by the cluster_name/nodename that I’ve configured in teleport.yaml. Where can I set this name? I tried to set it in the trusted cluster resource file under metadata.name, but AFAIK from the docs, that property must contain the main cluster’s name…

The proxy itself is stateless by definition but it will need to be joined up to an auth server which does have state. This would need to be running all the time with persistent storage, otherwise you would have to rejoin it back to the main cluster as a trusted cluster every time it restarted. I would recommend making the Teleport cluster a more permanent deployment in a separate namespace if necessary.

You should use separate DynamoDB tables for ease. It’s actually possible to use the same table (as each entry is keyed against the particular cluster it comes from with a UUID) but if you need to reset an auth server’s storage at any point it will be harder with one table.
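
For example, the storage section on the in-cluster auth server could get its own table (hypothetical table name):

    # teleport.yaml on the leaf (in-cluster) auth server -- separate table from the main cluster
    teleport:
      storage:
        type: dynamodb
        region: eu-west-1
        table_name: teleport-leaf-state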

It should be set according to what is configured for cluster_name in your teleport.yaml file on the auth server you’re using. If this isn’t happening then please post your configs for each part of each cluster and I’ll see whether I can replicate it.

It’s worth noting that if you change cluster_name at all you will need to delete the storage for that auth server and set it up again from scratch (which is one reason why it might be a good idea to use a separate DynamoDB table)
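
In other words, on the in-cluster auth server the name comes from this setting (example value shown):

    # teleport.yaml on the leaf (in-cluster) auth server
    auth_service:
      enabled: yes
      # the name the trusted cluster will appear under on the main cluster;
      # changing it later means wiping this auth server's storage and re-joining
      cluster_name: test-eks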

Ok @gus. I’ll first configure it to have persistent state in DynamoDB. After that, I’ll make it join the main cluster. If it still doesn’t get the correct name, I’ll share the configurations and logs with you.

@gus, I could make it work with DynamoDB, and the problem that I told you about regarding the cluster_name was my fault. It occurred because I changed the Teleport image’s default command to the one below:

      command:
        - "/bin/bash"
      args:
        - "-c"
        - "teleport start -c /etc/teleport/teleport.yaml --diag-addr=0.0.0.0:3000 --roles=proxy,auth -d & PID=$! && sleep 5 && tctl create /etc/teleport/trusted_cluster.yaml && wait $PID"

I know that it’s better to use a sidecar container for this. It was only for testing purposes…

Now I finally have the correct cluster name, and I’m facing the issue mentioned here: How to share kubernetes groups between trusted clusters. After logging in with tsh, if I try to execute kubectl on my k8s cluster, the main cluster auth is logging this:

INFO [RBAC]      Access to create user in namespace default denied to roles Proxy,default-implicit-role: no allow rule matched. services/role.go:1826

The Teleport cluster that is running within the k8s cluster is throwing some debug messages that I’m not sure are important…

DEBU [PROXY:AGE] Seeking: {Cluster:eu-west-1 Type:proxy Addr:{Addr:teleport-main-cluster.eu-west-1.mydomain:3024 AddrNetwork:tcp Path:}}. cluster:test-eks reversetunnel/agentpool.go:180
DEBU [PROXY:AGE] Adding agent(id=57,state=connecting) -> eu-west-1:teleport-main-cluster.eu-west-1.mydomain:3024. cluster:test-eks reversetunnel/agentpool.go:312
DEBU [PROXY:AGE] Changing state connecting -> connecting. id:57 target:teleport-main-cluster.eu-west-1.mydomain:3024 reversetunnel/agent.go:204
DEBU [HTTP:PROX] No valid environment variables found. proxy/proxy.go:217
DEBU [HTTP:PROX] No proxy set in environment, returning direct dialer. proxy/proxy.go:137
INFO [PROXY:AGE] Connected to 10.100.3.105:3024 id:57 target:teleport-main-cluster.eu-west-1.mydomain:3024 reversetunnel/agent.go:420
DEBU [PROXY:AGE] Agent connected to proxy: [7758fc50-f750-44bf-90f0-5c6483c4a7b7.eu-west-1 test-teleport-proxy.eu-west-1 test-teleport-proxy auth.eu-west-1.mydomain remote.kube.proxy.teleport.cluster.local ssh.eu-west-1.mydomain k8s.eu-west-1.mydomain]. id:57 target:teleport-main-cluster.eu-west-1.mydomain:3024 reversetunnel/agent.go:431
DEBU [PROXY:AGE] Changing state connecting -> connected. id:57 target:teleport-main-cluster.eu-west-1.mydomain:3024 reversetunnel/agent.go:213
DEBU [PROXY:AGE] Proxy already held by other agent: [7758fc50-f750-44bf-90f0-5c6483c4a7b7.eu-west-1 test-teleport-proxy.eu-west-1 test-teleport-proxy auth.eu-west-1.mydomain remote.kube.proxy.teleport.cluster.local ssh.eu-west-1.mydomain k8s.eu-west-1.mydomain], releasing. id:57 target:teleport-main-cluster.eu-west-1.mydomain:3024 reversetunnel/agent.go:462
DEBU [PROXY:AGE] Changing state connected -> disconnected. id:57 target:teleport-main-cluster.eu-west-1.mydomain:3024 reversetunnel/agent.go:213

Good to know you got it working 🙂

Those debug messages are pretty much log spam and can be ignored.

It looks like you might have multiple proxy servers deployed, or multiple entries in a config file that somehow point to the same proxy server. Do you have things behind load balancers or exposed via a Kubernetes service somehow?

If this issue persists, it’d be helpful if you could share your /etc/teleport.yaml files from both clusters and also your trusted_cluster YAML file.

Also, as you mentioned in the other thread, you can’t add role objects in the OSS version of Teleport so hopefully you can get the Enterprise trial going and figure that part out. It might be worth doing that before we try to address this issue.
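
For reference, once the Enterprise trial is running, a role that passes Kubernetes group membership through would look roughly like this (hypothetical role name and groups), and you’d then map your remote roles onto it via the role_map in the trusted_cluster resource:

    # example Enterprise role granting Kubernetes group membership (sketch)
    kind: role
    version: v3
    metadata:
      name: k8s-admin
    spec:
      allow:
        logins: [root]
        kubernetes_groups: ["system:masters"]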