I’m installing Gravity (6.1+) into AWS, and I’m wondering how DNS resolution for the hostname works in Planet. In my configuration, I had edited the /etc/hosts file to include a line which helps resolve the hostname of the VM (i.e. ip-10-1-1-1.example.com) to an IP. It looks like kubelet is run with the flag --hostname-override=<hostname>.
Basically this hostname is not resolvable unless we put an entry into /etc/hosts. My question is: how will Planet know how to resolve this hostname, now that calls from tools like helm use this hostname instead of the IP? Manually adding the same line to Planet’s /etc/hosts file works, but how is this normally handled during installation?
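For reference, the line I added to the host’s /etc/hosts (using the example hostname from above; the IP is the instance’s private address encoded in the EC2-style name) looks like:

```
# resolve the EC2-style hostname to the instance's private IP
10.1.1.1    ip-10-1-1-1.example.com    ip-10-1-1-1
```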
The error I see is that my install hooks all fail when attempting to use Helm to install any charts located within the tarball because Planet cannot resolve this hostname.
I’m not sure I fully understand your problem statement, as it appears to mix several separate items.
Kubelet / --hostname-override
We override the hostname Kubernetes self-reports to the cluster in order to be consistent with our x509 implementation, and to avoid issues where hostname changes invalidate the signed x509 certificates used by kubelet. This means that in on-prem mode the node names will be the IP addresses and not the hostnames. When the AWS integrations are turned on, kubelet ignores the hostname override flag and will only use the hostname as per the AWS metadata API; this cannot be changed.
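As a sketch of the on-prem behavior described above (the value is illustrative; Planet manages the actual kubelet invocation and flags):

```
# on-prem mode: Planet passes the node's IP as the override,
# so the node registers to the cluster as 10.1.1.1 rather than
# as its hostname (illustrative, not the full command line)
kubelet --hostname-override=10.1.1.1 ...
```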
Helm relation to planet hosts file
Sorry, I’m not sure I understand / see the connection between helm and the Planet hosts file. From what I remember you’re invoking the helm provided by the Gravity cluster within Planet, but shouldn’t that reach tiller via the Kubernetes API? Sorry, I’m probably missing something, but I don’t know of a connection between helm and the hosts file.
Standard way to place additional names into planet name resolution
I believe the standard way to address this is to have the upstream DNS resolvers correctly resolve external names, so that all hosts within a cluster and all pods have the same access when requesting additional DNS records.
Unofficial way to insert additional DNS records
The installer has a hidden flag to insert additional DNS host records into the locally configured cluster DNS (`--dns-host`). As this is a hidden flag, it is likely not routinely tested or guaranteed to work.
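A hypothetical invocation might look like the following; note that the exact argument format of this hidden flag is an assumption on my part and may differ between Gravity versions, so check the installer’s help output before relying on it:

```
# hypothetical: map the hostname to its IP in cluster DNS at install time
# (the hostname/IP pair syntax is an assumption, not confirmed)
sudo ./gravity install --dns-host=ip-10-1-1-1.example.com/10.1.1.1 ...
```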
I must be conflating some ideas, my apologies. To be clear: in Gravity on AWS, when I issue commands such as helm list or helm install, things break, and I’m trying to figure out a solution. The hostname of the EC2 instance is something like ip-10-x-x-x.example.com, which itself isn’t resolvable. Because Helm needs to talk to the API server in order to reach tiller (my understanding), it breaks: the call goes to something like https://ip-10-x-x-x.example.com/api/foo/bar, which requires resolving a hostname that cannot be resolved. With non-Gravity K8s installations we currently get around this by editing the /etc/hosts file and adding the hostname and IP, thus allowing resolution of the EC2 instance name when needed.
Does this make more sense? Is there a DNS requirement for gravity in addition to what standard EC2 offers? Are the hostnames EC2 auto-creates based on the IP supposed to be resolvable? I’m not 100% familiar yet with how AWS and DNS work.
It does make more sense, although I’m still confused about why helm is resolving the name of the node to reach the API server.
The kubeconfig files we generate directly within Planet should point at the Kubernetes cluster IP by default (10.100.0.1 when using the default service subnet). Additionally, we use leader.telekube.local within our Kubernetes services.
Both the Kubernetes service IP and leader.telekube.local are automatically updated to point to the elected leader of the cluster by Planet’s leader election (when running multiple masters).
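For illustration, the relevant part of a Planet-generated kubeconfig would look roughly like this (standard kubeconfig layout; the server address matches the default service subnet mentioned above, other values are placeholders):

```yaml
apiVersion: v1
kind: Config
clusters:
- name: kubernetes
  cluster:
    # cluster IP of the kube-apiserver service, not a node hostname,
    # so no resolution of the EC2 instance name is needed
    server: https://10.100.0.1:443
```

Because the client talks to the cluster IP, the node’s own hostname should never appear in the request path.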
I don’t personally use helm that much, but from what I remember it should pick up the kubeconfig file to contact the API server, and then use the Kubernetes port-forward/tunneling feature to contact tiller.
This is the same way kubectl is always able to reach the API server, regardless of which master node is running the API.
Sorry, but I must be missing something; my confusion stems from the fact that I don’t know why the name of the host matters in this scenario at all, or why it’s involved in reaching the apiserver. Is there something you’re doing to tell helm that the API server should be the node name, and not the cluster IP?
I’ve noticed that the search domain in the kubelet config is not modified and is set to cluster.local. So if I update the zone name in the coredns configmap from cluster.local to something like mydomain.com, the search path in the pod will be wrong: if a pod needs to look up <svc_name>.<ns>.svc.mydomain.com it will not work, because the only search domains kubelet provides are the ones matching the <ns>.svc.cluster.local pattern.
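Concretely, with the default cluster.local zone, the /etc/resolv.conf that kubelet generates for a pod looks roughly like this (the nameserver address is illustrative; it is whatever cluster DNS address kubelet is configured with):

```
# search domains are derived from the cluster domain in the kubelet
# config, so a renamed coredns zone like mydomain.com gets no entry here
search <ns>.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.100.0.2
options ndots:5
```

This is why a lookup for <svc_name>.<ns>.svc.mydomain.com cannot be completed by short-name expansion when only the coredns configmap is changed.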
Leaving the zone as cluster.local in the coredns configmap makes this work, but I was wondering how you would update the kubelet config to match your node’s domain if needed?
> but I was wondering how you would update Kubelet config to match your nodes domain if needed?
This isn’t really a supported use case; I guess I’m not sure under what circumstances you would want to do this. Part of the idea with Gravity is building multiple identical clusters, which includes having the services within each cluster use an identical naming scheme when interacting within the cluster.
I would also argue that it’s not really correct for the DNS zones within the cluster to be the same DNS zones used by systems or services outside the cluster. The problem comes when the DNS service inside the cluster is authoritative for, say, example.com, while authoritative DNS servers outside the cluster also hold records for example.com: the two zones will not stay in sync and will interfere with each other. DNS outside the cluster won’t see records inside the cluster unless it is opened up and delegated to in some way, and DNS inside the cluster won’t be able to reach outside the cluster for any records under that domain.
So I’d be curious to understand the use case better, but at a surface look I don’t think I would recommend this.
If you are referring to the AWS ec2 instance private DNS names ip-x-x-x-x…compute.internal then it should be resolvable from any instance/node in that VPC.
I’m not sure what the default coredns configuration is, but it can be configured to forward non-k8s specific queries to an upstream DNS server. If you are running the queries within gravity/planet or within a pod, it should be able to resolve.
Below is a copy-paste from the K8s documentation. You can update the forward value.
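The pasted snippet appears to have been lost from this post; the standard CoreDNS ConfigMap example in the Kubernetes DNS docs looks roughly like the following (reconstructed from memory, so treat the plugin list as approximate). The `forward` line is the part to update:

```
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    # non-cluster queries go upstream; replace /etc/resolv.conf with
    # your own upstream server addresses if needed
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
```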
Additionally another copy paste from the k8s documentation.
> If a Pod’s dnsPolicy is set to “default”, it inherits the name resolution configuration from the node that the Pod runs on. The Pod’s DNS resolution should behave the same as the node. But see Known issues.
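As a sketch of that quoted behavior, a pod spec opting into the node’s resolver configuration looks like this (pod name and image are illustrative; note that the literal value is `Default` even though `ClusterFirst` is the actual default policy):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-example          # illustrative name
spec:
  dnsPolicy: Default         # inherit /etc/resolv.conf from the node
  containers:
  - name: main
    image: busybox
    command: ["sleep", "3600"]
```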
Thanks Kevin. This stems from an older installation practice where our clusters are unique instead of identical. I think it originates from using an alternative K8s installation mechanism and different thinking about how to best manage a production cluster. I asked this question here to see if there was anything we can do to make Gravity apples to apples with that setup practice.
Taking a deeper look into this, there isn’t really a reason why we shouldn’t be using cluster.local as the DNS zone. The only concern I could come up with is: if multiple clusters are forwarding logs to a collector, would the logs all look like they came from the same cluster (based on the DNS entry) or not?
I’ve left the coredns configmap with cluster.local, which makes sense. All pods are set to the “ClusterFirst” dnsPolicy, with resolvers for things outside the cluster’s domain. With this, pod short-name DNS resolution works fine.
Thanks @flarierza. I found this info yesterday and it came in very handy. I think for now we’ll just stick to using cluster.local for the pods and have resolvers for things in other domains. Fairly vanilla.