Gravity Cluster with Worker Nodes in a Different VPC/Network

Is it possible to join worker nodes from a different network / VPC to a Gravity cluster?

For example:

  • We have a Gravity master created in AWS within a VPC.
  • We want to join a worker node running on an on-prem VM to our Gravity master. How do we do this?
  • We want to join a worker node from another VPC. How do we do this?

Yes, as long as the network connectivity between all nodes in the cluster matches what’s listed here https://gravitational.com/gravity/docs/requirements/#network you should be able to get set up.
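As a rough way to check connectivity before attempting a join, something like the sketch below can be run from the prospective worker node against a master. It only tests whether a TCP connection can be opened. The port list and the master address are illustrative placeholders, not the authoritative Gravity list, so substitute the ports from the requirements page linked above. UDP traffic (for example the overlay network) is not covered by this kind of check.

```python
import socket

# Illustrative placeholders only (assumption): take the real port list from
# the Gravity network requirements page linked above.
EXAMPLE_TCP_PORTS = [2379, 2380, 6443]


def check_ports(host, ports, timeout=3.0):
    """Return {port: bool} saying whether a TCP connection to host:port succeeded."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or no route.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


if __name__ == "__main__":
    master = "10.0.0.10"  # hypothetical master address, replace with your own
    for port, ok in check_ports(master, EXAMPLE_TCP_PORTS).items():
        print(f"{master}:{port} {'reachable' if ok else 'unreachable'}")
```

A port reported as unreachable may be blocked by a security group or firewall, or nothing may be listening on it yet, so treat the output as a starting point for debugging rather than a verdict.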

However, there are many considerations that come into play here that may make spreading out a cluster like this less than ideal.

  • If you’re running over the internet rather than over VPNs, traffic between pods on the overlay network can be injected, intercepted, manipulated, etc. We have a project to help with that: https://github.com/gravitational/wormhole
  • If your master servers are too far from each other latency-wise, you could see performance problems with the cluster. etcd needs to keep each node in sync, and to my knowledge performance falls drastically as latency is introduced. So masters should be in availability zones that are close to each other.
  • Services will be spread out and inherit a latency cost as well. A worker node joined on-prem will serve DNS to software running in the cloud, and vice versa. This latency can have an effect on the deployed applications, even if it’s just slow DNS. I believe there are some working groups working on topology awareness, but I’m not sure any of those features are available yet.
  • How Kubernetes reacts to a dead link may not always be intuitive, has changed over time, and may require some configuration. If your on-prem host goes offline due to a network issue, that doesn’t mean the software it’s running will necessarily be redeployed elsewhere. When a node becomes unreachable, Kubernetes doesn’t know whether the node crashed and went offline or whether it’s just a networking issue and the software is still running. Now, IIRC there are some changes and methods to get a specific behaviour here, but that’s a deeper discussion than can be covered here.
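To get a feel for the latency between sites mentioned above, a minimal sketch like the following times how long a TCP connection takes to open, which roughly approximates one network round trip. The function name, host, and port are hypothetical choices for illustration. It is not a substitute for measuring etcd or DNS performance directly.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, samples=5, timeout=2.0):
    """Return the median TCP connect time in milliseconds, or None if every attempt failed."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                # The connection is established at this point, so record the elapsed time.
                timings.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            # Failed attempts are skipped; only successful connects are timed.
            continue
    return statistics.median(timings) if timings else None


if __name__ == "__main__":
    # Hypothetical peer address and port, replace with a node in the other network.
    latency = tcp_connect_latency_ms("10.0.0.10", 6443)
    if latency is None:
        print("peer unreachable")
    else:
        print(f"median connect latency: {latency:.1f} ms")
```

Running this from an on-prem node against a cloud master, and again between two nodes in the same network, gives a simple side-by-side comparison of the extra delay the cross-site link adds.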