Gravity as Bare k8s Cluster

Hi,

I was wondering if it makes sense to use Gravity to simply deploy an empty k8s cluster to begin working with? I’m not really after deploying “apps” or creating images that can be deployed, more an easy-to-start k8s cluster that comes with a dashboard, metrics, etc. This would save me having to configure a lot of additional services.

How does upgrading work? I’m assuming if I go ahead and manually deploy pods and services after the cluster is initialised that the upgrade is non-destructive and won’t re-roll the cluster as a new deployment?

As Gravity depends on the use of etcd, I was wondering if it can play nicely with lower-powered nodes, i.e. 2 vCPU, 8 GB RAM, and a shared SSD? I’m currently running it on a test cluster and it does seem quite heavy on a low-use cluster.

Finally, I assume it should be possible to add manifests to the cluster build to make deployment a bit easier, e.g. traefik, postgresql-operator, etc.?

Thanks! And sorry for all the questions

I was wondering if it makes sense to use Gravity to simply deploy an empty k8s cluster to begin working with?

Absolutely. While a lot of the features are geared towards packaging applications, nothing stops you from using an empty cluster. We also publish an empty gravity cluster image, so you don’t need to touch the build tooling at all. With tele OSS you should be able to run tele ls / tele ls --all to see what’s available, and tele pull gravity:<version> to download an installer that’s effectively an empty app.
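
Put together, that looks roughly like this (a sketch; the versions listed will depend on the tele release you’re running):

    # list the published cluster images and their versions
    tele ls --all
    # download the empty "gravity" base image as a self-contained installer tarball
    tele pull gravity:<version>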

How does upgrading work?

We provide all the automation around upgrading gravity as a platform, and expose application hooks for upgrading the deployed software. In an empty cluster the hooks can be left empty, and gravity will only upgrade the components that make up gravity itself: https://gravitational.com/gravity/docs/cluster/#updating-a-cluster
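
From memory (the linked docs are authoritative and the exact steps can differ by version; the tarball name here is just illustrative), the platform upgrade amounts to bringing the newer installer tarball onto an existing node and letting gravity roll it out:

    # sketch only: unpack the newer cluster image tarball on an existing node
    mkdir new-image && tar xf gravity-new-version.tar -C new-image && cd new-image
    ./upload            # push the new image and packages into the cluster
    ./gravity upgrade   # start the automatic rolling upgrade of the platform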

I’m assuming if I go ahead and manually deploy pods and services after the cluster is initialised that the upgrade is non-destructive and won’t re-roll the cluster as a new deployment?

Correct. The gravity cluster upgrade is an online rolling upgrade, so in theory the application should remain available throughout. The exception is when upgrading etcd (which only happens if required): our current method of upgrading etcd will see etcd and the Kubernetes API go offline temporarily, but all pods and services should remain online. Some customers, for various reasons, do choose to scale down all deployments as part of their upgrade process, but that’s entirely up to the customer.
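
If you did want to follow that scale-down pattern, it’s just plain kubectl (illustrative only; the namespace and replica counts are yours to choose):

    # before the upgrade: scale every deployment in the namespace to zero
    kubectl scale deployment --all --replicas=0 -n my-apps
    # after the upgrade: scale back up to whatever replica counts you actually run
    kubectl scale deployment --all --replicas=2 -n my-apps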

As Gravity depends on the use of etcd, I was wondering if it can play nicely with lower-powered nodes, i.e. 2 vCPU, 8 GB RAM, and a shared SSD?

This often doesn’t work well: etcd tends to struggle when the disk isn’t dedicated or there aren’t enough IOPS available. That doesn’t mean it absolutely won’t work, but it can be a struggle, and it isn’t something we really support. Our recommendations around etcd IO are available here: https://gravitational.com/gravity/docs/requirements/#etcd-disk

Recent releases of gravity will test the etcd disk IO at install time. When sharing a disk, though, other services hitting the same disk can introduce IO latency within etcd even if that installation test passes.
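
If you want to measure a disk yourself, the usual approach in the etcd community is an fio write test with fdatasync enabled, since that mirrors how etcd commits its write-ahead log; what matters is that the fdatasync latencies stay within the thresholds in the requirements page linked above (the size and path below are arbitrary):

    # create a test directory on the disk you want to measure first, then
    # benchmark small synced writes the way etcd issues them to its WAL
    fio --rw=write --ioengine=sync --fdatasync=1 \
        --directory=/var/lib/etcd-io-test --size=22m --bs=2300 --name=etcd-io-check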

Finally, I assume it should be possible to add manifests to the cluster build to make deployment a bit easier, e.g. traefik, postgresql-operator, etc.?

Yep. Even though it’s not a complete app in that sense, it is common to use the gravity concept of a cluster image as a base set of services to be available within the cluster. These services are installed at cluster install time, but only provide the base infrastructure for the cluster. In essence, your app (what we call a cluster image) is just gravity + base services, with anything else deployed later.
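
As a rough sketch (the file and directory names here are purely illustrative), building such a base image with tele is just a manifest plus the Kubernetes resources you want present from day one, which an install hook then applies:

    # hypothetical layout for a "base services" cluster image
    #   my-image/app.yaml                         cluster image manifest (with an install hook)
    #   my-image/resources/traefik.yaml           manifests applied by the install hook
    #   my-image/resources/postgres-operator.yaml
    tele build my-image/app.yaml -o my-image.tar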

Thanks for such a great response, it certainly helped to clear up a few things. The etcd side is a bit awkward as my host does not allow additional block storage; however, the IOPS are reasonable (67k read and 12.1k write), so I might try and boost the IO priority of the etcd processes and see how it goes.

Out of interest, can the etcd cluster be hosted externally from the k8s nodes? I couldn’t see anything obvious in the docs. I’ll definitely have a play around and see how I get on with some deployments.

Thanks again!

I might try and boost the IO priority of the etcd processes and see how it goes.

Please let us know how it goes. I spent a bunch of time on this last year, specifically when testing on cloud environments. The problem I ran into, IIRC, is that the Linux IO scheduler was built for spinning rust and is still being redesigned / re-implemented for SSD/NVMe storage speeds; as a result, the IO scheduler was completely disabled on the cloud environments I tested, so I was never able to roll anything out. But if you do find something that works, I’d definitely be interested to see what it looks like.
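
One thing worth checking before reaching for ionice: it only has an effect when the block device uses a priority-aware scheduler (CFQ, or BFQ on newer kernels), which ties in with the scheduler being disabled on many cloud images. A sketch, assuming the device is sda and etcd is visible on the host as a process named etcd:

    # "none" or "noop" here means IO priorities are ignored and ionice won't help
    cat /sys/block/sda/queue/scheduler
    # if a priority-aware scheduler (e.g. bfq) is active, give etcd top best-effort priority
    ionice -c2 -n0 -p $(pidof etcd)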

Out of interest, can the etcd cluster be hosted externally from the k8s nodes?

Unfortunately this isn’t something we’ve built support for, as we don’t have any customers running in this model. We’ve floated some features / ideas related to this, such as colocating the control plane for multiple clusters onto a single kubernetes cluster, but so far we haven’t seen much interest.