Removing Offline Node

I have a node that shows up as offline when I run gravity status, which is accurate because the instance no longer exists. The node does not show up when I run kubectl get nodes. However, running gravity remove (and gravity remove --force) for the node fails, and the node remains listed as part of my cluster.

Is there a way to see why the gravity remove (and gravity remove --force) commands for the node are failing? Or is there another way to remove an unrecoverable node?

Hi @aparcel,

What gravity version are you using?

So just to confirm: you ran gravity remove <node-ip> --force and the node still appears in gravity status?

When you ran gravity remove, did it launch a shrink operation? Are there any errors in the active gravity-site logs that you can share?

kubectl -nkube-system logs gravity-site-xxx

Hi Abdu,

I have the same issue. Is there a way to manually remove a failed node from the cluster?

Thanks,
Daniel

The gravity version is 4.51.0

Yes, when executing gravity remove <node-ip> --force, the node still appears in gravity status.

I do see in gravity status that the operation_shrink was launched and failed.

I am unable to pull the logs using the command you cited.

User "kubectl" cannot get pods/log in the namespace "kube-system".

Just to confirm, are you executing that command in the gravity shell?

  1. Run sudo gravity shell
  2. Run kubectl -nkube-system get pods to get the name of the gravity-site-XXXXX pod
  3. Run kubectl -nkube-system logs gravity-site-xxx
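The steps above could be combined into a couple of small helper functions, run from inside the gravity shell. This is only a sketch: the function names gravity_site_pods and dump_gravity_site_logs are made up for illustration, and it assumes the pods you are after all have names starting with gravity-site- in the kube-system namespace, as in the commands above.

```shell
# Meant to be run inside the Gravity shell (entered with: sudo gravity shell).
# Both function names below are hypothetical helpers, not part of gravity or kubectl.

# Print the bare names of all gravity-site pods in kube-system, one per line.
# `-o name` prints entries like pod/<name>; sed strips everything up to the slash.
gravity_site_pods() {
    kubectl -n kube-system get pods -o name \
        | sed 's|.*/||' \
        | grep '^gravity-site-'
}

# Print the logs of every gravity-site pod, with a header line per pod,
# so the output can be searched for errors from the failed shrink operation.
dump_gravity_site_logs() {
    for pod in $(gravity_site_pods); do
        echo "== logs for ${pod} =="
        kubectl -n kube-system logs "${pod}"
    done
}
```

After defining the functions, calling dump_gravity_site_logs prints the logs of each gravity-site pod in turn, which saves copying the pod name by hand when there is more than one.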

Also, have you explored upgrading your Gravity version? We recently announced Gravity 7.0.

Hi @dtufino,

Can you share your gravity version?

Have you already tried running gravity remove <node-ip> --force from a master node? More details from the doc on that here.

If the node is still showing in gravity status, can you share any error messages or logs you have captured?

Can you share the output from executing kubectl -nkube-system logs gravity-site-xxx in the gravity shell?