In some recent Gravity releases, a cluster may enter the “degraded” state after a node is removed, reporting the following error that mentions the removed node:
overlay packet loss for node <node-ip> is higher than the allowed threshold of 20.00%: 100.00%
This happens if the overlay checker detects a networking issue while the node is being removed. The warning then persists even after the node has been removed.
The following GitHub issue describes the problem in more detail: https://github.com/gravitational/gravity/issues/1403
The following versions may experience this issue:
Recreating the “nethealth” pods in the monitoring namespace clears the bogus warnings:
kubectl -nmonitoring delete pods -lk8s-app=nethealth
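The workaround above can be wrapped in a small script that also checks whether the stale warning is still being reported. This is a minimal sketch, not part of Gravity: the function names `has_overlay_warning` and `recreate_nethealth` are hypothetical, and it assumes the warning text appears verbatim in whatever status output is piped in (for example, from `gravity status`).

```shell
#!/usr/bin/env bash
# Sketch only: helper names below are hypothetical, not Gravity commands.

# Succeeds (exit 0) if the text on stdin contains the overlay packet loss
# warning for the given node IP. Uses a fixed-string match so the dots in
# the IP address are not treated as regex wildcards.
has_overlay_warning() {
  local node_ip="$1"
  grep -qF "overlay packet loss for node ${node_ip} is higher than the allowed threshold"
}

# Deletes the nethealth pods so they are recreated, then lists the pods
# carrying the same label so the replacements can be inspected.
recreate_nethealth() {
  kubectl -n monitoring delete pods -l k8s-app=nethealth
  kubectl -n monitoring get pods -l k8s-app=nethealth
}
```

A possible usage, with 10.0.0.5 standing in for the IP of the removed node: run `recreate_nethealth`, wait for the new pods to report as running, then run `gravity status | has_overlay_warning 10.0.0.5 && echo "warning still present"`. The warning may take a short while to clear after the pods restart, so a single check immediately afterwards is not conclusive.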