Metrics not available for pod

Hi,

I’ve rolled out a couple of test deployments using Gravity and the quickstart repository, but I’m having issues viewing some of the pod metrics. For example:

gravity@galaxy-node1:~$ kubectl top node
NAME            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
192.168.2.127   609m         30%    2086Mi          53%       

gravity@galaxy-node1:~$ kubectl top pod
W1024 19:17:59.193365   19818 top_pod.go:266] Metrics not available for pod default/mattermost-database-5fl5b, age: 2h33m1.193353896s
error: Metrics not available for pod default/mattermost-database-5fl5b, age: 2h33m1.193353896s

Is this expected behaviour? I had a look at the logs for the kube-state-metrics pod and fixed a couple of errors relating to ClusterRoles, but it hasn’t solved the problem.
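For reference, one way to check whether the resource-metrics API itself is responding (a sketch; it assumes kubectl is configured on the node, and each command is guarded with `|| true` so the sequence continues even if a check fails):

```shell
# A sketch of checking whether the resource-metrics API itself is responding.
# Assumes kubectl is configured on the node; "|| true" keeps the sequence going
# if a check fails, so nothing here aborts the shell.
api_status=$(kubectl get apiservice v1beta1.metrics.k8s.io 2>&1 || true)
echo "$api_status"                                             # should show Available=True when healthy
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes || true   # raw query against the metrics API
kubectl -n monitoring get pods || true                         # are the monitoring components running?
```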

Also, is 30% CPU usage normal for a quickstart cluster?
It seems a little high. I’m running 2 cores of an Intel Xeon L5640 @ 2.27GHz, overclocked to 3GHz.

Hi @ctrl-linux-delete,

That is not expected behavior - you should be able to run kubectl top pod.

Which version of Kubernetes are you running? Can you also provide the output from kubectl logs?

Regarding CPU, that may well be normal. There are various leader-election processes running, and certain controllers have more work to do at launch, which may be causing this.

Thanks

Okay no worries!
Just wanted to check as I noted the usage and load averages were higher than I expected.

NAME              STATUS   ROLES    AGE     VERSION   INTERNAL-IP       EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION      CONTAINER-RUNTIME
xxx.xxx.xxx.xxx   Ready    <none>   3h22m   v1.16.2   xxx.xxx.xxx.xxx   <none>        Debian GNU/Linux 9 (stretch)   4.15.0-66-generic   docker://18.9.5
Mon Oct 28 16:01:06 UTC	Connecting to agent
installer is running from a temporary directory.
It is recommended to run the installer from a non-volatile location to support operation resumption after a node is rebooted.
Mon Oct 28 16:01:07 UTC	Connected to agent
Mon Oct 28 16:01:07 UTC	Connecting to cluster
Mon Oct 28 16:01:07 UTC	Connected to installer at https://xxx.xxx.xx.xxx:61009
Mon Oct 28 16:01:08 UTC	Operation has been created
Mon Oct 28 16:03:15 UTC	All servers are up
Mon Oct 28 16:03:26 UTC	Configure packages for all nodes
Mon Oct 28 16:03:30 UTC	Bootstrap master node master1
Mon Oct 28 16:03:36 UTC	Pull packages on master node master1
Mon Oct 28 16:04:31 UTC	Install system software on master node master1
Mon Oct 28 16:04:32 UTC	Install system package teleport:3.2.7 on master node master1
Mon Oct 28 16:04:37 UTC	Install system package planet:6.2.2-11602 on master node master1
Mon Oct 28 16:05:00 UTC	Wait for Kubernetes to become available
Mon Oct 28 16:05:20 UTC	Bootstrap Kubernetes roles and PSPs
Mon Oct 28 16:05:24 UTC	Configure CoreDNS
Mon Oct 28 16:05:27 UTC	Create system Kubernetes resources
Mon Oct 28 16:05:29 UTC	Export applications layers to Docker registries
Mon Oct 28 16:05:30 UTC	Populate Docker registry on master node master1
Mon Oct 28 16:06:13 UTC	Wait for cluster to pass health checks
Mon Oct 28 16:06:16 UTC	Install system application dns-app:0.3.0
Mon Oct 28 16:06:33 UTC	Install system application logging-app:6.0.2
Mon Oct 28 16:06:42 UTC	Install system application monitoring-app:6.0.4
Mon Oct 28 16:07:10 UTC	Install system application tiller-app:6.0.0
Mon Oct 28 16:07:36 UTC	Install system application site:6.2.2
Mon Oct 28 16:09:28 UTC	Install system application kubernetes:6.2.2
Mon Oct 28 16:09:31 UTC	Install application mattermost:2.2.0
Mon Oct 28 16:09:52 UTC	Connect to installer
Mon Oct 28 16:10:02 UTC	Enable cluster leader elections
Mon Oct 28 16:10:13 UTC	Operation has completed

I’ve checked numerous logs and haven’t been able to find anything resembling an error message, only references to being unable to fetch pod metrics:

root@master1:~# kubectl -n monitoring logs replicaset.apps/prometheus-adapter-54c847cb58
I1028 16:07:29.500781       1 adapter.go:91] successfully using in-cluster auth
I1028 16:07:30.502906       1 serving.go:273] Generated self-signed cert (/var/run/serving-cert/apiserver.crt, /var/run/serving-cert/apiserver.key)
I1028 16:07:31.188633       1 serve.go:96] Serving securely on [::]:6443
E1028 16:11:02.338588       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/lr-aggregator-5f46596ffb-2x95q, skipping
E1028 16:11:02.338637       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/dns-app-install-327618-pmxdh, skipping
E1028 16:11:02.338645       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/site-app-post-install-fd103d-xwwvr, skipping
E1028 16:11:02.338652       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/logging-app-install-1a97e8-9dz6r, skipping
E1028 16:11:02.338659       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/monitoring-app-install-bb7a7c-4g6mp, skipping
E1028 16:11:02.338665       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/log-collector-6d56577848-vtg5c, skipping
E1028 16:11:02.338673       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/install-acd1c7-c4njz, skipping
E1028 16:11:02.338679       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/coredns-dz6z7, skipping
E1028 16:11:02.338687       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/gravity-install-f35812-dxblm, skipping
E1028 16:11:02.338693       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/gravity-site-q5hnw, skipping
E1028 16:11:02.338705       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/lr-forwarder-684b5f84dc-5jh4z, skipping
E1028 16:11:02.338711       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/tiller-app-bootstrap-9f563a-tjg4c, skipping
E1028 16:11:02.338718       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/lr-collector-l6fqt, skipping
E1028 16:11:02.338724       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/tiller-deploy-5d8bc64ffd-v9sjz, skipping
E1028 16:11:02.338733       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/lr-aggregator-5f46596ffb-2x95q: no metrics known for pod
E1028 16:11:02.338740       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/dns-app-install-327618-pmxdh: no metrics known for pod
E1028 16:11:02.338746       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/site-app-post-install-fd103d-xwwvr: no metrics known for pod
E1028 16:11:02.338751       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/logging-app-install-1a97e8-9dz6r: no metrics known for pod
E1028 16:11:02.338760       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/monitoring-app-install-bb7a7c-4g6mp: no metrics known for pod
E1028 16:11:02.338765       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/log-collector-6d56577848-vtg5c: no metrics known for pod
E1028 16:11:02.338770       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/install-acd1c7-c4njz: no metrics known for pod
E1028 16:11:02.338775       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/coredns-dz6z7: no metrics known for pod
E1028 16:11:02.338779       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/gravity-install-f35812-dxblm: no metrics known for pod
E1028 16:11:02.338784       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/gravity-site-q5hnw: no metrics known for pod
E1028 16:11:02.338788       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/lr-forwarder-684b5f84dc-5jh4z: no metrics known for pod
E1028 16:11:02.338793       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/tiller-app-bootstrap-9f563a-tjg4c: no metrics known for pod
E1028 16:11:02.338798       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/lr-collector-l6fqt: no metrics known for pod
E1028 16:11:02.338802       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/tiller-deploy-5d8bc64ffd-v9sjz: no metrics known for pod
E1028 16:15:19.902527       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/gravity-site-q5hnw, skipping
E1028 16:15:19.902606       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/gravity-site-q5hnw: no metrics known for pod
E1028 16:15:19.902629       1 reststorage.go:121] unable to fetch pod metrics for pod kube-system/gravity-site-q5hnw: no metrics known for pod "kube-system/gravity-site-q5hnw"
E1028 16:15:32.454690       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/gravity-site-q5hnw, skipping
E1028 16:15:32.454727       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/gravity-site-q5hnw: no metrics known for pod
E1028 16:15:32.454739       1 reststorage.go:121] unable to fetch pod metrics for pod kube-system/gravity-site-q5hnw: no metrics known for pod "kube-system/gravity-site-q5hnw"
E1028 16:16:03.246995       1 provider.go:186] unable to fetch CPU metrics for pod kube-system/tiller-deploy-5d8bc64ffd-v9sjz, skipping
E1028 16:16:03.247045       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/tiller-deploy-5d8bc64ffd-v9sjz: no metrics known for pod
E1028 16:16:03.247057       1 reststorage.go:121] unable to fetch pod metrics for pod kube-system/tiller-deploy-5d8bc64ffd-v9sjz: no metrics known for pod "kube-system/tiller-deploy-5d8bc64ffd-v9sjz"
E1028 16:49:41.689149       1 provider.go:186] unable to fetch CPU metrics for pod default/mattermost-database-kmxg9, skipping
E1028 16:49:41.689214       1 provider.go:186] unable to fetch CPU metrics for pod default/mattermost-worker-5f5c45c6f7-hk75f, skipping
E1028 16:49:41.689223       1 provider.go:186] unable to fetch CPU metrics for pod default/mattermost-worker-5f5c45c6f7-lt2kj, skipping
E1028 16:49:41.689242       1 reststorage.go:144] unable to fetch pod metrics for pod default/mattermost-database-kmxg9: no metrics known for pod
E1028 16:49:41.689252       1 reststorage.go:144] unable to fetch pod metrics for pod default/mattermost-worker-5f5c45c6f7-hk75f: no metrics known for pod
E1028 16:49:41.689262       1 reststorage.go:144] unable to fetch pod metrics for pod default/mattermost-worker-5f5c45c6f7-lt2kj: no metrics known for pod

I’ve been able to replicate the error easily on two different VMs using the quickstart repo. I’ve also got a bare cluster app, which doesn’t include the Mattermost app, and it has the same issue. There’s some talk on the web about TLS authentication issues that produce this error, but there’s nothing in the logs to suggest that, assuming I’m looking in the right place.
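For what it’s worth, a quick way to triage the adapter output is to pull out the unique pods it has no metrics for; a sketch over a few of the sample lines above (adapter.log here is just a saved copy of the log):

```shell
# Triage sketch: adapter.log holds a few sample lines from the prometheus-adapter
# output above; the pipeline lists each affected pod once.
cat > adapter.log <<'EOF'
E1028 16:49:41.689149       1 provider.go:186] unable to fetch CPU metrics for pod default/mattermost-database-kmxg9, skipping
E1028 16:49:41.689214       1 provider.go:186] unable to fetch CPU metrics for pod default/mattermost-worker-5f5c45c6f7-hk75f, skipping
E1028 16:49:41.689242       1 reststorage.go:144] unable to fetch pod metrics for pod default/mattermost-database-kmxg9: no metrics known for pod
EOF
grep -o 'for pod [^,:]*' adapter.log | sort -u   # two unique pods in this sample
```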

Let me know if you need me to check anything else or logs for particular pods.
Thanks

Hi @abdu,

Any ideas on this? Do you need any more info from me?

Hi @ctrl-linux-delete,

> Also, is 30% CPU usage normal for a quickstart cluster?

That shows the amount of memory used relative to the resource limits/requests on your Pods/Deployments/StatefulSets. It is NOT the real usage of your system memory. You can check real usage on the server host with commands such as top and free.
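For example, on a Linux host the same numbers top and free report can be read straight from /proc:

```shell
# Real host usage, read from /proc (Linux) - the same sources top and free use.
awk '/^MemTotal|^MemAvailable/ { printf "%s %d MiB\n", $1, $2 / 1024 }' /proc/meminfo
cat /proc/loadavg   # 1-, 5-, and 15-minute load averages
```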

I don’t have any 6.0+ clusters on hand to check what is wrong there. I am going to spin up a new cluster with Gravity 6.2.2 and get back to you with my findings.

@ctrl-linux-delete and @abdu, I’ve created an issue on GitHub to track the status of the problem.


Yes, I was using htop and iotop to monitor CPU load. On my idle cluster with 2 cores it was generating a load average of around 0.60, which seemed a little high, which is why I asked. I got the 30% figure by dividing the load average by the number of cores.
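The calculation, sketched with those numbers (the load and core count are hard-coded from this thread; on a live host they would come from /proc/loadavg and nproc):

```shell
# CPU% estimate = 1-minute load average / core count * 100.
# load and cores are hard-coded from the thread (0.60 load on 2 cores).
load=0.60
cores=2
awk -v l="$load" -v c="$cores" 'BEGIN { printf "%.0f%%\n", l / c * 100 }'   # prints 30%
```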

If it’s the expected amount of load, then I’m happy and there’s no issue.