Log forwarding in Gravity

Hello team,

I am new to Gravity and would like to understand how logging and log forwarding work.

I can see that a Gravity cluster can be configured to forward logs to a remote logging server, as described here:

https://gravitational.com/gravity/docs/config/#log-forwarders

Can someone help me understand:

  1. I can see there is a log-forwarder pod deployed on each node (controllers and workers).
  2. Are the logs from the pods on a given node aggregated and sent to that node’s log-forwarder pod?
  3. Does the log-forwarder pod then forward these logs somewhere else, or do they stay there?
  4. Assuming the logs stay on each node’s local log-forwarder pod, does that mean that when I log in to the Ops Center and view the logs, it is pulling this information from the log-forwarder pods on all the other nodes?
  5. If not, is it pulling logs from all the pods/containers directly? I don’t see much point in the log-forwarder pod if this is true, so I’m assuming this doesn’t happen.
  6. As for forwarding logs to a remote server, is the Ops Center aggregating the logs and sending them off? Or are all the log-forwarder pods sending them to the remote server themselves? This matters because the firewall rules will need to allow either one IP address or the IP addresses of all the nodes.

I hope my question makes sense.

Thanks!

Hello @flarierza!

We’re using Logrange for our logging facilities. Logs from all containers are collected by the “collector” pods, which run on every node as part of a daemon set, and are forwarded to the “aggregator”. So by default the logs never leave the cluster and accumulate on the node the aggregator is running on. When you view the logs in the dashboard, it retrieves them via an API from another service, which in turn retrieves them from the Logrange aggregator.

Forwarding is done by the “forwarder” pod, which is currently a deployment with 1 replica. It is configured to be scheduled on master nodes, so I would whitelist all your masters in case the one the forwarder is running on crashes and the pod gets rescheduled.
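
A minimal sketch of that whitelisting rule, assuming a hypothetical node inventory. The `Node` type, the role labels, and the 192.0.2.x addresses are made up for illustration and are not part of Gravity. The point is that the allow-list should contain every master's address, not only the address of the node the forwarder happens to run on today:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical inventory entry; not a Gravity data structure."""
    name: str
    ip: str
    role: str  # "master" or "worker"


def forwarder_source_ips(nodes):
    """Return the IPs of every node the forwarder pod could be scheduled on.

    The forwarder is a single-replica deployment pinned to masters, so any
    master may become the sender after a reschedule.
    """
    return sorted(node.ip for node in nodes if node.role == "master")


# Example inventory using documentation-range addresses (assumed values).
example_nodes = [
    Node("master-1", "192.0.2.10", "master"),
    Node("master-2", "192.0.2.11", "master"),
    Node("worker-1", "192.0.2.20", "worker"),
]

allow_list = forwarder_source_ips(example_nodes)
```

With this inventory the allow-list holds the two master addresses and omits the worker.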

Hope this helps!

Thanks,
Roman

Hello @r0mant !

Thanks for your reply! That definitely helps and explains the overall solution of how logging works. However, I do want to clarify some details:

  • Can you double-check that the forwarder and collector names are correct, and not the other way around? In my configuration there is a forwarder pod on every node, but the collector pod is on the master only. So it sounds like what you explained, except the names are flipped. Additionally, after I configure the remote logging server with gravity resource create <log-forwarder.yaml>, I can see the configuration appear in my collector pod. So for some reason the names are flipped in my configuration. But that’s not important.

  • Using your terminology, the collector pod will collect local pod logs and forward them to the Logrange aggregator. Can you elaborate on what the aggregator is? Is it a service running in the planet container on the master node? So if I had 2 worker nodes and 1 master node, the 2 collector pods on worker1 and worker2 would forward logs to the Logrange aggregator service on the master node?

  • For the forwarding to a remote logging server, you say the forwarder pod running on the master sends the logs. But where does the forwarder pod get them? Does the forwarder pod pull the logs from the Logrange aggregator service and then send them to the remote logging server?

Thanks for your help!

We’ve actually migrated to a Logrange-based logging stack only recently (starting from 6.0 IIRC). Before that we were using an rsyslog-based logging app, so that might be the reason for the confusion. Which Gravity version are you currently on?

Hey,

Gravity version is 5.5.12.
So I guess it’s on the old system.

I see 1 log-collector pod on the master only and 1 log-forwarder pod on each node.

Hi @flarierza! Sorry this keeps slipping through the cracks.

Yeah, Logrange support was added in 6.x, so the version you’re using still has the rsyslog-based collector/forwarder we used before. In this case, you’re correct that the terminology is sort of reversed: the “collector” is the one receiving logs from the “forwarders” running on all nodes.
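
For the firewall question, one way to avoid guessing which pod actually sends to the remote server is to observe it. The following is a minimal sketch, not part of Gravity: it assumes the log forwarder is configured with the UDP protocol and pointed at a test machine, and the port 5514 and the function names are arbitrary choices made for illustration. It records the source address of each incoming datagram:

```python
import socket


def make_listener(host="0.0.0.0", port=5514):
    """Bind a UDP socket; 5514 is an arbitrary unprivileged port (assumption)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one(sock, bufsize=65535, timeout=None):
    """Wait for one datagram and return (source_ip, decoded_text)."""
    sock.settimeout(timeout)
    data, (source_ip, _source_port) = sock.recvfrom(bufsize)
    return source_ip, data.decode("utf-8", errors="replace")


def collect_senders(sock, count, timeout=5.0):
    """Read `count` datagrams and return the set of distinct source IPs."""
    senders = set()
    for _ in range(count):
        source_ip, _text = receive_one(sock, timeout=timeout)
        senders.add(source_ip)
    return senders
```

Calling `collect_senders(make_listener(), 100)` on the test machine would return the distinct addresses that sent the first 100 messages. If NAT sits between the cluster and the listener, the addresses seen will be the translated ones, which are also the ones the firewall rule would need to match.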

Let me know if you have other questions.

Thanks,
Roman