Fluentd is an open-source data collector written primarily in Ruby, with C extensions for performance-critical paths, that unifies log collection, processing, and forwarding through a plugin-based pipeline. It was created at Treasure Data, donated to the Cloud Native Computing Foundation (CNCF) in 2016, and graduated in 2019.
Configuration is built from directives, chiefly <source>, <filter>, and <match>, that define where events come from, how they’re transformed, and where they go. Every event carries a tag, and tags drive routing: <filter> and <match> directives select events by tag pattern and are evaluated in the order they appear in the file, so the first <match> whose pattern fits consumes the event. The plugin ecosystem is huge: over a thousand input/output/filter plugins covering everything from tailing files and syslog to shipping into Elasticsearch, S3, Kafka, BigQuery, Splunk, and whatever obscure SIEM you have. Buffering is built in, either in memory or file-backed, with configurable retry, chunking, and backpressure semantics, which is what you actually want when your downstream sink gets slow at 3am.
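To make the directive model concrete, here is a minimal configuration sketch, not a production setup. It tails JSON log files, adds a hostname field, and forwards events through a file-backed buffer with retries. The paths, the app.access tag, and the aggregator.example.internal host are hypothetical placeholders, and the buffer values are illustrative rather than recommended; only plugins bundled with Fluentd (tail, record_transformer, forward) are used.

```conf
# Hypothetical paths, tags, and hostnames throughout; adjust for your setup.

# <source>: where events come from. Tail JSON log files and tag each event.
<source>
  @type tail
  path /var/log/app/*.log
  # pos_file records the read position so a restart does not re-read files.
  pos_file /var/log/fluentd/app.pos
  tag app.access
  <parse>
    @type json
  </parse>
</source>

# <filter>: how events are transformed. Applies to every tag under "app".
<filter app.**>
  @type record_transformer
  <record>
    hostname "#{Socket.gethostname}"
  </record>
</filter>

# <match>: where events go. Forward to another Fluentd instance.
<match app.**>
  @type forward
  <server>
    host aggregator.example.internal
    port 24224
  </server>
  <buffer>
    # File-backed buffer, so queued chunks survive a process restart.
    @type file
    path /var/log/fluentd/buffer/app
    # Chunking: cap each chunk and the total size of the queue.
    chunk_limit_size 8MB
    total_limit_size 1GB
    flush_interval 5s
    # Retry: back off exponentially, give up after retry_timeout.
    retry_type exponential_backoff
    retry_wait 1s
    retry_max_interval 60s
    retry_timeout 1h
    # Backpressure: what to do when the buffer is full.
    # One of throw_exception (default), block, drop_oldest_chunk.
    overflow_action block
  </buffer>
</match>
```

Swapping the forward output for a third-party plugin such as an Elasticsearch or S3 output would leave the <buffer> section essentially unchanged, since buffering is handled by Fluentd's output plugin framework rather than by each destination.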
In modern Kubernetes deployments, Fluentd is usually paired with or replaced by Fluent Bit: Fluent Bit runs on every node as the lightweight collector, and Fluentd (if still used) runs as a heavier aggregator tier that does complex parsing and routing before shipping to storage. The Ruby runtime is Fluentd’s main cost; Fluent Bit, which is written in C, is an order of magnitude lighter per instance. Even so, Fluentd’s plugin coverage and mature buffering still make it the right tool when you need those specific outputs or complex filter logic.
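The aggregator side of that split might look roughly like the sketch below. It assumes the node-level Fluent Bit collectors use their forward output to send events tagged kube.* to this instance on port 24224, and that container log lines arrive in a field named log; both are common conventions but are assumptions here, as is the /healthz/ pattern. The stdout output is a stand-in for whatever storage plugin you actually ship to.

```conf
# Aggregator tier: accept events from node-level collectors.
<source>
  @type forward
  bind 0.0.0.0
  port 24224
</source>

# Example of filter logic at the aggregator: drop health-check noise.
# Assumes the log line is stored under the "log" key (hypothetical).
<filter kube.**>
  @type grep
  <exclude>
    key log
    pattern /healthz/
  </exclude>
</filter>

# Stand-in destination; replace with the real storage output plugin.
<match kube.**>
  @type stdout
</match>
```

Keeping the heavy parsing and routing in this tier means the per-node collectors stay small, while the aggregators can be scaled separately from the nodes they serve.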