
The ELK stack ingests data through two primary paths: Beats for lightweight collection and Logstash for complex processing. Understanding both enables flexible, reliable log ingestion.
Beats are lightweight data shippers designed for specific purposes:
Filebeat collects log files. Point it at log directories, and it forwards new entries to Elasticsearch or Logstash. Modules provide pre-built configurations for common sources—Apache, Nginx, MySQL, system logs.
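A minimal `filebeat.yml` sketch might look like the following — the log path and output host are placeholders for your own environment, and this assumes a reasonably recent Filebeat that uses the `filestream` input type:

```yaml
filebeat.inputs:
  - type: filestream        # successor to the older "log" input type
    id: nginx-access        # unique id for this input (illustrative name)
    paths:
      - /var/log/nginx/*.log  # placeholder path; point at your log directories

# Ship to Logstash for processing (alternatively, output.elasticsearch)
output.logstash:
  hosts: ["logstash.example.com:5044"]  # placeholder host
```

Forwarding to Logstash rather than directly to Elasticsearch keeps parsing logic centralized; with simple, pre-structured sources, shipping straight to Elasticsearch is also common.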
Winlogbeat collects Windows event logs. Configure which event channels to monitor—Security, System, Application, or custom channels. Native Windows integration means no additional logging configuration.
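A `winlogbeat.yml` fragment selecting channels might look like this (the output host is a placeholder):

```yaml
winlogbeat.event_logs:
  - name: Security
  - name: System
  - name: Application

output.elasticsearch:
  hosts: ["https://elastic.example.com:9200"]  # placeholder host
```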
Packetbeat captures network traffic. It understands protocols like HTTP, DNS, and MySQL, extracting meaningful fields rather than just raw packets.
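A `packetbeat.yml` fragment enabling protocol decoding might look like this (interface and ports are illustrative):

```yaml
packetbeat.interfaces.device: any   # sniff all interfaces (Linux)

packetbeat.protocols:
  - type: http
    ports: [80, 8080]
  - type: dns
    ports: [53]
  - type: mysql
    ports: [3306]
```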
Metricbeat collects system and service metrics. CPU usage, memory consumption, disk I/O—operational data that complements security logs.
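A `metricbeat.yml` fragment for those system metrics might look like this (the collection interval is an arbitrary choice):

```yaml
metricbeat.modules:
  - module: system
    metricsets: ["cpu", "memory", "diskio"]
    period: 10s   # illustrative collection interval
```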
Beats are lightweight enough to deploy everywhere. They buffer locally during network issues and resume forwarding when connectivity returns.
Logstash processes data through configurable pipelines with three stages:
Input brings data into the pipeline. Common inputs include beats (receiving events from Beats agents), file (reading local files), syslog (receiving syslog messages), and http (receiving webhooks).
Filter transforms data. The grok filter parses unstructured text into fields using patterns. mutate renames, removes, or modifies fields. date parses timestamps. geoip adds geographic information for IP addresses.
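A filter block combining these plugins might look like the following sketch — the grok pattern and field names are illustrative, not a prescribed layout:

```
filter {
  # parse an access-log-style line into fields (pattern is illustrative)
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{IP:client_ip} %{WORD:verb} %{URIPATH:request_path}" }
  }
  # use the parsed timestamp as the event's @timestamp
  date { match => [ "timestamp", "ISO8601" ] }
  # rename into ECS-style nested field naming
  mutate { rename => { "client_ip" => "[source][ip]" } }
  # enrich the event with geographic data for that IP
  geoip { source => "[source][ip]" }
}
```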
Output sends processed data to destinations. Elasticsearch is most common, but outputs exist for many systems—other SIEMs, cloud storage, alerting systems.
A typical security pipeline might look like:
```
input {
  beats { port => 5044 }
}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:host} %{DATA:program}: %{GREEDYDATA:message}" }
      overwrite => [ "message" ]   # replace the raw line instead of creating an array
    }
    # parse only syslog events, since "timestamp" exists only for them
    date { match => [ "timestamp", "MMM dd HH:mm:ss" ] }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logs-%{+YYYY.MM.dd}"
  }
}
```
Build pipelines incrementally. Start with input and output, then add filters one at a time, testing after each addition.
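One way to sanity-check a pipeline file after each addition, assuming a standard Logstash install (the config path is a placeholder):

```
# validate the configuration without starting the pipeline
bin/logstash -f /etc/logstash/conf.d/security.conf --config.test_and_exit

# during development, auto-reload the pipeline when the file changes
bin/logstash -f /etc/logstash/conf.d/security.conf --config.reload.automatic
```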