Tuesday, January 31, 2017

Notes on setting up Elasticsearch, Kibana and Fluentd on Ubuntu

I've been experimenting with an EFK stack (with Fluentd replacing Logstash) and I hasten to write down some of my notes. I could have just as well used Logstash, but my goal is to also use the EFK stack for capturing logs out of Kubernetes clusters, and I wanted to become familiar with Fluentd, which is a Cloud Native Computing Foundation project.

1) Install Java 8

On Ubuntu 16.04:

# apt-get install openjdk-8-jre-headless

On Ubuntu 14.04:

# add-apt-repository -y ppa:webupd8team/java
# apt-get update
# apt-get -y install oracle-java8-installer

2) Download and install Elasticsearch (latest version is 5.1.2 currently)

# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.1.2.deb
# dpkg -i elasticsearch-5.1.2.deb


Edit /etc/default/elasticsearch/elasticsearch.yml and set

network.host: 0.0.0.0

# service elasticsearch restart

3) Download and install Kibana

# wget https://artifacts.elastic.co/downloads/kibana/kibana-5.1.2-amd64.deb
# dpkg -i kibana-5.1.2-amd64.deb


Edit /etc/kibana/kibana.yml and set

server.host: "local_ip_address"


# service kibana restart

4) Install Fluentd agent (td-agent)

On Ubuntu 16.04:

# curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent2.sh | sh

On Ubuntu 14.04:

# curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh


Install Fluentd elasticsearch plugin (note that td-agent comes with its own gem installer):

# td-agent-gem install fluent-plugin-elasticsearch

5) Configure Fluentd agent

To specify the Elasticsearch server to send the local logs to, use a match stanza in /etc/td-agent/td-agent.conf:

<match **>
  @type elasticsearch
  logstash_format true
  host IP_ADDRESS_OF_ELASTICSEARCH_SERVER
  port 9200
  index_name fluentd
  type_name fluentd.project.stage.web01
</match>

Note that Fluentd is backwards compatible with logstash, so if you set logstash_format true, Elasticsearch will create an index called logstash-*. Also, port 9200 needs to be open from the client to the Elasticsearch server.

I found it useful to set the type_name property to a name specific to the client running the Fluentd agent. For example, if you have several projects/tenants, each with multiple environments (dev, stage, prod) and each environment with multiple servers, you could use something like type_name fluentd.project.stage.web01. This label will then be parsed and shown in Kibana and will allow you to easily tell the source of a given log entry.

If you want Fluentd to parse Apache logs and send the log entries to Elasticsearch, use stanzas of this form in td-agent.conf:

<source>
  type tail
  format apache2
  path /var/log/apache2/mysite.com-access.log
  pos_file /var/log/td-agent/mysite.com-access.pos
  tag apache.access
</source>

<source>
  type tail
  format apache2
  path /var/log/apache2/mysite.com-ssl-access.log
  pos_file /var/log/td-agent/mysite.com-ssl-access.pos
  tag apache.ssl.access
</source>

For syslog logs, use:

<source>
  @type syslog
  port 5140
  bind 0.0.0.0
  tag system.local
</source>

Restart td-agent:

# service td-agent restart

Inspect the td-agent log file:

# tail -f /var/log/td-agent/td-agent.log

Some things I've had to do to fix errors emitted by td-agent:
  • change permissions on apache log directory and log files so they are readable by user td-agent
  • make sure port 9200 is open from the client to the Elasticsearch server

That's it in a nutshell. In the next installment, I'll show how to secure the communication between the Fluentd agent and the Elasticsearch server.

No comments: