Getting started with logstash

October 24, 2012, posted under: logstash, grok, log management, iis, php error log, php

Logstash is a tool for doing fancy stuff with log files. Its main purpose is to help administrators/IT staff monitor logs; specifically, it can push logs to a central location where they are stored and indexed for later searching. A basic logstash setup (like the one I am implementing at work) consists of log shippers (web servers and database servers) and one server where the logs are stored, indexed and searched (in my case an Ubuntu server).

The web servers contain lots of logs for various websites, so it would be nice to have all of these stored in a central location. That would be ideal for checking up on how each server is operating, whether any 404/500s are happening, etc. The logstash agent runs on these servers and monitors any log files that you specify for changes. When a change is detected, logstash runs the new entry through any filter operations specified in the configuration file (more on this later) and then sends it out to the plugins specified in the output section of the config. Currently the servers are set to output to the screen via the command line and to push events to an instance of Redis, the key/value store. Elasticsearch runs on another server and is the final storage location for the logs.
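A minimal shipper configuration along those lines might look like this (the log path and Redis host here are placeholder assumptions, not my exact setup):

```
# shipper.conf - hypothetical logstash shipper configuration
input {
  file {
    path => "/var/log/iis/*.log"   # assumed log location
    type => "iis"
  }
}
output {
  stdout { }                       # echo events to the console
  redis {
    host => "10.0.0.1"             # assumed address of the Redis box
    data_type => "list"
    key => "logstash"
  }
}
```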

Once the log entry has arrived at Redis, another logstash agent grabs the entry out of the database, runs any specified processing, and then sends the processed log event off to an instance of Elasticsearch, where some more magic takes place to index and store the logs. Once the logs reach Elasticsearch they are accessible from a web interface where they may be searched.
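The indexer side is the mirror image of the shipper; a sketch (again, hosts are assumptions):

```
# indexer.conf - hypothetical logstash indexer configuration
input {
  redis {
    host => "127.0.0.1"      # assumed: Redis on the same box
    data_type => "list"
    key => "logstash"        # must match the key the shippers push to
  }
}
output {
  elasticsearch {
    host => "127.0.0.1"      # assumed: Elasticsearch on the same box
  }
}
```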

Here are a few hopefully useful settings for logstash:

IIS Grok Filter: 

%{DATESTAMP:eventtime} %{WORD:site} %{IPORHOST:hostip} %{WORD:method} %{URIPATH:request} (?:%{DATA:param}|-) %{NUMBER:port} (?:%{USER:username}|(%{WORD:domain}\\%{USER:username})|-) %{IPORHOST:clientip}(?: %{DATA:agent}|-) %{NUMBER:response} %{NUMBER:status} %{NUMBER:win32Status}

IIS Date Format String:

yyyy-MM-dd HH:mm:ss
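To put the grok pattern and date format string together, the filter section might look something like this. The field names follow the pattern above; the surrounding syntax is a sketch of the era's logstash config style, not my exact file, and the pattern is abbreviated:

```
filter {
  grok {
    type => "iis"
    # use the full IIS pattern shown above
    pattern => "%{DATESTAMP:eventtime} %{WORD:site} ..."
  }
  date {
    type => "iis"
    # parse the captured eventtime field using the IIS date format string
    match => [ "eventtime", "yyyy-MM-dd HH:mm:ss" ]
  }
}
```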

PHP Error Log Grok Filter:

\[%{DATA:eventtime} UTC\] (?:%{GREEDYDATA:error})

PHP Date Format String

yyyy-MM-dd HH:mm:ss


Getting a final setup of logstash running on a single server was not a very straightforward process; as someone with limited Linux experience, it certainly had its challenging moments.

If you write a startup script in which a process gets daemonised or takes over the foreground, you can actually prevent the machine from starting. I discovered this the hard way when I wrote an init script in /etc/init, only to find that my machine did not boot.

A handy little tip is to understand what the & character does when a command is run from a shell/script: it runs the command in the background. This is **not** something you want in your upstart script, as upstart will not keep track of a daemonised process. (This could lead to launching many instances of the program if the script is set to respawn.)
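For illustration, a minimal upstart job that keeps the process in the foreground might look like this (the job name and paths are assumptions, not my actual script):

```
# /etc/init/logstash-agent.conf - hypothetical upstart job
description "logstash agent"
start on runlevel [2345]
stop on runlevel [016]
respawn
# no trailing '&' here: the process stays in the foreground,
# so upstart can track (and respawn) it correctly
exec java -jar /opt/logstash/logstash.jar agent -f /etc/logstash/shipper.conf
```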

To see what was running/working on my machine, I found myself using the following commands whilst testing scripts and the server configuration:

Find processes containing the word “elastic”

ps aux | grep -i elastic

Find out what programs are listening on what ports

sudo netstat -napt | grep -i LISTEN


I had a couple of issues with Elasticsearch. After a while of collecting logs, the search functionality suddenly started breaking; after digging into the logs I found that Elasticsearch was failing because it could not open enough files, so I followed the steps outlined in this tutorial http://www.elasticsearch.org/tutorials/2011/04/06/too-many-open-files.html and the issue seems to have been resolved.
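The fix boils down to raising the open-file limit for the user running Elasticsearch (check the current limit with `ulimit -n`). A sketch, assuming the user is called elasticsearch and a limit of 32000:

```
# /etc/security/limits.conf
elasticsearch soft nofile 32000
elasticsearch hard nofile 32000
```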

However, after this all was not well: the cluster state was red and queries through Kibana were failing.

(Screenshot: ConEmu running the HTTPie tool.)

I presume this was due to the failure to open new/enough files for the day's index (logstash rotates the index each day), so to fix this I deleted the day's index, and the problem was resolved. (Thankfully we are just testing out logstash at the moment, so it did not matter that we lost some events.)
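For reference, deleting a day's index is a single HTTP request; assuming the default daily index naming and Elasticsearch on localhost, something like:

```
# delete the logstash index for 24 Oct 2012 (adjust the date to the broken index)
curl -XDELETE 'http://localhost:9200/logstash-2012.10.24'
```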

I will probably do another blog post soon on my graphite and statsd configuration. Let me know if you think I have missed anything!
