Visualization of accesslogs through MySQL in Kibana/Elasticsearch

Long title for a complex topic. Target of this post is to show only a cut-through for an interesting scenario. Imagine an environment with several Tomcat instances. Those Tomcats are logging currently their catalina- and access-logs to file from where they are collected and shipped with a shell script to a central logserver, where they are consumed, aggregated and analyzed by another shell script. File based logging is not that efficient when the environment grows and more and more logfiles have to be processed by never updated shell scripts. Therefor many projects and companies switch to databased storing (e.g. MySQL) of log information in order to enable extensive ways of accessing the information for further processing. In this example we go a step further and aim to visualize the information. Kibana is a quite nice technology that provides easy ways to present manifold structured portions of data. Kibana works best together with Elasticsearch. And here comes the center of this scenario: How do we get the valuable log information from a relational database like MySQL to index based Elasticsearch cluster?

This is the setup:
overview

The solution consists of five steps:

  1. Installing MySQL with databases, tables, users, privileges
  2. Installing Tomcat with changes for MySQL-logging
  3. Installing Elasticsearch with standard configuration
  4. Installing ES JDBC bridge
  5. Installing Kibana for visualization

Versions

component version download
Java Hotspot 1.8.0_72 http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
MySQL 5.5 https://www.mysql.de/downloads/
Apache Tomcat 8.0.30 https://tomcat.apache.org/download-80.cgi
Elasticsearch 2.1.1 https://www.elastic.co/downloads/elasticsearch
Elasticsearch JDBC 2.1.1.2 https://github.com/jprante/elasticsearch-jdbc
Kibana 4.3.1 https://www.elastic.co/downloads/kibana

MySQL

First of all, the required setup for MySQL and all the relational storage stuff:

$ mysql -u root -p
mysql> create database tomcatlogs;
mysql> use tomcatlogs;
mysql> source create_accesslogs_table.sql;
mysql> source create_catalinalogs_table.sql;
mysql> show tables;
+----------------------+
| Tables_in_tomcatlogs |
+----------------------+
| accesslogs           |
| catalinalogs         |
+----------------------+
mysql> create user 'tomcat'@'localhost' identified by 'tomcat123';
mysql> grant all privileges on accesslogs to 'tomcat'@'localhost' with grant option;
mysql> grant all privileges on catalinalogs to 'tomcat'@'localhost' with grant option;

The two sql files have following content:

CREATE TABLE accesslogs (
	id INT UNSIGNED AUTO_INCREMENT NOT NULL,
	remoteHost CHAR(15) NOT NULL,
	userName CHAR(15),
	timestamp TIMESTAMP NOT NULL,
	virtualHost VARCHAR(64) NOT NULL,
	method VARCHAR(8) NOT NULL,
	query VARCHAR(255) NOT NULL,
	status SMALLINT UNSIGNED NOT NULL,
	bytes INT UNSIGNED NOT NULL,
	referer VARCHAR(128),
	userAgent VARCHAR(128),
	PRIMARY KEY (id),
	INDEX (timestamp),
	INDEX (remoteHost),
	INDEX (virtualHost),
	INDEX (query),
	INDEX (userAgent)
);
CREATE TABLE catalinalogs (
	DATED varchar(50) DEFAULT NULL,
	PRIORITY varchar(200) NOT NULL,
	CATEGORY varchar(200) NOT NULL,	
	MESSAGE varchar(5000) NOT NULL
);

Tomcat

The next part contains all configuration steps for pushing the logs of Tomcat into the two just created MySQL tables.
Therefor download and extract a fresh Tomcat 8 and extract it to $TOMCAT_HOME. Then the mysql-jdbc-connector jar has to be put into $TOMCAT_HOME/lib.
For the access logs, Tomcat 8 already ships a Valve configuration (https://tomcat.apache.org/tomcat-8.0-doc/api/org/apache/catalina/valves/JDBCAccessLogValve.html) in server.xml that can be modified to a JDBC target:

...
        <Valve className="org.apache.catalina.valves.JDBCAccessLogValve"
          	driverName="com.mysql.jdbc.Driver"
          	connectionURL="jdbc:mysql://localhost:3306/tomcatlogs"
          	connectionName="tomcat"
          	connectionPassword="tomcat123"
          	pattern="combined" 
		resolveHosts="false"
          tableName="accesslogs"
        />
...

That’s all for the access logs. Now you can start Tomcat by executing $TOMCAT_HOME/bin/startup.sh and hit one of the pages in http://localhost:8080/ . When executing a Select statement on the accesslogs table you will see the incoming logs in the database instead of files.

mysql> select timestamp, status, userAgent from accesslogs;
| timestamp           | status  | userAgent                                                                                                |
| 2016-01-30 09:09:10 |    200 | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36 |
| 2016-01-30 09:09:13 |    304 | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36 |
| 2016-01-30 09:09:18 |    304 | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36 |
| 2016-01-30 09:09:21 |    200 | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36 |	

Let’s come to the catalina logs. I would like to take log4j and an improved log4j JDBC appender that is provided by a nice developer: http://www.mannhaupt.com/danko/projects/jdbcappender/doc/index.html
With this optimized JDBC appender you can escape upcoming apostrophs or special chars in logmessages. This is not possible with log4j’s JDBC appender.
For switching Tomcats internal logging from Java util logging to log4j, the Apache team provides a good documentation: https://tomcat.apache.org/tomcat-8.0-doc/logging.html#Using_Log4j

  1. download log4j-1.2.17.jar (Tomcat requires v1.2.x). and put it to $TOMCAT_HOME/lib/
  2. download tomcat-juli.jar and tomcat-juli-adapters.jar (are available as an „extras“ component for Tomcat) and put it to $TOMCAT_HOME/lib/ and replace them also in $TOMCAT_HOME/bin
  3. add jdbcappender.jar (the optimized one) to $TOMCAT_HOME/lib
  4. delete $TOMCAT_HOME/conf/logging.properties from $CATALINE_HOME/conf
  5. create $TOMCAT_HOME/lib/log4j.properties
#mysqlAppender1 --> log4j's original jdbc appender
#mysqlAppender2 --> improved jdbc appender

#log4j.rootLogger = INFO, mysqlAppender1
log4j.rootLogger = INFO, mysqlAppender2

# mysql access with log4j's original jdbc appender
# --> will crash as handling of quotes is not supported
#log4j.appender.mysqlAppender1=org.apache.log4j.jdbc.JDBCAppender
#log4j.appender.mysqlAppender1.driver=com.mysql.jdbc.Driver
#log4j.appender.mysqlAppender1.user=tomcat
#log4j.appender.mysqlAppender1.password=tomcat123
#log4j.appender.mysqlAppender1.URL=jdbc:mysql://localhost:3306/tomcatlogs
#log4j.appender.mysqlAppender1.sql=INSERT INTO catalinalogs VALUES('%d','%p','%c','%m') 
#log4j.appender.mysqlAppender1.layout=org.apache.log4j.PatternLayout

# mysql access with improved jdbc appender
log4j.appender.mysqlAppender2=org.apache.log4j.jdbcplus.JDBCAppender
log4j.appender.mysqlAppender2.dbclass=com.mysql.jdbc.Driver
log4j.appender.mysqlAppender2.username=tomcat
log4j.appender.mysqlAppender2.password=Test_123
log4j.appender.mysqlAppender2.url=jdbc:mysql://localhost:3306/tomcatlogs
log4j.appender.mysqlAppender2.sql=INSERT INTO catalinalogs VALUES('@TIMESTAMP@','@PRIO@', '@CAT@', '@MSG@')
log4j.appender.mysqlAppender2.layout=org.apache.log4j.PatternLayout
log4j.appender.mysqlAppender2.quoteReplace=true

#mapping logger to appender
#log4j.logger.org.apache.catalina.core.ContainerBase.[Catalina].[localhost]=mysqlAppender1
log4j.logger.org.apache.catalina.core.ContainerBase.[Catalina].[localhost]=mysqlAppender2

After restarting Tomcat, a Select on catalinalogs table will give a similar result:

mysql> select dated, priority, message from catalinalogs;
+-------------------------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dated                   | priority | message                                                                                                                                                                                                                                                                   |
+-------------------------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 2016-01-30 09:35:42.296 | INFO     | Server version:        Apache Tomcat/8.0.30                                                                                                                                                                                                                               |
| 2016-01-30 09:35:42.303 | INFO     | Server built:          Dec 1 2015 22:30:46 UTC                                                                                                                                                                                                                            |
| 2016-01-30 09:35:42.308 | INFO     | Server number:         8.0.30.0                                                                                                                                                                                                                                           |
| 2016-01-30 09:35:42.313 | INFO     | OS Name:               Linux                                                                                                                                                                                                                                              |
| 2016-01-30 09:35:42.318 | INFO     | OS Version:            3.19.0-47-generic                                                                                                                                                                                                                                  |
| 2016-01-30 09:35:42.324 | INFO     | Architecture:          amd64                                                                                                                                                                                                                                              |
| 2016-01-30 09:35:42.329 | INFO     | Java Home:             /home/christoph/jtools/jdk1.8.0_72/jre                                                                                                                                                                                                             |
| 2016-01-30 09:35:42.334 | INFO     | JVM Version:           1.8.0_72-b15                                                                                                                                                                                                                                       |
| 2016-01-30 09:35:42.349 | INFO     | JVM Vendor:            Oracle Corporation                                                                                                                                                                                                                                 |
| 2016-01-30 09:35:42.354 | INFO     | CATALINA_BASE:         /home/christoph/jtools/apache-tomcat-8.0.30                         
…
| 2016-01-30 09:35:44.085 | INFO     | Starting ProtocolHandler ["ajp-nio-8009"]                                                                                                                                                                                                                                 |
| 2016-01-30 09:35:44.091 | INFO     | Server startup in 1439 ms                                                                                                                                                                                                                                                 |
+-------------------------+----------
71 rows in set (0.00 sec)

So far, everything works from Tomcat to MySQL.

Elasticsearch

In order to have a better base for accessing large amounts of data and because it’s working great together, I choose Elasicsearch as source for Kibana’s visualization.
First, download and extract the Elasticsearch package to $ES_HOME.
For this scenario I let the default values for most of the configuration keys. In case of Elasticsearch I only modify the name of the cluster:

cluster.name: tomcatlogs-cluster

Next step is to install a frontend („Head“ from https://mobz.github.io/elasticsearch-head/) for simple access. Therefor Elasticsearch provides a nice plugin mechanism:

$ $ES_HOME/bin/plugin install mobz/elasticsearch-head

That’s all, now the ES instance can be started by running $ES_HOME/bin/elasticsearch
Then the Head can be called here: http://localhost:9200/_plugin/head. You will see no indexes yet.

ES JDBC

After setting up MySQL and Elasticsearch I will have a look at the bridge between them. The „JDBC importer for Elasticsearch“, once a plugin for ES now an indipendently running component, was designed to bridge the two worlds.
Download and extract it to $ESJDBC_HOME. You can delete all *.bat, *.sh, *.dump files and the travis directory. Basically this component is a (cron-triggered) runner that echoes data from a JDBC connection into the index of an ES instance.
Create following file and make it executable (chmod+x):

#!/bin/sh

bin=./../bin
lib=./../lib

echo '
{
    "type" : "jdbc",
    "jdbc" : {
        "schedule" : "0/10 0-59 0-23 ? * *",

        "url" : "jdbc:mysql://localhost:3306/tomcatlogs",
        "user" : "tomcat",
        "password" : "tomcat123",
	"sql" : [
            {
                "statement" : "select * from accesslogs where timestamp > ?",
                "parameter" : [ "$metrics.lastexecutionstart" ]
            }
        ],

	"elasticsearch.cluster" : "tomcatlogs-cluster",
	"elasticsearch.host" : "localhost",
	"elasticsearch.port" : "9300",

	"index" : "accesslogs",
        "type" : "accesslogs"
    }
}
' | java \
    -cp "${lib}/*" \
    -Dlog4j.configurationFile=${bin}/log4j2.xml \
    org.xbib.tools.Runner \
    org.xbib.tools.JDBCImporter

Actually this is the heart of this example where the mapping from relational database to elastic search cluster is described.
The execution of this script will poll every 10sec the mysql, looking at the timestamp and adding all the new stuff to Elasticsearch cluster with index accesslogs.
This will continously populate the ES cluster with the data that is inserted by Tomcat into mysql’s accesslogs table.
Before starting the script, the target index has to be created in ES via

curl -XPUT 'http://localhost:9200/accesslogs/'

After that the ES-Head at http://localhost:9200/_plugin/head/ will show the new created index called accesslogs.

Kibana

Finally we need a component that takes over the work for visualization. Kibana is in connection with ES a great solution for presenting large amounts of data in different and highly configurable styles.
Download and extract it to $KIBANA_HOME. We let everything on default, so it finds and connect to the ES instance when running $KIBANA_HOME/bin/kibana.
After short bootstrap, the frontend is reachable at http://localhost:5601/ and you will be requested to configure an own index pattern because by default Kibana is used with logstash:
kibana_1
After Kibana is connected to our index, we can start visualizing whatever we want. The first step is to choose „Visualize“ from the top menu and then choose a matching type, in this case a pie chart:
kibana_2
After that you have to choose if you want to reuse an existing search or (as in our case) a new one:
kibana_3
Then in the next step you choose which fields from the index shall be used. A diagramm is constructed on the fly. In this case pie chart with slices that represent the status field in the index:
kibana_4
This is really quite nice and a useful extra configuration is the time-triggered updating of the page in the right upper corner:
kibana_5 With that feature you can put a monitor somewhere in the middle of the office and everybody can see in realtime updated status of your application 🙂