I have been using Splunk for a couple of years and am pretty happy with it. At the BbWorld DevCon 2011 some people asked me about it, so I thought I’d write up how and why I use it.
What is Splunk?
Splunk is a log aggregation & searching tool. Okay, so it does way more than just that, but that’s what we use it for. It lets you monitor pretty much anything (a file, a directory, a port, a socket, WMI, whatever) and collect the output. Then it lets you search the output. It is better at searching log files than anything I have ever used. It knows the difference between an Apache log file and an IIS log file and a Tomcat log file, so when you search them it’s aware of the correct formatting to use.
And then there are Apps.
How much does it cost?
Splunk is licensed by how many MB of data you process per day. There’s a free, lifetime license for <500MB/day users; anything over that and you have to pay. Quite a lot. The free version does have one caveat: it doesn't support authentication. So you'll have to firewall your Splunk server after the 30-day trial expires & the free lifetime license kicks in.
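If you're not sure which side of the 500MB/day line you fall on, a rough estimate helps before you commit. Here is a minimal Python sketch, not anything Splunk ships: the function name and the 24-hour window are my own choices, and I'm assuming 1 MB = 1024 × 1024 bytes. It adds up the size of every file under a directory that was modified in the last day. That overestimates for long-lived logs that are only appended to, so treat the result as an upper bound rather than what Splunk would actually meter.

```python
import os
import time

# Assumption: the free tier's 500 MB/day expressed in bytes (1 MB = 1024 * 1024).
FREE_LICENSE_BYTES = 500 * 1024 * 1024


def bytes_modified_since(root, window_seconds=86400, now=None):
    """Sum the sizes of files under `root` modified within the window.

    This counts the whole file, not just the newly appended part, so it is
    only a rough upper bound on a day's worth of new log data.
    """
    now = time.time() if now is None else now
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if now - info.st_mtime <= window_seconds:
                total += info.st_size
    return total
```

For example, `bytes_modified_since("/usr/local/blackboard/logs") < FREE_LICENSE_BYTES` gives a quick yes/no for a single log directory.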
What does it run on?
Pretty much anything, it seems.
Installing Splunk on Linux
Because we’re monitoring files on our Blackboard server, Splunk needs to be installed in two places: on a central Splunk server, and on the Bb app server itself. (If you want, you could run it standalone on your Bb server, but why bother?) Simply download the correct version and install it via RPM. Accept the license agreement, of course!
The default username for a new application installation is admin with a password of changeme.
Configuring the Splunk server
By default, the Splunk web interface runs on port 8000; you may want to change that via these commands:
sudo /opt/splunk/bin/splunk set web-port 80
sudo /opt/splunk/bin/splunk restart
Also, by default, Splunk does not accept data sent by other Splunk servers (or clients). You’ll need to enable that via this command:
/opt/splunk/bin/splunk enable listen 42099 -auth admin:changeme
(FYI, the default listening port is 9997.)
If you want to accept syslog messages (generally a good thing), run this command:
/opt/splunk/bin/splunk add udp 514 -sourcetype syslog -auth admin:changeme
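To confirm that the UDP input is actually receiving data, you can send it a test message and then search for it. The Python sketch below is an illustration under my own assumptions: the function names, the `splunk-test` tag, and the default facility and severity are all made up for the example. It builds a BSD-syslog-style datagram, where the leading `<PRI>` value is facility × 8 + severity, and sends it over UDP.

```python
import socket


def syslog_packet(message, facility=1, severity=6, tag="splunk-test"):
    """Build a BSD-syslog-style datagram: <PRI>tag: message.

    PRI is facility * 8 + severity; facility 1 (user) with severity 6
    (informational) gives 14. The tag is a hypothetical marker to search for.
    """
    pri = facility * 8 + severity
    return f"<{pri}>{tag}: {message}".encode("utf-8")


def send_syslog(host, message, port=514):
    """Send one test message to a UDP syslog listener (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(syslog_packet(message), (host, port))
```

Calling `send_syslog("splunk.bowdoin.edu", "hello from blackboard")` and then searching Splunk for `splunk-test` should turn up the event. UDP gives no delivery confirmation, so a missing event could equally mean a firewall dropped the packet.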
Configuring your Splunk clients
This is a two-step process. You need to enable forwarding on your Splunk client, but you have an option to enable lightweight forwarding as well. The difference is, when you enable lightweight forwarding, it shuts down the Splunk GUI and removes some unnecessary services. This is a great thing to do on a Splunk “client” machine:
sudo /opt/splunk/bin/splunk enable app SplunkLightForwarder -auth admin:changeme
sudo /opt/splunk/bin/splunk add forward-server splunk.bowdoin.edu:42099
sudo /opt/splunk/bin/splunk restart
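If nothing shows up on the server after this, the first thing I'd rule out is a firewall between the client and the listening port. Here is a small Python sketch of that check; the helper name and the timeout are my own, and it only tests that a TCP connection can be opened, not that Splunk is the thing answering.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False
```

Run from the Bb app server, `can_connect("splunk.bowdoin.edu", 42099)` returning False points at the network (or at the listener not being enabled) rather than at the forwarder configuration.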
Splunk is all about inputs. You can get a great overview of what, and how, Splunk can index data here. On Linux, inputs.conf is located at /opt/splunk/etc/system/local/inputs.conf. The syntax is pretty simple; here is an example of what I monitor:
[default]
host = blackboard.bowdoin.edu

[monitor:///usr/local/blackboard/logs/bb-services-log.txt]

[monitor:///usr/local/blackboard/logs/bb-sqlerror-log.txt]

[monitor:///usr/local/blackboard/logs/tomcat/catalina-log.txt]

[monitor:///var/log/messages]
Note that you don’t have to monitor a series of files: you can monitor a whole directory, or files with wildcards, whatever. (On Windows, those paths would look like [monitor://C:\Logs\foo.log].)
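A typo in a monitor path is easy to make and easy to miss, so a quick sanity check can help before restarting. The Python sketch below is my own helper, not a Splunk tool. It assumes the file is simple enough for Python's `configparser` to read (plain `key = value` lines under `[stanza]` headers), pulls out the `[monitor://...]` stanzas, and reports any path that doesn't exist on disk. Paths containing wildcards (`*` or `...`) are skipped rather than expanded.

```python
import configparser
import os

MONITOR_PREFIX = "monitor://"


def monitored_paths(conf_path):
    """Return the paths named in [monitor://...] stanzas of an inputs.conf.

    Note: configparser silently returns nothing for a file it cannot find.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep setting names exactly as written
    parser.read(conf_path)
    return [
        section[len(MONITOR_PREFIX):]
        for section in parser.sections()
        if section.startswith(MONITOR_PREFIX)
    ]


def missing_paths(conf_path):
    """Return monitored paths that do not exist; wildcard paths are skipped."""
    return [
        path
        for path in monitored_paths(conf_path)
        if "*" not in path and "..." not in path and not os.path.exists(path)
    ]
```

Running `missing_paths("/opt/splunk/etc/system/local/inputs.conf")` on the client should return an empty list if every monitored file is where you said it was.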
Configuring the *NIX app
As mentioned above, Splunk is very extensible via its application collection. To collect the *NIX statistics (users, resource usage etc.) from a Splunk client, add this to /etc/hosts:
139.140.xxx.xxx splunk splunk.bowdoin.edu LOGHOST
On your Splunk client, add this to /etc/syslog.conf:
On your Splunk client, restart the syslog daemon:
sudo /sbin/service syslog restart
What do I do now?
Save a search. Create a dashboard. Make an alert. Go forth!