Yesterday, a very interesting article was published on the MISP blog by my friend Koen about a solution to monitor a MISP instance with Cacti. Monitoring your threat intelligence platform is always a good idea because many other tools depend on it. You can feed other tools with MISP data and, if MISP is not running, you will probably break your detection capabilities!
Cacti is a great tool that I used in the past but, for many reasons, I switched to another solution to monitor my infrastructure. My current setup runs on Centreon. Having used Nagios for years, I like Nagios-like solutions because you can quickly extend your monitoring tool by adding your own plugins written in any language. They just have to return the correct information to be processed by the engine. There are also many tools that ingest Nagios plugins as-is: Centreon, Icinga, Zabbix, etc.
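To make "the correct information" concrete, here is a minimal sketch of the usual Nagios plugin contract: one status line on stdout (optionally followed by performance data after a pipe) and an exit code of 0, 1, 2 or 3. The helper names, the "QUEUE" service label and the thresholds are made up for illustration; they are not taken from my scripts.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(value, warn, crit):
    """Map a measured value to a Nagios state (assumption: higher is worse)."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(service, state, message, perfdata=None):
    """Build the status line: 'SERVICE STATE - message | label=value ...'."""
    line = f"{service} {LABELS[state]} - {message}"
    if perfdata:
        line += " | " + " ".join(f"{k}={v}" for k, v in perfdata.items())
    return line


def main():
    """Hypothetical check: print the status line and return the exit code."""
    queue_length = 7  # hypothetical measurement
    state = classify(queue_length, warn=5, crit=10)
    print(format_output("QUEUE", state, f"{queue_length} jobs pending",
                        {"jobs": queue_length}))
    return state
```

A real plugin would finish with `sys.exit(main())` so that the engine receives the state as the process exit code.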
Koen’s blog article gave me the idea to do the same but… based on Nagios plugins. I wrote (well, I copy-pasted Koen’s code) two Python scripts to fetch useful data from a MISP instance:
# ./check_misp_workers.py -h
usage: check_misp_workers.py [-h] [-u MISPURL] [-k MISPKEY] [-w MISPWARN] [-c MISPCRIT]

Nagios compatible plugin to monitor MISP workers

optional arguments:
  -h, --help            show this help message and exit
  -u MISPURL, --url MISPURL
                        MISP URL
  -k MISPKEY, --key MISPKEY
                        MISP API Key
  -w MISPWARN, --warning MISPWARN
                        MISP Warning Threshold
  -c MISPCRIT, --critical MISPCRIT
                        MISP Critical Threshold

# ./check_misp_stats.py -h
usage: check_misp_stats.py [-h] [-u MISPURL] [-k MISPKEY] [-w MISPWARN] [-c MISPCRIT]

Nagios compatible plugin to monitor MISP events/attributes

optional arguments:
  -h, --help            show this help message and exit
  -u MISPURL, --url MISPURL
                        MISP URL
  -k MISPKEY, --key MISPKEY
                        MISP API Key
  -w MISPWARN, --warning MISPWARN
                        MISP Warning Threshold
  -c MISPCRIT, --critical MISPCRIT
                        MISP Critical Threshold
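For readers who want to write a similar plugin, the help text above maps onto a small argparse definition. This is a sketch reconstructed from the usage output only, not a copy of the real scripts: the `dest` names are inferred from the `MISPURL`, `MISPKEY`, `MISPWARN` and `MISPCRIT` placeholders, and parsing the thresholds as integers is my assumption.

```python
import argparse


def build_parser():
    """Argument parser mirroring the check_misp_workers.py help output."""
    parser = argparse.ArgumentParser(
        prog="check_misp_workers.py",
        description="Nagios compatible plugin to monitor MISP workers",
    )
    # dest names are inferred from the metavars shown in the help text.
    parser.add_argument("-u", "--url", dest="mispurl", help="MISP URL")
    parser.add_argument("-k", "--key", dest="mispkey", help="MISP API Key")
    # Assumption: thresholds are integers.
    parser.add_argument("-w", "--warning", dest="mispwarn", type=int,
                        help="MISP Warning Threshold")
    parser.add_argument("-c", "--critical", dest="mispcrit", type=int,
                        help="MISP Critical Threshold")
    return parser
```

Calling `build_parser().parse_args()` at the top of the plugin then gives `args.mispurl`, `args.mispkey`, `args.mispwarn` and `args.mispcrit`.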
The first plugin returns the status of the MISP workers and triggers an alert if one or more workers are dead (which is a classic issue with MISP ;-). The second one returns useful statistics about the MISP instance: events, attributes, users and organisations. It triggers an alert when there is a peak of attributes created (which can also lead to performance issues).
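The logic of the workers check can be sketched roughly as follows. This is not the code from my repository; it rests on several assumptions. I assume the data comes from the MISP `/servers/getWorkers` endpoint, that each queue entry in the response is a dictionary with a `workers` list whose items carry an `alive` flag, and that the warning and critical thresholds count dead workers. Check the response of your own MISP version before relying on it.

```python
import json
import urllib.request

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def fetch_workers(misp_url, misp_key):
    """Fetch worker status (assumed endpoint: /servers/getWorkers)."""
    req = urllib.request.Request(
        misp_url.rstrip("/") + "/servers/getWorkers",
        headers={"Authorization": misp_key, "Accept": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def count_dead_workers(workers_json):
    """Return (dead, total) across all queues in the assumed response shape."""
    dead = total = 0
    for info in workers_json.values():
        # Skip entries that are not queues (assumed: queues hold a "workers" list).
        if not isinstance(info, dict) or "workers" not in info:
            continue
        for worker in info["workers"]:
            total += 1
            if not worker.get("alive", False):
                dead += 1
    return dead, total


def check_workers(workers_json, warn=1, crit=2):
    """Return (exit_code, status_line); thresholds count dead workers."""
    dead, total = count_dead_workers(workers_json)
    if total == 0:
        return UNKNOWN, "MISP WORKERS UNKNOWN - no workers found in response"
    if dead >= crit:
        state, label = CRITICAL, "CRITICAL"
    elif dead >= warn:
        state, label = WARNING, "WARNING"
    else:
        state, label = OK, "OK"
    line = (f"MISP WORKERS {label} - {dead}/{total} workers dead"
            f" | dead={dead};{warn};{crit}")
    return state, line
```

A plugin built on this would call `check_workers(fetch_workers(url, key), warn, crit)`, print the line and exit with the returned code. The statistics check can follow the same pattern, comparing an attribute counter against the two thresholds.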
Here is what it looks like in my Centreon:
The scripts are available in my GitHub repository:
If you compare my monitoring with the one implemented in Cacti, there is less information (no CPU, memory, etc.) because my MISP instance runs in a Docker container and I monitor its health at the Docker engine level!