Log Awareness Trainings?

More and more companies organize “security awareness” trainings for their team members. With the growing threats people face while using their computers or any connected device, it is definitely a good idea. The goal of such trainings is to make people open their eyes and change their attitude towards security.

If the goal of an awareness training is to change people’s attitude, why not apply the same idea in other domains? Log files sound like a good example! Most log management solutions can be extended to collect and digest almost any type of log file. With their standard configuration, they are able to process logfiles generated by most products on the information security market, but they can also “learn” unknown logfile formats. Maaaagic!

A small reminder for those who are new to this domain. The primary goal of a log management solution is to collect, parse and store events in a common format to help with searching, alerting or reporting on events. The keyword here is “to parse”. Let’s take the following event generated by UFW on Ubuntu:

Apr 10 23:56:17 marge kernel: [8209773.464692] [UFW BLOCK] IN=eth0 OUT= \
  MAC=xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx SRC=11.12.13.14 DST=88.191.132.217 \
  LEN=60 TOS=0x00 PREC=0x00 TTL=42 ID=36063 DF PROTO=TCP SPT=32345 DPT=143 \
  WINDOW=14600 RES=0x00 SYN URGP=0

We can extract some useful “fields” like the source IP address and port, the destination IP address and port, a timestamp, interfaces, protocols, etc. Let’s come back to our unknown logfile format! The biggest issue is our total dependence on the way developers generate and store the events. If events are stored in a database or if fields are delimited by a common character, it’s quite easy: we just have to set up a mapping between the source and our standard fields:

# Assumed field order: source address, source port, destination address
if ($event =~ /([^,\s]+),([^,\s]+),([^,\s]+)/) {
  $source_address = $1;
  $source_port = $2;
  $dest_address = $3;
  # ...
}
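
Coming back to the UFW event shown earlier, here is a minimal sketch of what “parsing” means in practice. It is written in Python, and the names `HEADER`, `PAIR` and `parse_ufw` are made up for this illustration. It assumes the event arrives on a single line and only keeps the KEY=VALUE pairs (flags such as DF or SYN are ignored).

```python
import re

# The sample event from the text, joined onto a single line.
EVENT = (
    "Apr 10 23:56:17 marge kernel: [8209773.464692] [UFW BLOCK] IN=eth0 OUT= "
    "MAC=xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx:xx SRC=11.12.13.14 DST=88.191.132.217 "
    "LEN=60 TOS=0x00 PREC=0x00 TTL=42 ID=36063 DF PROTO=TCP SPT=32345 DPT=143 "
    "WINDOW=14600 RES=0x00 SYN URGP=0"
)

# Syslog prefix, kernel uptime and the UFW action, followed by the rest.
HEADER = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) kernel: "
    r"\[\s*[\d.]+\] \[UFW (?P<action>\w+)\] (?P<body>.*)$"
)

# KEY=VALUE pairs; the value may be empty (e.g. "OUT=").
PAIR = re.compile(r"(\w+)=(\S*)")


def parse_ufw(line):
    """Return a dict of fields, or None if the line is not a UFW event."""
    match = HEADER.match(line)
    if match is None:
        return None
    fields = {key: match.group(key) for key in ("timestamp", "host", "action")}
    fields.update(PAIR.findall(match.group("body")))
    return fields


if __name__ == "__main__":
    event = parse_ufw(EVENT)
    print(event["SRC"], event["SPT"], "->", event["DST"], event["DPT"])
```

Returning `None` for lines that do not match makes it easy to count what the parser misses instead of silently dropping events.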

Alas, most of the time, it’s more complicated and we have to switch to complex regular expressions to extract juicy fields. And the nightmare begins… I had to integrate events generated by “Exotic Product 3.2.1” because “Its events are interesting for our compliance requirements“. Challenge accepted!

  • Step one: Find documentation which describes the log format and gives examples. Of course, there isn’t any!
  • Step two: Grab some log samples. But how to be certain that all event types are covered? Critical events occur rarely and are difficult to generate in a production environment (who said “Do it in a test lab?” – which lab?)
  • Step three: Read the samples and try to understand the logic (or rather the mess) behind the events
  • Step four: Write unreadable regexes to match as many events as possible
  • Step five: Deploy, test and run without looking back!
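
Steps two to five can be made a bit less blind by measuring how many of the collected samples a regex really matches. The sketch below is only an illustration: the format of “Exotic Product” is not documented here, so the pattern and the sample lines are invented, and `coverage` is a hypothetical helper name.

```python
import re

# Invented pattern and samples, for illustration only.
PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+): (?P<message>.+)$"
)

SAMPLES = [
    "2020-01-01 INFO: service started",
    "2020-01-01 ERROR: cannot open file",
    "01/01/2020 error - disk full",  # another format: will not match
]


def coverage(pattern, lines):
    """Return (ratio of matched lines, list of unmatched lines)."""
    unmatched = [line for line in lines if not pattern.match(line)]
    ratio = (len(lines) - len(unmatched)) / len(lines) if lines else 0.0
    return ratio, unmatched


if __name__ == "__main__":
    ratio, unmatched = coverage(PATTERN, SAMPLES)
    print(f"matched {ratio:.0%} of the samples")
    for line in unmatched:
        print("UNMATCHED:", line)
```

Running the same check against fresh samples after each product upgrade is a cheap way to notice that the messages changed.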

Of course, there is a good chance that your regex will fail after an upgrade from 3.2.1 to 3.2.2 because developers decided to change some messages. This bad scenario is real! By telling this story, I would like to attract the attention of developers. Guys, could you please not only write logs but write good logs? In the example above, I faced the following issues:

  • Mix of tabs and spaces, lines ending with or without a dot
  • Mix of single-line and multi-line events
  • No standard format
  • Some colons followed by spaces, others not
  • No standard timestamps
  • No timezone information
  • Lack of documentation/support

The primary goal of logfiles is to help sysadmins, network admins or security teams during investigation or debugging phases. When something occurs, the first place people will look is the logs! I’m not a developer but I’m playing with logs almost every day. Here are some guidelines that seem important to me:

  • Don’t “re-invent the wheel”: Every operating system provides interfaces, tools or APIs to write logs. Think about Syslog on UNIX or the Event Log on Windows. Even from a shell script, you can interact with Syslog:
      $ logger -i -t MYSCRIPT "Test event"
      $ grep MYSCRIPT /var/log/syslog
      Apr 11 10:13:23 shiva MYSCRIPT[11929]: Test event

    Modern Syslog implementations can also transport events over TCP or TLS, split events across multiple files and store them in databases. By using the provided tools, you won’t have to take care of security (access to logfiles), file rotation or archiving yourself.

  • Write human-readable events: “Error: File not found” is better than “Error 0x02”. Error codes are still useful to match events when they are parsed, so write both:
      Apr 11 10:23:20 shiva MYSCRIPT[11934]: Cannot open file /tmp/foo (error: 2)

    Using a regex like “\(error: (\d+)\)” will help to extract the error code. This integer could be reused to classify events:

      # CRITICAL is assumed to be a constant defined elsewhere
      if ($event =~ /\(error: (\d+)\)/) {
        if ($1 >= 4) { $level = CRITICAL; }
      }
  • Generate relevant content: Your logfile must contain useful information. Don’t flood it with useless data.
  • Timestamps: Always include timestamps in your events! This will prevent your events from being saved with a wrong time/date if there is a lack of time synchronization between different systems.
  • Use unique IDs: All your log entries should begin with “MSG-xxx” where xxx is a unique integer. This tag will be easy to search for in the source code or in your documentation.
      Apr 11 10:25:21 shiva MYSCRIPT[1453]: MSG-432: Cannot read file /tmp/foo2 (error: 13)

    The “MSG-xxx” will be defined statically in your code.

  • Remain impartial: Don’t write personal comments in logfiles like “wtf”, “f*ck” or “stupid user”, etc. Easter eggs are funny but keep in mind that your logs could be sent outside your application perimeter and read by other people.

  • Don’t mix regular events and traces: Sometimes your application might require some traces to be generated for debugging purposes. Traces might include dumps, stack traces, process registers, SQL commands, file content. Send your traces to a separate destination.
  • Split technical and human-readable logs: If your events must be read by people, why not split them into two distinct locations? One where the events are written in “plain English” and a second one in CSV (which is easier to parse by a 3rd party tool).
  • Straight to the point: If the ultimate goal of your logs is to be stored in a log management or SIEM solution, why generate a logfile which will be processed by another piece of software? Generate native events (in CEF or CEE format) and send them to the final destination.
  • Write safe code: Finally, always keep security in mind when dumping user-provided content to a log file! (Think about all the injection types)
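
Several of the guidelines above (timestamps with a timezone, a static “MSG-xxx” ID, a human-readable text plus an error code, and safe handling of user-provided content) can be combined in a few lines. The sketch below uses Python’s standard `logging` module. The names `UtcFormatter`, `sanitize`, `build_logger` and `log_event` are hypothetical, and the events are written to an arbitrary stream so that the example runs anywhere; in a real deployment the stream handler could be swapped for `logging.handlers.SysLogHandler`.

```python
import io
import logging
from datetime import datetime, timezone

# Message IDs are defined statically, so they are easy to grep for.
MSG_CANNOT_READ = "MSG-432"


class UtcFormatter(logging.Formatter):
    """Formatter that writes ISO 8601 timestamps with an explicit UTC offset."""

    def formatTime(self, record, datefmt=None):
        stamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        return stamp.isoformat(timespec="seconds")


def sanitize(value):
    """Escape CR/LF so user-provided content cannot forge extra log lines."""
    return str(value).replace("\r", "\\r").replace("\n", "\\n")


def build_logger(stream, name="MYSCRIPT"):
    """Return a logger writing one event per line to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        UtcFormatter("%(asctime)s %(name)s[%(process)d]: %(message)s")
    )
    logger = logging.getLogger(name)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep events out of the root logger
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger


def log_event(logger, msg_id, text, error_code=None):
    """Write 'MSG-xxx: text (error: N)' with the text sanitized."""
    suffix = f" (error: {error_code})" if error_code is not None else ""
    logger.error("%s: %s%s", msg_id, sanitize(text), suffix)


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger(buffer)
    # The embedded newline simulates a log injection attempt.
    log_event(log, MSG_CANNOT_READ, "Cannot read file /tmp/foo2\nFAKE EVENT", 13)
    print(buffer.getvalue(), end="")
```

Escaping line breaks only covers one injection type. If the events end up in a web console or a database, the usual output encoding rules for those destinations still apply.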

Enough said. If you are interested, have a look at the OWASP “Logging Cheat Sheet”. This was my Friday tribune to all developers! Happy logging…
