CuckooMX: Automating Email Attachments Scanning with Cuckoo

Today, classic anti-virus protections are not reliable enough to protect against modern malware. To understand it better and, if possible, block it, it’s best to execute the code in a safe environment and analyze its behaviour. Does it create new processes or files? Does it perform outbound connections to suspicious domains or IP addresses? Does it implement hooks? This method of performing malware analysis in a sandbox is more and more common. As usual, there are vendors providing nice (but often very expensive) solutions and free (open source) alternatives. The most popular is called Cuckoo. I won’t explain in detail what Cuckoo is and how it works. The project maintainer (Claudio Guarnieri) gave a great presentation during the last Hack in the Box in Amsterdam. His slides are available here. Of course, I’m a Cuckoo user! I use malwr.com but I also have my local Cuckoo instance running on my Macbook with my own guest images.

If the method looks sexy, the day-to-day usage of sandboxes remains a pain! You need to grab a copy of the malware, transfer it to the sandbox, execute it, wait (!) and interpret the results. We need more automation! Today, email remains a key attack vector for distributing malware, which is also spread via documents (PDF, Office, Flash), as explained in this Sophos blog post “The Rise of Document-based Malware“. Yes, I’m a lazy guy and I would like to have all documents passing through my MTA automatically analyzed by Cuckoo. There are commercial solutions which achieve this. I’m currently playing with some at my job but they are really expensive. Why not try to do the same with free software? That’s the purpose of this project called “CuckooMX“.

The principle is easy: every mail relayed by an MTA will first be sent to Cuckoo for analysis. If a suspicious file is detected, the mail will remain in quarantine until the results have been reviewed by a “security analyst” (read: a human). If considered “safe”, the mail will be re-injected into the flow to reach its final destination. The figure below gives a global overview of the solution:

[Figure: CuckooMX Architecture]

There are several operations to perform:

Capture the mail flow at MTA level
Extract MIME attachments
If interesting ones are found (like PDF files, Zip archives, executable files or Office documents), submit them to Cuckoo
If Cuckoo reports the data to be safe, forward them back to the MTA
Otherwise, more investigation must be performed in the quarantine
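The second and third steps can be sketched in a few lines. CuckooMX itself is a Perl script, so what follows is only an illustration in Python using the standard `email` package; the `INTERESTING_TYPES` set, the function names and the sample message are assumptions made for the example, not part of the project.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical list of MIME types worth sandboxing; adjust to your needs.
INTERESTING_TYPES = {
    "application/pdf",
    "application/zip",
    "application/msword",
    "application/x-msdownload",
}


def interesting_attachments(raw_message: bytes):
    """Return (filename, mime_type, payload) for each attachment worth scanning."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    found = []
    # iter_attachments() only walks the top-level parts that are not the body.
    for part in msg.iter_attachments():
        mime_type = part.get_content_type()
        if mime_type in INTERESTING_TYPES:
            payload = part.get_payload(decode=True)  # undo base64 encoding
            found.append((part.get_filename(), mime_type, payload))
    return found


def build_sample() -> bytes:
    """Build a small test mail with one PDF and one JPEG attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "xavier@example.com"
    msg["Subject"] = "Report"
    msg.set_content("See attached.")
    msg.add_attachment(b"%PDF-1.4 fake", maintype="application",
                       subtype="pdf", filename="report.pdf")
    msg.add_attachment(b"\xff\xd8 fake", maintype="image",
                       subtype="jpeg", filename="logo.jpg")
    return msg.as_bytes()


if __name__ == "__main__":
    for name, mime_type, payload in interesting_attachments(build_sample()):
        print(name, mime_type, len(payload))
```

Only the PDF survives the filter here; the JPEG is dropped. A real filter would also have to deal with nested multipart messages, which this sketch ignores.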

The following process must be implemented:

At this time, we are facing a big “issue”: the current version of Cuckoo (0.3.2) cannot easily be configured to flag a piece of code as malicious or safe. Another Cuckoo user wrote a patch to add YARA support to Cuckoo. It works well but a more interesting system will be implemented in the next Cuckoo release (0.4), which is expected soon. Signatures can then be implemented as Python classes to easily categorize the malware. Here is an example (copied from Claudio’s slides):

# Signature is the base class provided by Cuckoo
class CreatesExe(Signature):
    name = "creates_exe"
    description = "Creates a Windows executable on the filesystem"
    severity = 2

    def run(self, results):
        # Match as soon as one file touched by the sample ends in ".exe"
        for file_name in results["behavior"]["summary"]["files"]:
            if file_name.endswith(".exe"):
                self.data.append({"file_name": file_name})
                return True
        return False

Just a few words about YARA. The goal of this project is to categorize malware based on textual or binary patterns found in samples of each family. Here is an example of a YARA signature:

rule Worm_VBS_Uaper_B
{
strings:
  $a0 = { 466f72204f353d3120546f204f332e41646472657373456e74726965732e436f756e74 }
  $a1 = { 536574204f363d4f332e41646472657373456e7472696573284f3529 }
  $a2 = { 4966204f353d31205468656e }
  $a3 = { 4f342e4243433d4f362e41646472657373 }
  $a4 = { 456c7365 }
  $a5 = { 4f342e4243433d4f342e424343202620223b20222026204f362e41646472657373 }

condition:
  $a0 and $a1 and $a2 and $a3 and $a4 and $a5
}
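To see what such a rule really looks for, the hex strings can be decoded back to bytes. The sketch below uses plain Python (no YARA library) on three of the patterns copied from the rule above, and mimics the "all strings must be present" condition; the function names are made up for the example.

```python
# Three of the hex patterns from the rule above, copied verbatim.
PATTERNS = {
    "$a2": "4966204f353d31205468656e",
    "$a3": "4f342e4243433d4f362e41646472657373",
    "$a4": "456c7365",
}


def decode_patterns(patterns):
    """Turn hex-encoded YARA strings into the raw bytes they match."""
    return {name: bytes.fromhex(hex_string)
            for name, hex_string in patterns.items()}


def matches_all(data: bytes, patterns) -> bool:
    """Mimic a '$a and $b and ...' condition: every pattern must occur."""
    return all(raw in data for raw in decode_patterns(patterns).values())


if __name__ == "__main__":
    for name, raw in decode_patterns(PATTERNS).items():
        print(name, raw)
```

Decoded, `$a2` is `If O5=1 Then`, `$a3` is `O4.BCC=O6.Address` and `$a4` is `Else`: the rule simply looks for fragments of the worm's script source.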

With the new Cuckoo version, it will be easy to create powerful signatures based on:

Network behaviour (DNS requests, IP addresses)
File system operations
Registry operations
System calls

In the meantime, CuckooMX submits attachments AND re-injects the mail immediately into the normal flow. There is NO protection against malicious code at the moment! Be warned!

My mail relay is based on an Ubuntu server and Postfix (the default installed MTA). CuckooMX is a Perl script which integrates into Postfix and submits data to Cuckoo. How does it work? Postfix is a powerful open source mail server which offers many ways to add features to filter emails. One of them is called “After Queue Content Filter” (more information about this method is available here). To implement the filter, change your master.cf file as below:

# ====================================================================
# service type  private unpriv  chroot  wakeup  maxproc command + args
#               (yes)   (yes)   (yes)   (never) (100)
# ====================================================================
smtp      inet  n       -       -       -       -       smtpd
        -o content_filter=cuckoomx
[...]
cuckoomx  unix  -       n       n       -       -       pipe
        user=cuckoo argv=/data/cuckoo/cuckoomx.pl -f ${sender} ${recipient}

The first entry (smtp) enables a new content filter called “cuckoomx“. The filter itself is defined at the end of the file, with information about its execution (under which user, with which arguments). If required, adapt the user and the Perl script path to match your environment. I suggest using your existing Cuckoo user to avoid file access problems. Once done, restart Postfix. Edit the Perl script and change the location of the configuration file (“cuckoomx.conf“) on line 58. The last step is to create/adapt the configuration file. The syntax is very simple:

<!--
  CuckooMX Configuration File
//-->
<cuckoomx>
  <core>
    <outputdir>/data/cuckoo/quarantine</outputdir>
    <process-zip>yes</process-zip>
  </core>
  <cuckoo>
    <basedir>/data/cuckoo</basedir>
    <db>/data/cuckoo/db/cuckoo.db</db>
    <guest>Cuckoo1</guest>
  </cuckoo>
  <logging>
    <syslogfacility>mail</syslogfacility>
    <sendmailpath>/usr/sbin/sendmail</sendmailpath>
    <notify>xavier@example.com</notify>
  </logging>
  <ignore>
    <mime-type>text/plain</mime-type>
    <mime-type>text/html</mime-type>
    <mime-type>image/jpeg</mime-type>
    <mime-type>image/png</mime-type>
    <mime-type>text/x-patch</mime-type>
    <mime-type>application/pkcs7-signature</mime-type>
    <mime-type>video/x-ms-wmv</mime-type>
  </ignore>
</cuckoomx>

The most important parameters that must reflect your setup are:

<basedir> is the base directory of your Cuckoo instance
<db> is the full path to your Cuckoo SQLite database
<guest> is the VirtualBox guest used to analyze malware
<sendmailpath> is the full path to your Postfix sendmail binary (to re-inject safe emails in the SMTP flow)
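To make the role of these parameters concrete, here is a small sketch of how a filter could read such a file and build the command that hands a safe mail back to Postfix. It is written in Python with `xml.etree.ElementTree`, whereas the real script is Perl and relies on XML::XPath; the function names are hypothetical. The `-G -i -f sender -- recipients` form follows the simple filter example in the Postfix FILTER_README, and whether CuckooMX uses exactly these flags is an assumption.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the configuration file shown above.
SAMPLE_CONFIG = """
<cuckoomx>
  <core>
    <outputdir>/data/cuckoo/quarantine</outputdir>
    <process-zip>yes</process-zip>
  </core>
  <logging>
    <sendmailpath>/usr/sbin/sendmail</sendmailpath>
  </logging>
  <ignore>
    <mime-type>text/plain</mime-type>
    <mime-type>image/png</mime-type>
  </ignore>
</cuckoomx>
"""


def load_settings(xml_text: str) -> dict:
    """Extract the settings a filter needs from the XML configuration."""
    root = ET.fromstring(xml_text.strip())
    return {
        "outputdir": root.findtext("core/outputdir"),
        "process_zip": root.findtext("core/process-zip", default="no") == "yes",
        "sendmail": root.findtext("logging/sendmailpath"),
        "ignored": {node.text for node in root.findall("ignore/mime-type")},
    }


def reinject_command(settings: dict, sender: str, recipients: list) -> list:
    """Build the argv that would hand a mail back to Postfix (assumed flags)."""
    return [settings["sendmail"], "-G", "-i", "-f", sender, "--", *recipients]


if __name__ == "__main__":
    settings = load_settings(SAMPLE_CONFIG)
    print(settings["ignored"])
    print(reinject_command(settings, "sender@example.com", ["xavier@example.com"]))
```

The command is returned as a list so that it can be passed to `subprocess.run()` without going through a shell, with the original message fed on standard input.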

To avoid a flood of submissions with unsupported files, feel free to create your own ignore list with MIME types you’re not interested in. A best practice is to place this filter behind your classic anti-spam and anti-virus solutions (to reduce the load as much as possible). Keep in mind that using sandboxes may require a lot of system resources. The Perl script requires some Perl CPAN modules:

Archive::Extract
DBI
Digest::MD5
File::Path
MIME::Parser
Sys::Syslog
XML::XPath

From now on, every mail received by the script is parsed and MIME attachments are extracted into a quarantine directory. If a Zip archive is detected, its files are extracted and submitted to Cuckoo too! For each interesting file, the MD5 digest is computed and compared with Cuckoo’s DB to avoid duplicates. All information is sent to Syslog:

Jun 18 23:03:39 cuckoomx cuckoomx[9293]: Processing mail from: "DHL Inc." <status@dhl.com> (DHL Package delivery report)
Jun 18 23:03:39 cuckoomx cuckoomx[9293]: Dumped: "/data/cuckoo/in/9293/msg-9293-1.txt" (text/plain)
Jun 18 23:03:39 cuckoomx cuckoomx[9293]: Dumped: "/data/cuckoo/in/9293/msg-9293-2.txt" (text/plain)
Jun 18 23:03:39 cuckoomx cuckoomx[9293]: Dumped: "/data/cuckoo/in/9293/msg-9293-3.html" (text/html)
Jun 18 23:03:39 cuckoomx cuckoomx[9293]: Dumped: "/data/cuckoo/in/9293/DHL report.zip" (application/zip)
Jun 18 23:03:39 cuckoomx cuckoomx[9293]: Files to process: 1
Jun 18 23:03:39 cuckoomx cuckoomx[9293]: "/data/cuckoo/in/9293/DHL report.exe" already scanned (MD5: d68a6a9c37d000989224abe1b2c5160c)
Jun 18 23:03:39 cuckoomx postfix/pipe[9292]: BFC72441BDC: to=<xavier@example.com>, relay=cuckoomx, delay=0.72, delays=0.38/0/0/0.34, dsn=2.0.0, status=sent (delivered via cuckoomx service)
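The "already scanned" line in this log comes from the MD5 check. The idea can be sketched as follows in Python. Note that the `seen_samples` table is a hypothetical stand-in used to keep the example self-contained; it is not Cuckoo's actual database schema, and the function names are invented.

```python
import hashlib
import io
import sqlite3
import zipfile


def unpack_zip(archive_bytes: bytes) -> dict:
    """Return {member_name: content} for every file in a Zip archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return {
            info.filename: archive.read(info)
            for info in archive.infolist()
            if not info.is_dir()
        }


def is_new_sample(db: sqlite3.Connection, content: bytes) -> bool:
    """Record the MD5 of content; return False if it was already known."""
    digest = hashlib.md5(content).hexdigest()
    # 'seen_samples' is a hypothetical table, not Cuckoo's real schema.
    db.execute("CREATE TABLE IF NOT EXISTS seen_samples (md5 TEXT PRIMARY KEY)")
    known = db.execute(
        "SELECT 1 FROM seen_samples WHERE md5 = ?", (digest,)
    ).fetchone()
    if known:
        return False
    db.execute("INSERT INTO seen_samples (md5) VALUES (?)", (digest,))
    db.commit()
    return True


if __name__ == "__main__":
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("DHL report.exe", b"MZ fake payload")
    db = sqlite3.connect(":memory:")
    for name, content in unpack_zip(buffer.getvalue()).items():
        print(name, "new" if is_new_sample(db, content) else "already scanned")
```

MD5 is used here only as a cheap duplicate key, as in the original script. It is not a protection against an attacker who deliberately crafts colliding files.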

The rest of the operations are classic Cuckoo. Files are submitted directly into the SQLite database and processed. What’s next? I’m now waiting for the next release. I’m writing a daemon which will monitor the results of analyses (always via the SQLite DB). Once the results are generated, it will search for known signatures in the output files and decide what to do. The last step will be the interface to allow the security analyst to accept or reject the mail.

The CuckooMX project is already available on github.com. Feel free to test it and report ideas and comments. Everything is welcome!

19 comments

Onur says:

June 2, 2020 at 13:11

Hi;
This is still working. The only problem is that the extracted file has 600 privileges. It doesn’t mean anything if I change the mode setting. By the way my system is Debian 10 Buster and postfix 3x and Perl5.3.30.
Xavier says:

January 22, 2019 at 13:23

Hi,
To be honest, this project has been dead for a while and I don’t know if it will work with the latest Cuckoo release. Feel free to try and let me know!
vinay says:

January 7, 2019 at 07:15

Does this feature works in cuckoo latest version 2.0.6 ? Can you please explain what changes must be done in the configuration files in the latest version?
Xavier says:

February 25, 2013 at 16:45

Hi Chris,
I’m busy adapting my code for Cuckoo 0.5 to use the API to submit analyses.
chris says:

February 25, 2013 at 15:18

Looks awesome! Does your updated version that works with 0.4 consider whether attachments are good/bad and decide whether or not to deliver them? If not, any hints on how I would go about implementing that?

Thanks!
Xavier says:

August 2, 2012 at 14:06

I just committed a temporary version which runs with Cuckoo 0.4.
marc says:

August 1, 2012 at 23:15

Will your code change with cuckoo .4 or can we grab from github now?
Xavier says:

June 28, 2012 at 19:11

Hello Michael,
Why Perl? Just because I like it! Less experience with Python…
I’m now waiting the official 0.4 version to continue this project.
Michael Boman says:

June 27, 2012 at 23:14

@Xavier nice work on CuckooMX, any particular reason why choosing Perl over Python? I was thinking about automating email client interactivity and copy the email over to a mailbox which the Cuckoo user/MUA then opens and clicks on everything… Wouldn’t that be neat too? That way you might catch any non-attachment attacks too, like bugs in the organisation’s MUA-product.
Xavier says:

June 23, 2012 at 15:17

Hello Davy,
Yep, we discussed on Twitter about this…
Davy Douhine says:

June 22, 2012 at 15:24

Excellent!
Thanks for sharing this.

That’s funny because yesterday I saw Claudio Guarnieri’s HITB slides, and when I read “it can be customized to do whatever you want and it can be integrated in larger threat intelligence frameworks” I obviously thought that smart people would take a look at the integration with commonly used mail and proxy products. A few minutes later I saw your post…
Xavier says:

June 21, 2012 at 21:56

Hello Alojzy,
The goal is of course to scan only suspicious files (.exe, .zip, .pdf and Office files). My development setup runs on a dual-core with 4GB of memory. Cuckoo can also be fine-tuned (size of the VM, timeouts). The performance is not a very big issue (IMHO). Mails being scanned will be queued and stored in the quarantine. Is it really a problem if you receive a mail with some delay?
Balasubramaniam Natarajan says:

June 21, 2012 at 14:53

Good one, I am looking to implement a similar setup.
alojzy says:

June 21, 2012 at 09:47

Fascinating project! I just have one question regarding the performance – how many e-mails per day\hour do you receive and process with cuckoo and on what kind of equipment?
Good luck and I’m waiting for more!
Seth says:

June 20, 2012 at 21:56

I’m in the process of experimenting with Cuckoo in our environment, to up our detection and mitigation capabilities. Email attachments are an ever-present threat and a non-commercial solution would be ideal. In the past I’ve been aware of a tool called vortex, being used for similar purposes (inline analysis). You may want to take a look at it as you build out your tools.

http://sourceforge.net/projects/vortex-ids/
http://smusec.blogspot.com/2010/03/vortex-howto-series-network.html
