We reached December, it’s time for another edition of the Botconf security conference fully dedicate to fighting botnets. This is already the fifth edition that I’m attending. This year, the beautiful city of Montpellier in the south of France is hosting the conference. I arrived on Monday evening to attend a workshop yesterday about The Hive, Cortex and MISP. As usual, I’m following the talks to propose you a wrap-up. Let’s go for the first one!
The introduction was not performed by “The Boss” (Eric Freyssinet) who was blocked due to a last minute change in his work agenda. But, the crew was there to ensure a smooth event. What about the current edition? In a few numbers: 4 days, 3 workshops, 12 crew members, 300 attendees (+13%), 28 talks selected amongst 46 submissions and good food as usual. Some attendees already renamed the event in “Bouffeconf” (“bouffe” is a French expression which expresses a huge amount of food). They also insisted on the respect of the social network and TLP policies.
The keynote slot was presented by Sébastien Larinier and Robert Erra. The title was “How to Compute the Clusterization of a Very Large Dataset of Malware with Open Source Tools for Fun & Profit?†and presumes a talk being oriented to machine learning. And it was indeed the case, the word appeared quickly on a slide. It was quite hard for a keynote with many mathematics formulas. The idea behind Sébastien and Robert’s research was to solve the following problem: Based on a data set of a few millions of malware samples how to process them automatically to classify them in clusters or families and get more information about their differences. In such a complex task, the scalability is important but also the speed. The schema to process the samples is the following:
blob >> parser >> JSON data >> FV (Features Vector) >> Classification
They explained the available algorithms (KMeans and DBScan) and their differences. Read the links if you are interested. Then they explained the issues they faced and finally gave some statistics.
They also explained the architecture deployed to parse all those samples. But what is stored? A lot of information: Hashes, the size and number of sections, names, entropy, characteristics, resources, entry point, import/export tables, strings, certificates, compilation date, etc. It is a good research that is still ongoing. Note that Sébastien has a workshop on this topic that he’s giving here and there at security conferences.
The first talk was titled “Get Rich or Die Trying†by Or Eshed & Mark Lechtik from Checkpoint. It started with a fact: Many researches started with a simple finding like an email… that is the “triggerâ€. In this case, the research performed by Checkpoint started from an email about an oil company (Aramco) and targeting Saudi Arabia. Was it an APT? The investigations revealed step by step that it was not really an APT. They explained every step of the case from the email to the different malware samples delivered via malicious Office documents.
One of them was a NetWire Lite, a RAT sold by wordwirelabs.com. The second sample was a VB6 compiled program which was an info stealer (ISR Stealer). The next one was an HawkEye keylogger which steals FTP, HTTP, SMTP credentials but also… Minecraft!? Don’t ask why! These tools are definitively not present in an APT… So they degraded the incident level. While going further, they finally found the Nigerian guy behind this attack. The main conclusion at the end of this talk could be: This guy was able to create a big operation and to cause damages with limited skills set. What about a group of highly skilled people?
The next slot was assigned to “Exploring a P2P Transient Botnet – From Discovery to Enumeration†from Renato Marinho, a researcher and SANS ISC handler. Renato explained how he found a botnet and how he was able to reverse the communications with the C&C. How it started? Simply with a Raspberry Pi running a honeypot at his home. The device was quickly infected (using the default Pi credentials) and he saw that the device tried to established a lot of connections to the Internet. Tip: when you’re running a honeypot, block (but log!) all connections to the wild Internet. He found that each member of the botnet could be a “Checker” or a “Scaro“, just one of them of both at the same time. A “Checker” is a dump node while a “Scaro” is a C&C. Communications with the C&C were established via HTTPS but the certificate was found in the binary. In this case, it’s easy to play MitM and intercept all communications. The set of commands was quite limited (“POST /ping”, “GET /upgrade”). The next step was to estimate the botnet size. The first techniques were to crawl the botnet based on the IP addresses found in communications with the C&C. The second one was more interesting: Renato found that it was possible to become the botnet by changed some parameters in the communication protocol (this is easy to achieve via a tool like BurpSuite). Another interesting fact about this botnet: there was no persistence mechanism in place which means that a reboot will remove the malware… until the next infection! Very interesting research!
Then, Jakub KÅ™oustek, Peter Matula, Petr Zemek, from Avast, presented a very nice tool called RetDec. This is an open-source machine code decompiler. The first part of the talk was easy to understand. When a program (source) is compiled, the compiler generates machine code but also optimizes and changes reorganizes how data is managed. When you use a decompiler, you’ll get a code that is readable but that is far away from the original code. Usually, unreadable. They are also other techniques that make decompilation a hard work: packers, obfuscation, anti-debugging techniques, etc. RetDec is trying to solve those issues… The goal is to make a generic decompilation of binary code. That was the easy part. In the second part of the talk, they explained in details how the decompiler does the job with many examples. It was really complex. I just trust them. RetDec can do a good job. The good news is that it will be released as an open-source project next week. Check on retdec.com for more details. A good point for the IDA debugging plugin that can interact directly with RetDec! Impressive work by the Avast team…
After a long half-day, the lunch break was welcome. The afternoon started with “A Silver Path: Ideas for Improving Lawful Sharing of Botnet Evidence with Law Enforcement” by Karine e Silva from the University of Tilburg, NL. Not a technical talk at all but Karine has a very good overview of the issues between security researchers and law enforcement agencies. Indeed, by the law, attacking people or getting access to non-authorized data is prohibited. But in case of a botnet (just an example), the help of the researcher could be positive to help the LEA to take down the C&C server. The project presented by Karine is called BotLeg (more information here):
The project is a consortium between TiU (TILT), SURFNet, SIDN, Abuse Information Exchange, and NHTCU. While the main focus of the research is the Netherlands, the project will develop a comparative analysis to include other EU countries. The project is financed via NWO and will last for 48 months. Among the expected legal research results, the BotLeg project will deliver sectorial guidelines and codes of conduct on anti-botnet operations.
Some points are quite difficult to address. Example: in some cases, hack back is allowed but must be performed with the same level as the original attacker did. That’s not easy to quantify. What as an “aggressive” attack? Of course, the GDPR was mentioned because researchers are also collecting sensitive data.
The next talk was presented by two guys from the CERT.pl (JarosÅ‚aw Jedynak & PaweÅ‚ Srokosz): “Use Your Enemies: Tracking Botnets with Bots“. Usually, bots are used for malicious activities but they can be used for many purposes. Collected data are used to identify and kill them. They explained the infrastructure they developed to analyze malware samples, decrypt C&C configurations and then act as a member of the botnet to gain more knowledge. Their Ripper is, in fact, a modified version of Cuckoo + homemade scripts.
Interesting to notice that performing this can be directly related to the previous talk: personal or sensitive information can be found. Once information about the botnet discovered, it’s not always easy to infiltrate it because you need to look legitimate (hostname, behavior, uptime), wait some time before being able to fetch data, and sometimes configuration is one available on specific countries.
The next talk was similar to the previous one. It focused on SOCKS proxies. “SOCKs as a Service, Botnet Discovery” by Christopher Baker. IP addresses can be easily classified. They are blacklists, GeoIP databases, DNS, CGN, websites etc. It’s easy to block them. But some IP addresses are very difficult to block because it could affect too many people (example: cloud services or ISP’s). That’s why there is a (black) market of SOCKS proxies. This is really a pain for researchers or law enforcement agencies because many SOCKS proxies are running on compromised computers in homes. Christopher explained how easy it is to “rent” such services for a small fee. In the second part of his talk, he explained how he infiltrated SOCKS proxies networks to gather more information about them. If I understood correctly, he used controlled hosts to join networks of proxies and see what was passing through them. Like deploying a tor-exit node.
After the afternoon coffee break, Sébastien Mériot from OVH presented “Automation Of Internet-Of-Things Botnets Takedown By An ISP“. For an ISP, DDoS attacks can be catastrophic. Not only they suffer from DDoS but some C&C servers can be hosted inside their infrastructure and, regarding the law, they can’t have a look at their customers’ data. Working based on abuse reports isn’t useful because it generates a lot of noise, they are often incomplete or the malicious content is already gone. IoT botnets have been a pain during the last year and generate a lot of DDoS attacks. Finding them is not complicated (Shodan is your best friend) but how to recover information about the C&C servers? Sébastien explained how he’s performing some reverse engineering to extract juicy information. I like the way he uses Radare2 with the r2pipe to get the assembly code of the sample and perform some greps to search for patterns of assembly code handling domains or IP addresses.
Then, Pedro Drimel Neto (Fox-It) came on stage to present “The New Era of Android Banking Botnets“. It was an interesting review of some banking malware families that spread during the last years: Perkele, iBanking, GMbot and BankBot. For each of them, he reviewed the infection path, the C&C communications, the backend. If in the previous years, unencrypted communications occurred via SMS, today it’s quite different and the latest malware families are much better (from an attacker perspective): strong encryption, anti-analysis, packing, C&C communications, e, c. Also, the distribution methods changed.
The last talk was an excellent review of the Gooligan botnet: “Hunting Down Gooligan” by Elie Bursztein & Oren Koriat. What is Gooligan? It was the first large-scale OAuth stealing botnet. Being used by all major actors on the Internet (Google, Microsoft, Facebook, Twitter, etc) you can imagine the impact of this botnet. The first version was detected in 2015 by Checkpoint and it was taken down in November 2016. In a nutshell, it was distributed as a repackaged known APK.
Once decoded, the payload is downloaded, devices are rooted and persistence is configured. It modifies the install-recovery.sh file used when resetting the phone to factory settings. It makes very difficult to get rid of the malware. After technical details, the speakers explained the monetization techniques used by the botnet. There was two: apps boosting and ads injection. Stolen OAuth tokens were used interact with the play store to generate fake installs, reviews and search. Indeed, real users on real phones are difficult to spot compared to “fraudulent†server. As the C&C server got all details to spoof the infected phones (IMEI, IMSI, brand, model, token, Android version, etc). The last step was to explain how the remediation was performed: The C&C server was sinkholed and stolen token revoked. All users were notified, which is a challenge based on the number of people (1M), different languages, technical skills etc. I really like this presentation.
The day finished with beers and pizza in a relaxed atmosphere. Stay tuned for a second wrap-up tomorrow!