After a short night due to social events and business-related tasks, I joined the Google offices to follow a bunch of interesting presentations. Botconf offers a great set of presentations, but it is also a good place for networking and for talking about infosecurity topics while having very nice food! Here is my wrap-up for the second day, which was of the same quality as yesterday.
The first one was about DGA: “DGArchive – A deep dive into domain generating malware” by Daniel Plohmann. As usual, it started with a review of DGAs (domain generation algorithms), basically nothing new (it was already covered yesterday). Last year, Daniel gave a lightning talk about his project and, today, he presented the results of his research.
A small history of DGA:
- The first one was in 2006 (Sality which dynamically generated a 3rd-level domain part)
- In July 2007, Torpig and Kraken were discovered
- In 2008 – 2009, Srizbi and Conficker
DGA is a key feature in modern malware. Why is it so broadly used?
- Aggravation of analysis (making it more difficult)
- Evasion (to avoid blacklisting)
- Asymmetry (attackers need only one domain while defenders must block all of them)
- Feasibility (domains are cheap)
And, more importantly, they annoy security researchers! The idea of the research was to reverse DGAs, generate all their domains and build a database to perform queries and statistics. The goal was to be able to look up a domain and have the database return the associated malware. So far, Daniel has identified:
- 43 families
- 280 seeds
- 20+M domains
Many DGAs use long domain names (as opposed to business domains, which must be as short as possible). An important element is the seed, which influences the generation of domains. The process implemented by Daniel is:
- Matching (automatically detected new seeds)
In the next part of the presentation, Daniel explained in great detail how domains are generated per malware family. Then, the next question was: what about domain registration? Based on whois databases, he was able to identify characteristics of domains, sinkholes, mitigations, pre-registration, domain parking, etc. Another question about DGAs is: are they reliable? What about collisions between algorithms, and is there a risk of generating valid domain names? In that case, this could have a disastrous effect for the owner of the valid domain! Yes, collisions are possible. But not enough to help to classify the malware based on the generated domains.
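To make the idea of seeds and of a reverse-lookup database more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `toy_dga` is not the algorithm of any real malware family, the family and seed names are hypothetical, and the `.example` TLD is a reserved placeholder. It only shows the principle: the same seed and date always produce the same domains, so they can be pre-computed and indexed.

```python
import hashlib
from collections import defaultdict
from datetime import date, timedelta


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".example") -> list[str]:
    """Generate `count` deterministic domains for one seed and one day (toy algorithm)."""
    domains = []
    for index in range(count):
        material = f"{seed}|{day.isoformat()}|{index}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 16 hex digits to the letters a-p to get a long, random-looking label.
        label = "".join(chr(ord("a") + int(char, 16)) for char in digest[:16])
        domains.append(label + tld)
    return domains


def build_index(families: dict[str, list[str]], start: date, days: int) -> dict[str, set[tuple[str, str]]]:
    """Pre-compute every domain and map it back to the (family, seed) pairs producing it."""
    index = defaultdict(set)
    for family, seeds in families.items():
        for seed in seeds:
            for offset in range(days):
                for domain in toy_dga(seed, start + timedelta(days=offset)):
                    index[domain].add((family, seed))
    return index


def collisions(index: dict[str, set[tuple[str, str]]]) -> dict[str, set[tuple[str, str]]]:
    """Return the domains that more than one family would generate."""
    return {
        domain: owners
        for domain, owners in index.items()
        if len({family for family, _ in owners}) > 1
    }


# Hypothetical families and seeds, for illustration only.
FAMILIES = {"family_a": ["seed-1", "seed-2"], "family_b": ["seed-x"]}
INDEX = build_index(FAMILIES, date(2015, 12, 1), days=30)


def lookup(domain: str) -> set[tuple[str, str]]:
    """Return the (family, seed) pairs known to generate this domain, if any."""
    return INDEX.get(domain, set())
```

Calling `lookup()` on a domain seen in DNS logs returns the matching family and seed, or an empty set when the domain is unknown. The `collisions()` helper illustrates the reliability question: any domain attributed to several families would show up there.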
The next talk focused on the Andromeda botnet. Jose Miguel Esparza presented “Travelling to the far side of Andromeda”. This was not a talk about reversing the botnet (because a lot of information is already available) but rather a talk about the people behind it.
A few words about Andromeda:
- It started in 2011
- It is modular and versatile
- It pings the C&C regularly, asking for “tasks” (new malware, plugins, etc)
- It spreads via classic ways
- Current version is 2.10
It evolved with new features like anti-analysis and a list of blacklisted processes (even python.exe and perl.exe). Parameters are sent in JSON with the latest release. Note also that communications to rebuild the binaries now occur over XMPP and no longer over IRC (standard communications remain over HTTP). Malware developers also leave some messages from time to time, like fake URLs containing “fuckyoufeds” :-). An interesting feature: it does not infect computers located in some regions like Russia (the localization is based on the keyboard layout). There is a real business behind Andromeda and the botnet is sold with terms of service that are worse than the ones of Google or Facebook! Here is an idea of the current prices:
- A bot v2.x: $500
- To rebuild the bot: $10
- SOCKS5 module: free
- Formgrabber module: $500
- Key logger module: $200
- TeamViewer module: $500
And the botnet is still alive. Some statistics:
- 10750 samples
- 130 botnets
- 474 builder IDs
- 42K C&C URLs
Conclusions: the project is still alive and the business ongoing. It is used by serious criminal gangs and has interesting custom plugins. A nice overview!
After the morning coffee break, Nikita Buchka and Mikhail Kuzmin presented “Whose phone is in your pocket?”. Android is a nice target for malware. In Q3 2015, 1.5M+ malicious apps were detected. One trend is attacks that use superuser privileges.
Most malicious apps are adware, and the infection occurs via trojanized ads. They explained how the advertisement model works on Android and, most importantly, how it can be abused by attackers. Even if a campaign is abused to spread malware, brands are still happy because they are promoted, so what? Most adware tries to root the device to get persistence. How? The security model of Android is based on:
- A RO system partition
But there are problems:
- Binder IPC mechanism -> data can be hijacked
- Root user exists … and it can break the model.
“zygote” is a daemon whose purpose is to launch Android apps. To install malware, the procedure is based on the following steps:
- Obtain root access (easy on old versions)
- Remount the system partition in RW mode
- Install the malicious apk
- Remount it in RO mode
When adware is not enough, other malicious code can be installed. A good example is Triada: it comes with an SMS trojan, a banking trojan, an update module and communication with the C&C. They explained how the malware infects the device. And what about mitigations?
- The malware cannot be uninstalled (RO partition)
- One solution is to “root” your own device (not recommended)
- Flash a stock firmware (not easy without technical skills + loss of data)
- Dealt with?
The next talk topic was again DGA: “Building a better botnet DGA mousetrap: separating mice, rats and cheese in DNS data” (Josiah Hagen).
The fourth (4!) talk covering DGA… I think that we are now aware of this technique to obfuscate communications between bots and their C&C… Just that this time, it involved machine learning. A private joke started in the afternoon about a potential name change from “Botconf” to “DGAconf“…
Apostolos Malatras gave an interesting talk about mobile botnets and, more precisely, about building a lab to study them (“Building an hybrid experimental platform for mobile botnet research”). If the previous talk focused on how malware compromises Android devices, this talk reviewed how botnets installed on those infected devices work. In fact, it’s the same as a regular botnet: devices are waiting for commands from the botmaster.
Keep in mind that mobile devices are also computers: they have the same features but they also contain a rich set of information about the owner (read: lucrative gains ahead). They are also connected to other computers and to corporate networks. They have nice sensors and are more and more used as mobile wallets! The technical particularities of mobile devices are:
- They use dynamic IP addresses
- There are many constraints imposed by mobile networks
- There are a lot of different OS versions (is it really bad?)
- The size of the screen can be a vulnerability (did the user click on the right link?)
- Sensors can be used as side channels
About the botnets, there are different architectures: centralised, hierarchical, hybrid and P2P. Those must be covered in the lab. The lab must also meet certain goals: it must be generic to support many experiments, scalable, extensible and sufficiently usable. Apostolos reviewed the different components of the architecture (Java technologies, Android emulator, Android Debug Bridge, XML configuration files and a sensor simulator to create events). The goal is to test mobile botnets and observe their operations. It also executes events based on scenarios: what happens when the mobile does this or that… The next question which immediately comes to mind is: “how long will it take before mobile malware developers implement tests to bypass Android emulators?” In fact, it’s quite trivial to do (just via the IMEI number!). Have a look at the following paper for more details.
Just after Apostolos, Laurent Beslay introduced the “Mobile botnet malware collection” which was more a tribune for the EU services. They are recruiting and started a program to exchange information about mobile botnets.
After a delicious lunch, Paul Jung came on stage to present “Box botnets”. Good news: no IDA slides in his presentation! 🙂 The whole story started with a strange HTTP request seen in a log file!
Attackers try constantly to infect websites with malicious scripts hidden in other files like GIF files. This code is often obfuscated using str_rot13(), gzuncompress(). Decoding them is easy using online tools like ddecode.com/phpdecoder. Important warning from Paul: most sites which provide online services like this one keep a copy of the data you uploaded. Keep this in mind if your data are sensitive! So, how to infect a host? The scenario requires:
- A PHP enabled UNIX web server
- A weak CMS
- A direct access to the wild Internet (for back connections)
Based on this description, popular targets are VPS! The bots also implement some tricks, like changing the process name and intercepting all signals to prevent the process from being killed. They also always have a “snitch” function to leak server info via email or a specific HTTP request. Once infected, the machine that is now part of the botnet can:
- Execute stuff (difficult with modern distros which run the webserver under its own user)
- Perform maintenance tasks (change channel, rename bot)
- Send spam
- UDP/TCP/HTTP flooding
- And… look for other servers to compromise!
By using multiple search engines (Paul found 37 of them!), they search for new potential victims. The next part of the talk focused on who’s behind such bots. The team is called Toolsb0x. This is not a state-of-the-art way to compromise computers but… it still works!
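Coming back to the obfuscation layers mentioned earlier (str_rot13(), gzuncompress()), and to Paul’s warning about online decoders keeping a copy of what you upload: such layers can also be peeled offline. Here is a minimal sketch in Python. The order of the layers and the presence of a base64 layer are assumptions (real samples chain them in various ways), and the helper names are hypothetical. The decoded payload is only returned as text, never executed.

```python
import base64
import codecs
import zlib


def php_str_rot13(text: str) -> str:
    """Equivalent of PHP str_rot13(): rotate ASCII letters by 13 places."""
    return codecs.decode(text, "rot_13")


def php_gzuncompress(data: bytes) -> bytes:
    """Equivalent of PHP gzuncompress(): inflate zlib-format data."""
    return zlib.decompress(data)


def peel(blob: str) -> str:
    """Undo the assumed chain gzuncompress(base64_decode(str_rot13($blob)))."""
    unrotated = php_str_rot13(blob)
    compressed = base64.b64decode(unrotated)
    # Return the payload as text for reading only; it is never evaluated.
    return php_gzuncompress(compressed).decode("utf-8", errors="replace")


def wrap(source: str) -> str:
    """Build a test blob with the same assumed chain, to check peel() safely."""
    encoded = base64.b64encode(zlib.compress(source.encode("utf-8"))).decode("ascii")
    return codecs.encode(encoded, "rot_13")


if __name__ == "__main__":
    sample = wrap("<?php echo 'hello'; ?>")
    print(sample)        # what the obfuscated string looks like
    print(peel(sample))  # the original PHP source, recovered locally
```

If a sample uses a different chain, the same three helpers can be reordered. The point is only that nothing has to leave your own machine.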
Then, we switched to a deeper talk with a lot of assembly code: “Malware instrumentation: application to Regin analysis” by Matthieu Kaczmarek. Modern pieces of malware are very complex. Why Regin? Because it’s a botnet: its network topology is that of a botnet.
Keep in mind that communications can be performed in a mix of UDP, TCP, cookies, files, USB sticks… You also need a window open to the world. On top of the network, there is a trust overlay. Each node has a private key and a list of trusted public keys. In the botnet, each node has also a virtual IP address. The design is a service oriented architecture with:
- An orchestrator
- Core modules (take care of crypto, compression, VFS, networking, etc)
- Additional modules (probes, agents, etc)
After multiple slides explaining the techniques behind Regin, Matthieu gave a demo of a communication between two Regin modules… The demo was the exchange of a “hello” message between two nodes. It looked so simple but the amount of time and effort spent to reverse all the stuff is huge! Impressive work!
After the coffee break, Mark Graham presented “Practical experiences of building an IPFIX based open source botnet detector”. What was Mark’s problem? How to effectively detect botnets in cloud providers. According to Mark, the cloud is a nice place to look for botnet activity. The first part of the talk was an introduction to IPFIX (which honestly I was not aware of!).
Everybody knows Netflow (created by Cisco in the 1990s) but IPFIX is almost unknown (based on the Botconf audience). What are the issues related to Netflow?
- Host escape
- Intra VM attacks
- VM escape
In 2013, IPFIX was standardized. A big advantage of IPFIX is the required storage. Mark did some tests and a file transfer resulted in a 3.1GB PCAP file but… only a 43KB IPFIX file! PCAP can be compared to a phone call whereas IPFIX can be compared to the phone bill (who, when, how long). More precisely, IPFIX was developed to address the following points:
- Vendor independent
- Multiple protocols (not only UDP)
- More security
- Ready for next generation (IPv6, multicast, MPLS)
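The “phone call versus phone bill” comparison can be illustrated with a few lines of Python. This is only a sketch of the flow-aggregation idea, not an IPFIX exporter: real IPFIX uses templates and standard information elements, and real exporters also expire flows with timeouts. The `Packet` and `FlowRecord` names and their fields are assumptions made for the example.

```python
from collections import namedtuple
from dataclasses import dataclass

# A captured packet: this is what a PCAP file stores, payload included.
Packet = namedtuple("Packet", "timestamp src dst sport dport proto payload")


@dataclass
class FlowRecord:
    """The 'phone bill': who talked to whom, when, and how much."""
    first_seen: float
    last_seen: float
    packets: int = 0
    octets: int = 0


def aggregate(packets):
    """Collapse packets into one record per 5-tuple; payloads are counted, never stored."""
    flows = {}
    for packet in packets:
        key = (packet.src, packet.dst, packet.sport, packet.dport, packet.proto)
        record = flows.get(key)
        if record is None:
            record = flows[key] = FlowRecord(packet.timestamp, packet.timestamp)
        record.first_seen = min(record.first_seen, packet.timestamp)
        record.last_seen = max(record.last_seen, packet.timestamp)
        record.packets += 1
        record.octets += len(packet.payload)
    return flows


if __name__ == "__main__":
    # A simulated file transfer: many large packets, all in the same flow.
    transfer = [
        Packet(float(i), "192.0.2.1", "198.51.100.7", 40000, 443, "tcp", b"x" * 1400)
        for i in range(1000)
    ]
    for key, record in aggregate(transfer).items():
        print(key, record)
```

A thousand packets carrying 1.4 MB of payload end up as a single small record, which is why the storage difference reported by Mark is so large. It is also why anything that needs the payload itself is out of reach.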
The second part of the talk covered the development of sensors based on Xen & OVS (Open vSwitch). Mark explained the issues he faced with the different versions of the required components. Once built and configured, the next issue was to find the right location to connect probes. Visibility of the network is key! Once the right number of probes is connected at the right places, we can find useful information but there are still limitations to the system:
- Deep packet inspection (discarding the payload has a cost…)
- Encryption / VPN traffic: payload is not an issue but PDU headers within a VPN tunnel have an impact
The solution proposed by Mark was to create an extended template, now with DNS and HTTP parameters (like cookie, age, via, referer). A nice talk which taught me about IPFIX!
The next presentation focused on the threat landscape in Brazil with Tal Darsan (“The dirty half-dozen of the Brazilian threat landscape”). What’s going on in Brazil today? Attackers use Delphi, VB script and C#. They use packers, with CPL and VBE trends and the Themida packer. They have a unique fraudster underground community, comprehensive attack vectors with a naive approach, and they bundle legit tools for malicious purposes. You can buy training to learn how to commit fraud. So, what are the most popular vectors?
- Image based phishing attacks. Tal explained the Boleto attack.
- Fake browsers: used to steal bank credentials (the dropper is delivered via a small downloader, banload)
- Overlay attack (similar to the fake browser: it creates an overlay of the browser content, the browser is not replaced)
- Remote overlay: MITM attack created with… VNC
A nice review of the threats in Brazil! For your information, here is a website where you can buy services to learn “hacking“: http://www.hackerxadrez.com.br/
Ya Liu presented the last talk for today: “Automatically classifying unknown bots by the register messages”. The idea of this research was to categorise botnets based on the messages they exchange with their C&C once a new computer is infected.
As many malware variants are discovered daily, new techniques must be found to classify them. Most malware can be grouped into well-known families like zbot or darkshell. They have one point in common: they need to communicate with a C&C. The idea of Ya was to analyze how they register themselves to the C&C (the first action performed after a successful infection). Register messages contain information like hostname, IP, CPU, OS, version, etc. Ya reviewed how such information is encoded and sent to the C&C. Interesting research!
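To give an idea of why register messages are a useful fingerprint, here is a small sketch in Python. It is not Ya’s method: it assumes, purely for illustration, that register messages are plain-text `name=value` pairs, and it groups messages by the ordered list of field names while ignoring the values. The message formats and sample data are hypothetical.

```python
import re
from collections import defaultdict

# Assumed format: name=value pairs separated by '|', ';' or '&'.
FIELD = re.compile(r"([A-Za-z_]+)=([^|;&]*)")


def signature(message: str) -> tuple[str, ...]:
    """Ordered field names of a register message; the values are ignored."""
    return tuple(name.lower() for name, _ in FIELD.findall(message))


def group_by_signature(messages: list[str]) -> dict[tuple[str, ...], list[str]]:
    """Group messages sharing the same structure, regardless of the victim's details."""
    groups = defaultdict(list)
    for message in messages:
        groups[signature(message)].append(message)
    return dict(groups)


# Hypothetical register messages, for illustration only.
MESSAGES = [
    "host=PC-01|ip=192.0.2.10|os=Win7|cpu=2400|ver=1.2",
    "host=LAB-7|ip=192.0.2.44|os=WinXP|cpu=1800|ver=1.3",
    "id=ab12&os=Win7&name=PC-01",
]

if __name__ == "__main__":
    for fields, members in group_by_signature(MESSAGES).items():
        print(fields, len(members))
```

The first two messages come from different victims but share the same structure, so they land in the same group, while the third one forms its own group. Real register messages are often binary or encrypted, so a real classifier needs a decoding step before anything like this becomes possible.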
The second day finished with a session of lightning talks (12 x 3 minutes of speed talks) just before the usual social event. This year it was on the top floor of the French National Library with a very nice view of Paris by night: