In preparing to speak at the AusCERT Conference next week, I kept thinking about the 'Promise' of Big Data and Security. This includes its future potential as well as the hope it brings of delivering the next generation of security products. I am also mindful that part of the audience will think that Big Data is just hype or buzz, and that as it enters the trough of disillusionment it will be absorbed and quickly forgotten.
Then I re-read the Mandiant APT1 report and posts related to the tactics used then and now. Normally at this point I remember something that Scott Crawford and I discussed months before and I get a Tada! moment. Big Data Security Analytics is the first technology capable of disrupting the lateral movements of attackers as they work through what is often referred to as the attack lifecycle or the kill chain.
Attack Life Cycles (Kill Chain)
The attack lifecycle or Kill Chain reflects the reality of modern tactics when it comes to a compromise. Some great references to read more about it are:
- “Intelligence-Driven Computer Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains” by Hutchins, Cloppert and Amin from Lockheed Martin Corporation [PDF]
- Mandiant's APT1 Report p27 "APT1: Attack Lifecycle" [PDF]
- "A Case Study of Intelligence-Driven Defense" by Dan Guido.
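The Lockheed Martin paper breaks an intrusion into seven phases:
- Reconnaissance
- Weaponization
- Delivery
- Exploitation
- Installation
- Command and Control
- Actions on Objectives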
Breaking the kill chain can be thought of as trying to Detect, Deny, Disrupt, Degrade, Deceive or Destroy these phases.
The paradox of the Advanced Persistent Threat (APT) is that we know the attackers are not technically advanced. Rather, it is their posture that is advanced: they are prepared to move and think laterally, and so accomplish their goal easily. Why is that? What is attacker asymmetry?
I find it best to explain this to people as an invasion game, which we are all really familiar with. If you aren't familiar with the term there's a quick explanation here. For me the greatest invasion game is Rugby, a game where defence has improved significantly over the last decade. Defence dominates the game and good defensive lines can totally stifle an attack.
Invasion games pit Attackers against Defenders. The Attackers have a defined goal and can manipulate or avoid defensive lines. Defensive lines can be thought of as passive (on their heels) or active. Active defenses communicate, cover each other and frustrate - their goal is to disrupt, delay and ultimately repel the attack.
Attackers and Defenders are both trying to manipulate time and space. This can be thought of as the speed and structure of their attack patterns as well as how broad or spread they are.
In ‘Security’, defensive lines have generally been inactive or passive at best. When faced with a determined Attacker, defensive lines are easily stretched or avoided, and where there are collisions the Attacker is able to win them. There is little disruption of the attack lifecycle or breaking of the ‘kill chain’. Furthermore, collisions are not sought out by Defenders to create a contest.
Great defensive lines communicate, are knowledgeable about attack patterns, move fast off the line, collapse when breached, and are able to stretch and reset. They seek to encounter and win their interactions. They manipulate time and space by moving forward to meet the attacker and force attackers into predictable lateral moves that are easy to disrupt.
Understanding attack patterns and seeking out and winning collisions denies momentum and this is the disruption of the kill chain.
Manipulating Time and Space
Pioneering the use of Big Data for Security taught us a lot about attackers and attack lifecycles. We are able to enumerate, enrich, link and build context to understand security events from network packet captures. If I want the deep packet inspection information for every indicator and warning, I can get it. If I want to track the specific attributes of an attacker (user agents or operating system), I can. There's literally no limit to the information that can be accessed, extracted, enriched and linked contextually: Threats, Sessions, Protocols and Files, across Security, Network and Threat intelligence.
Big Data tooling and NoSQL data models allowed us to manipulate 'Space' and radically changed the nature of 'Time'. You can zoom from years to minutes and you can understand attacks and attackers in incredible detail, but you have to wait - maybe it's 7 minutes, maybe it's 15 minutes. This was the trade-off for Big Data, or so we thought.
To truly create a next generation security technology Big Data Security Analytics needs to disrupt the attack life cycle or kill chain. This means not just solving the Size and Scale problems of network data and security event streams but also doing this in real time.
A real time Big Data Security Analytics system is broad (laterally), seeks out and wins its collisions (i.e. every interaction with the attacker is expected to be biased towards the defender) and enables decisions to be made in real time. These decisions relate to the modelling and disruption of the Kill Chain.
While processing at the speed of the stream (network and security event streams) we can't dismiss the incredible amount of knowledge that is delivered after the fact. It's the reason why we are named Packetloop. There's gold in replaying network traffic and reprocessing files. This information is generally the best information for Kill Chain modelling.
Disrupting the Kill Chain in Real Time
In the previous section I mentioned that there are collisions between Defenders and Attackers. These collisions can be thought of as interactions and where there is an interaction I want it biased in my (defender's) favor. The bias is in terms of information and knowledge regarding the current interaction and how it relates to all other interactions.
So suppose you give me a file (via email or something I was tricked into downloading). This interaction over a single file holds so much information that I can use. I have quickly sketched some of it in the diagram below:
|The 'Jujutsu' of interactions|
So take a file, enrich it, link the information and correlate it with other information you have from Threats, Sessions and Protocols, and you start to see how this could be used to disrupt a Kill Chain. Is it the compile time? Is it the ssdeep hash of the file hidden inside the executable file? Is it a YARA signature triggering for shellcode? Is it the emulation of shellcode? Is it the IP address of the web server? The country it resides in? Its name servers? Or the mean distance between those name servers?
When you read the Mandiant APT1 report or similar posts you realise how successful attacks can be when they move laterally. Deliver the file by email, establish C2 communication initially via HTTP (WEBC2) and then later via a more elaborate remote access trojan (RAT). Move laterally through privilege escalation and further compromise. Finally, data is compressed, encrypted and exfiltrated.
A real time Big Data Security Analytics system can model this as it happens:
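To make the idea of enriching a single file a little more concrete, here is a minimal sketch using only the Python standard library. It is an illustration rather than the pipeline described in this post: the function names `pe_compile_time` and `enrich_file` are made up for the example, and fuzzy hashing (ssdeep) and YARA matching are left out because they need third-party libraries.

```python
import hashlib
import struct
from datetime import datetime, timezone


def pe_compile_time(data: bytes):
    """Return the COFF TimeDateStamp of a PE file as a UTC datetime, or None."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # e_lfanew (the offset of the PE header) is stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00" or len(data) < pe_offset + 12:
        return None
    # TimeDateStamp follows the signature (4), Machine (2) and NumberOfSections (2).
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def enrich_file(data: bytes) -> dict:
    """Collect a few attributes that can later be linked to other context."""
    return {
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "compile_time": pe_compile_time(data),  # None for non-PE files
    }
```

Each value on its own says little. The point is that every attribute becomes something to pivot on when it is linked with what is already known about Threats, Sessions and Protocols. Keep in mind that the compile timestamp is set by the linker and can be forged.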
- The email is processed, mail headers extracted to gain the origination IP address of the sender. The text can be analysed for irregularities and sentiment and the attachment extracted and processed.
- Pivoting off the attachment can produce a vast amount of information. Are there files embedded inside the attachment? Is the attachment, or are any files within it, known malware when compared against VirusTotal or a malware database (e.g. by MD5/SHA-1/ssdeep)?
- Detonate the attachment in a controlled way using a Malware Sandbox and extract the output communications (DNS, HTTP, IRC, XMPP) for DNS and IP information.
- Determine, based on Session and Protocol information, whether this communication is an outlier.
- Correlate all of this information with indicators and warnings produced by threat management systems.
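The first two steps above can be sketched with Python's standard `email` package. This is a rough illustration under some assumptions, not our production code: `process_email` is a hypothetical helper, the origin IP is guessed from the oldest `Received` header (which a sender can forge), and the attachment hash is only computed here, not looked up in any malware database.

```python
import hashlib
import re
from email import message_from_bytes, policy

# Matches a dotted-quad IPv4 address, optionally wrapped in square brackets.
IPV4 = re.compile(r"\[?(\d{1,3}(?:\.\d{1,3}){3})\]?")


def process_email(raw: bytes) -> dict:
    """Extract the likely origin IP and hash every attachment of a raw message."""
    msg = message_from_bytes(raw, policy=policy.default)

    # Each hop prepends its own Received header, so the last one listed
    # is the hop closest to the original sender.
    received = msg.get_all("Received", [])
    origin_ip = None
    if received:
        match = IPV4.search(str(received[-1]))
        origin_ip = match.group(1) if match else None

    attachments = []
    for part in msg.iter_attachments():
        payload = part.get_payload(decode=True) or b""
        attachments.append({
            "filename": part.get_filename(),
            "size": len(payload),
            "sha1": hashlib.sha1(payload).hexdigest(),
        })

    return {
        "origin_ip": origin_ip,
        "subject": str(msg["Subject"]),
        "attachments": attachments,
    }
```

The hashes and the origin IP are the pivots: they are what you would compare against malware databases and threat intelligence in the later steps.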
The correlation is not a JOIN on an IP address; it is a probabilistic model that is used to make a decision, but more on this in a future post.
Although my points look simple, there is real math and real science (Machine Learning) that can be applied to this task. This is a light year away from traditional classification and correlation. For example, take the modelling of entropy for Metasploit's Meterpreter - a payload delivered to remotely access a compromised host.
It's a simple model because I am only looking at two features: the entropy of the data transmitted and the amount (size) of data transmitted. The blue line is the Client to Server entropy and the red dots are Server to Client entropy. This conversation takes place over HTTP and, despite some weird URIs, it looks like any other conversation to Wireshark.
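To give a flavour of what "probabilistic rather than a JOIN" can mean, here is one textbook way of combining independent pieces of evidence: naive Bayes in log-odds form. To be clear, this is not the model we use, just a minimal sketch. The function name and the likelihood ratios are invented for the example.

```python
import math
from typing import Iterable


def combine_evidence(prior: float, likelihood_ratios: Iterable[float]) -> float:
    """Combine independent indicators into a posterior probability.

    prior             -- probability the interaction is malicious before any evidence
    likelihood_ratios -- for each indicator, P(indicator | malicious) /
                         P(indicator | benign); values above 1 raise suspicion
    """
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    # Convert the accumulated log-odds back into a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical indicators: a rare attachment type, high-entropy HTTP traffic
# and a recently registered domain. The ratios are made up.
posterior = combine_evidence(prior=0.01, likelihood_ratios=[20.0, 15.0, 8.0])
```

With these made-up numbers, three individually weak indicators lift a 1% prior to a posterior of roughly 0.96. That is the kind of decision a simple equality match on one field cannot express.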
|Client and Server Entropy for a Meterpreter Session over HTTP|
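For readers who want to reproduce the two features, the entropy in question is Shannon entropy measured in bits per byte. The sketch below is a minimal version, assuming you already have the reassembled payload bytes for each direction of a conversation. The names `shannon_entropy` and `conversation_features` are mine and not part of any particular tool.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, up to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def conversation_features(client_payload: bytes, server_payload: bytes) -> dict:
    """The two features per direction: entropy and amount of data transmitted."""
    return {
        "client_entropy": shannon_entropy(client_payload),
        "client_bytes": len(client_payload),
        "server_entropy": shannon_entropy(server_payload),
        "server_bytes": len(server_payload),
    }
```

Plain text such as HTTP headers and HTML sits well below the 8 bit ceiling, while encrypted or well-compressed payloads sit close to it.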
When I look at the same conversation among approximately 55K HTTP conversations, you can see how even simple features can be used to find outliers. In the figure below I have graphed Client to Server entropy for 54,189 HTTP conversations alongside that of the Meterpreter session, which is also using HTTP. Meterpreter encrypts all communications between the client and the server and therefore has a very high entropy of almost 8 bits per byte.
|The Meterpreter needle in a HTTP haystack|
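One simple way to pull a needle like this out automatically is a robust z-score based on the median and the median absolute deviation (MAD). This is a stand-in technique chosen for illustration, not necessarily what produced the figure. The threshold of 3.5 is a commonly used rule of thumb and the entropy values below are synthetic.

```python
import statistics


def entropy_outliers(entropies, threshold=3.5):
    """Return indexes of conversations with unusually HIGH entropy.

    Uses the modified z-score 0.6745 * (x - median) / MAD, which is far less
    sensitive to the outliers themselves than mean and standard deviation.
    """
    median = statistics.median(entropies)
    mad = statistics.median(abs(x - median) for x in entropies)
    if mad == 0:
        return []  # no spread at all, so nothing can be scored
    return [
        index
        for index, value in enumerate(entropies)
        if 0.6745 * (value - median) / mad > threshold
    ]


# Synthetic client-to-server entropies (bits per byte); the last one mimics
# an encrypted session hiding in ordinary HTTP traffic.
sample = [4.8, 5.0, 5.1, 5.2, 4.9, 5.3, 5.0, 7.95]
suspicious = entropy_outliers(sample)
```

On its own this would also flag legitimately compressed or encrypted uploads, which is exactly why it should be one feature among many and not a verdict.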
Big Data Security Analytics is the next generation of real time security products and has real applicability in disrupting the attack lifecycle or kill chain. Simple lateral attacks currently thwart defensive lines because of a lack of communication and information sharing: there's no brain to contemplate and mitigate attacks.
I have briefly touched on Machine Learning and its use in Big Data Security Analytics. I will look to focus on it in some future blog posts. Hope you enjoyed this post! If you did, let us know!