21 Mar 2012

Packetpig - Open Source Big Data Security Analysis

Wednesday, March 21, 2012
Packetpig is an open source project hosted on GitHub that allows full packet captures and device logs to be analysed. We describe it as Big Data Security Analysis - a way of applying Network Security Monitoring principles to big datasets.

Packetpig is made up of a series of Pig Loaders (Java classes) that expose packet captures so they can be analysed at massive scale:
  • PacketLoader() - opens packet captures and provides access to TCP, UDP and IP headers e.g. Source IP Address, Source Port, Destination IP Address, Destination Port.
  • SnortLoader() - wraps the Snort Intrusion Detection application allowing packet captures to be analysed across a Hadoop Cluster. The loader analyses packets and returns signature, priority, message, protocol and Source IP/Port, Destination IP/Port.
  • ConversationLoader() - links packets to their conversations or flows. The conversation start and end, the way the conversation ended, the number of packets, their size and delay can all be extracted through this loader.
  • DNSConversationLoader() - provides additional functionality for the deep packet inspection of DNS conversations.
  • HTTPConversationLoader() - provides additional functionality for the deep packet inspection of HTTP conversations.
  • ConversationFileLoader() - allows file metadata and files themselves to be extracted from conversations. The file name, extension, libmagic information as well as MD5, SHA1 and SHA256 hashes are returned through this loader. In addition the actual files themselves can be extracted and dumped.
  • FingerprintLoader() - a wrapper for p0f that allows it to operate across a Hadoop Cluster.
  • PacketNgramLoader() - extracts data from each packet in a conversation and breaks it into N-grams. Unigrams, bigrams and trigrams are most commonly used; however, any integer can be passed to the loader.
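
To make the N-gram idea concrete, here is a minimal Python sketch of splitting a payload into byte N-grams and counting them. It is not how PacketNgramLoader() is implemented; the function name byte_ngrams and the sample payload are invented for illustration.

```python
from collections import Counter


def byte_ngrams(payload: bytes, n: int) -> Counter:
    """Count every run of n consecutive bytes in payload.

    Hypothetical helper for illustration only; it is not part of Packetpig.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n across the payload, one byte at a time.
    # A payload shorter than n yields no windows and an empty Counter.
    return Counter(payload[i:i + n] for i in range(len(payload) - n + 1))


# Invented sample payload: trigrams (n=3) of an SSH-style banner.
trigrams = byte_ngrams(b"SSH-2.0-OpenSSH", 3)
for gram, count in trigrams.most_common(3):
    print(gram, count)
```

Passing n=1 or n=2 gives unigrams or bigrams from the same function.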
Google WebGL Globe of Snort Alerts
Loaders are called from Pig scripts written in Pig Latin. Multiple loaders can be combined in a single analysis. For example, you may want to take all sources of attacks and see whether their operating system matches their user agent. This would involve using the SnortLoader(), FingerprintLoader() and HTTPConversationLoader().

Firstly, you would parse all packet captures using the SnortLoader() to find the distinct Source IP addresses linked to Snort attacks. Secondly, you would parse all packet captures using the FingerprintLoader() (a wrapper for p0f), which provides information on the operating system using passive analysis. Thirdly, you would parse all HTTP conversations using the HTTPConversationLoader() to extract the User Agent field from each conversation. Finally, you would join the data together on the Source IP address to output the analysed data linking attackers to their operating systems and their user agents.
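
The four steps above can be sketched in plain Python. This is only an illustration of the join logic, not Pig Latin and not Packetpig's code: the record layouts, field names (src_ip, os, user_agent) and sample values are all assumptions standing in for whatever the three loaders actually return.

```python
def join_on_source_ip(snort_alerts, fingerprints, http_conversations):
    """Link each attacking source IP to its fingerprinted OS and user agents.

    All three inputs are lists of dicts with an assumed "src_ip" key.
    """
    # Step 1: distinct source IPs that triggered Snort alerts.
    attackers = {alert["src_ip"] for alert in snort_alerts}
    # Step 2: operating system per source IP from passive fingerprinting.
    os_by_ip = {fp["src_ip"]: fp["os"] for fp in fingerprints}
    # Step 3: every user agent seen per source IP.
    agents_by_ip = {}
    for conv in http_conversations:
        agents_by_ip.setdefault(conv["src_ip"], set()).add(conv["user_agent"])
    # Step 4: join the three views on the source IP address.
    return [
        {
            "src_ip": ip,
            "os": os_by_ip.get(ip),  # None if no fingerprint was seen
            "user_agents": sorted(agents_by_ip.get(ip, ())),
        }
        for ip in sorted(attackers)
    ]


# Invented sample data using documentation-only IP ranges.
alerts = [{"src_ip": "192.0.2.10"}, {"src_ip": "192.0.2.10"}, {"src_ip": "198.51.100.7"}]
fps = [{"src_ip": "192.0.2.10", "os": "Linux"}, {"src_ip": "203.0.113.4", "os": "Windows"}]
http = [{"src_ip": "192.0.2.10", "user_agent": "ExampleBrowser/1.0 (Windows)"}]

for row in join_on_source_ip(alerts, fps, http):
    print(row)
```

In this made-up sample the first attacker fingerprints as Linux while its user agent claims Windows, which is the kind of mismatch the analysis is meant to surface.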

SSH Trigrams Visualised in 3D Space
The Packetpig Loaders are the building blocks for analysing full packet captures. There is nothing stopping you from also integrating device log files if required. The Packetpig project also includes a 3D Globe, World Maps and Line Graphs for time series and NGram visualisation.

All of us at Packetloop hope you enjoy the project and we are happy to accept pull requests if you wish to contribute.


15 Mar 2012

Black Hat Europe: Finding Needles in Haystacks (the size of countries)

Thursday, March 15, 2012
At Black Hat Europe 2012 I introduced the subject of Big Data Security Analytics and Network Security Monitoring. The presentation was "Finding Needles in Haystacks (the size of countries)" and you can find the slides on Slideshare or download the [PDF].

I knew the audience wouldn't be familiar with Big Data technologies such as MapReduce, Hadoop and Pig, but that they would have a keen sense of the changing nature of attacks: that they are becoming more subtle, complex, blended and frequent. We only need to look at 2011 and the major companies that were exploited in that year.

During the talk I showed the "Let's Enhance" video and stated that it was a good metaphor for security analysis. It juxtaposes the Hollywood detective with our understanding of the real world. In terms of security, it makes you think of the context you need to find structured attacks against your network. In security we are dealing with a problem of scale and accuracy. Charged with finding needles in haystacks, we can barely capture security events correctly. This is why the video is so funny.

These Hollywood 'analysts' have almost magical tools that afford them capabilities we could only dream of as security analysts. They are:
  • Enriching data, when we constantly face a loss of resolution and fidelity.
  • Playing, pausing and rewinding events, but we have one chance and then it's gone.
  • Exploring data in vector space, building context and entropy, but we are looking at isolated and disconnected events.
  • Focused on detection, whereas the security industry is still heavily focused on prevention.
  • Investigating events after they have happened, but we are geared towards preventing an unknowable future state.
  • Operating on a complete copy of the event, when the best we can often summon is a log or correlated log store.
  • Using algorithms to process features and vectors from data, a subject that is not even being looked at in terms of security.
So I proposed taking the core concepts from Network Security Monitoring (NSM) and combining them with Full Packet Capture (FPC) and Big Data tools to provide the ability to investigate incidents at mass scale.

We delivered this as an open source Big Data NSM tool called @packetpig; you can find the GitHub repository here.

Packetpig can analyse packets at terabyte scale. The data analysis language (like a query language) of Pig lends itself nicely to exploring terabytes of full packet captures. The beauty of Packetpig is you can write a query on your laptop against a small sample of data and then execute the query on the cluster against months or years of traffic captures. Packetpig also comes with a large number of examples.

Packetpig is the first Big Data security tool. It's open source and available for anyone to use. It combines big data analysis with some pretty stunning visualisations, and I demonstrated a number of these during the presentation. They included the Google WebGL Globe displaying 420,000 Snort alerts across approximately 12 days of full packet captures. I also demonstrated the full capabilities of Packetpig in the areas of threat analysis, traffic analysis and payload analysis, including an awesome way of visualising trigrams using an N-Gram Cube. All of these features will be showcased on the blog over time.

Analysing large data sets gives security analysts new capabilities. This was demonstrated towards the end of the presentation, when I used BitTorrent seeders and leechers to triangulate the source of attacks and confirm which IP addresses were common to individual attackers. This involved finding distinct attackers out of 420,000 individual events (3 billion packets) and matching them against 180,000 seeders and leechers we tracked across The Pirate Bay's Top 100 Movies, Music and Books.
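
At its simplest, this kind of matching is a set intersection on IP address. The Python sketch below shows that idea only. The talk's actual method is not described here, so the field names (src_ip, ip, torrent), the helper name and the sample records are all assumptions.

```python
def common_addresses(attack_events, peers):
    """Map each attacking IP that also appears as a BitTorrent peer
    to the torrents it was seen on.

    Hypothetical helper; the record layouts are assumed for illustration.
    """
    # Collapse many alert events into distinct attacker IPs.
    attackers = {event["src_ip"] for event in attack_events}
    # Group the tracked seeders and leechers by IP address.
    torrents_by_ip = {}
    for peer in peers:
        torrents_by_ip.setdefault(peer["ip"], set()).add(peer["torrent"])
    # Keep only the IPs present in both data sets.
    shared = attackers & set(torrents_by_ip)
    return {ip: sorted(torrents_by_ip[ip]) for ip in shared}


# Invented sample data using documentation-only IP ranges.
events = [{"src_ip": "192.0.2.10"}, {"src_ip": "192.0.2.10"}, {"src_ip": "198.51.100.7"}]
swarm = [
    {"ip": "192.0.2.10", "torrent": "movie-a"},
    {"ip": "192.0.2.10", "torrent": "book-b"},
    {"ip": "203.0.113.4", "torrent": "movie-a"},
]
print(common_addresses(events, swarm))
```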

Thanks to everyone that attended the briefing and also those who stayed back to ask questions, discuss their own situations and problems and the capabilities of Big Data Security Analytics.

You can follow @packetpig on Twitter, and also download and use the code on your own traffic captures!


14 Mar 2012

Time to validate

Wednesday, March 14, 2012
The Black Hat conferences are the Mecca for security types. As security consultants, we always dream of attending, and can only aspire to one day be invited to present a paper at such a renowned event. As the concepts and techniques of Packetloop were evolving, Michael thought that a great way to validate the thinking behind the Packetloop concept would be to take these ideas before the world's best security minds. So he responded to the call for papers for Black Hat Europe, wondering if they would accept it.

THEY DID! A very humbling, somewhat daunting but mainly exciting opportunity! So in a little over 5 hours from now Michael will be presenting "Finding Needles in Haystacks (the size of countries)" at Black Hat Europe in Amsterdam.

This presentation will give an insight into how to approach security analytics using Big Data - really big data sets, using full packet captures, and leveraging cloud services to process terabytes of data quickly, coupled with powerful visualisations which introduce new ways of understanding your security exposure.

When Michael returns, we will post some content from the presentation over the coming weeks, including links to many of the resources. All the team at Packetloop wish Michael all the best for the presentation, and the celebrations that will certainly happen afterwards ;)


1 Mar 2012

Welcome to Packetloop!

Thursday, March 01, 2012
Hi, about 10 months ago we set out to see if we could create a better way of analysing and understanding the complex security threats that our customers were constantly facing. Nothing in the market could process the type or amount of data we wanted to review, and that was before we looked at what it would cost our customers to implement.

Packetloop was born (although we didn't know what to call it at that point) and the last 10 months have been chaotic, interesting and rewarding, all at the same time. The creation of Packetloop has given us an even deeper insight into security and uncovered ways of processing huge amounts of raw data that we didn't think were possible. More importantly, the guys have focused on creating stunning visualisations that present complex security relationships in ways never before seen.

Right now we are putting the finishing touches to the software before we release it in Beta form to see what you think of some of our core features. For the time being you can check out more about Packetloop at www.packetloop.com, and be sure to register to receive your invite to our Beta.