Wednesday, October 5, 2011

Dialing in the Malware

Despite several decades of anti-malware defense development, the pro-malware industry is still going strong. As I listen to presentations here at VB2011 in Barcelona this week covering many aspects of malware-based cyber-crime and the advances in detection being made, I'm reminded of a recent posting I made on the Damballa site concerning the success of malware. At the end of the day it costs the attacker practically nothing to generate new malware instances and, with a little investment in a QA process, they can guarantee evasion...

There’s often a lot of discussion about whether a piece of malware is advanced or not. To a large extent these discussions can be categorized as academic nitpicking because, at the end of the day, the malware’s sophistication only needs to be at the level required for it to perform – no more, no less. Perhaps the “advanced” malware label should more accurately be replaced with “feature rich”.

Regardless of whether a piece of malware is designated advanced or run-of-the-mill, and despite all those layers of defense that users have been instructed to employ and keep up to date, even that ever-so-boring piece of yesteryear malware still manages to steal its victims’ banking information.

How is that possible?

I could get all technical and discuss factors such as polymorphism and armoring techniques, but the real answer as to why the malware manages to slip by all those defenses is because the bad guys behind the attack tested it prior to release and verified that it was already “undetectable” before it was shipped down to the victim’s computer. Those host-based defenses had no chance.

It’s worthwhile noting that generating “unique” malware is trivial. Armed with a stock-standard off-the-shelf DIY construction kit, it is possible to manually generate several hundred unique variants per hour. If the cyber-crook is halfway proficient with scripting they can generate a few thousand variants per hour. Now, if they were serious and stripped back the DIY kit and used something more than a $200 notebook, they could generate millions of unique variants per day. It sort of makes all those threat reports by anti-virus vendors that count the number of new malware detected each month or year rather moot. Any cyber-criminal willing to do so could effectively choose what the global number of new malware will be and simply make enough variants to reach that target. I wonder if any online betting agencies will offer worthwhile odds on a particular number being achieved. It may be worth the effort.

Armed with a bag of freshly minted malware, the cybercriminal then proceeds to test each sample against the protection products they’re likely to encounter on potential victims’ computers – throwing out any samples that get flagged as malware by the anti-virus products.

Using a popular malware DIY construction kit like Zeus (retailing for $4,000, or free pirated version via Torrent download networks), the probability of any sample being detected even at this early testing stage tends to be less than 10 percent. If the cybercriminal chooses to also employ a malware armoring tool that average detection rate will likely drop to 2 percent or less.

Obviously this kind of testing or, more precisely, Quality Assurance (QA) is a potentially costly and time-consuming exercise. Never fear though, there are a lot of entrepreneurs only too happy to support the cybercriminal ecosystem and offer this kind of testing as a commercial service.

Today there are literally dozens of online portals designed to automatically test new malware samples against the 40+ different commercially-available desktop anti-virus and protection suites – providing detailed reports of their detection status. For as little as $20 per month cybercriminals can upload batches of up to 10,000 new malware samples for automated testing, with the expectation that they’ll receive a thoroughly vetted batch of malware in return. These “undetectable” malware samples are guaranteed to evade those commercial protection products. As a premium subscription service model, for $50 per month, many QA providers will automatically fix any of the malware samples that were (unfortunately) detected and similarly guarantee their undetectability.

Armed with a batch of a few thousand fully-guaranteed malware samples that are destined to be deployed against their victims in a one-of-a-kind personalized manner, it should be of little surprise to anyone precisely why run-of-the-mill or feature-rich malware manages to infect and defraud its victims so easily.

Tuning Spear Phishing Campaigns

I was recently asked to discuss the tools and tactics of cyber-crime campaigns in relation to advanced spear phishing. One of the interesting service industries that form these advanced criminal ecosystems is ProRing. The following Damballa post summarizes this particular industry...

Despite the advances in anti-spam technologies and mail filtering gateways, if your inbox is anything like mine, each morning there will be a bundle of emails offering a cut of some recently liberated or long-forgotten monies, offers to work from home (all you need is a US bank account!), notifications of bank detail confirmation requests, or some obscure social engineering whatever. We’ve all seen them, and most of us recognize them for what they are – broad-spectrum Internet scam campaigns launched by online crooks.

Again, if you’re anything like me, sometimes you’ll catch yourself laughing at the content of the spam emails. Too often the language is all mixed up, has misspellings, and was obviously written by someone to whom English is a second language.

For the victims, these messages are the start of their problems. For the attackers, the distribution of these messages is roughly a halfway point in their current fraud campaign. For some specialized criminal operators, the content of that email is the culmination of their efforts and contribution.

I was reminded recently by the following very funny (and obviously not serious) tweet that there hasn’t been much attention to the organized crime aspects of translation – in particular, the realm of cybercrime-as-a-service (CaaS).

Figure 1: Humorous tweet in Chinglish with misspellings

It should be no surprise that there are CaaS providers that offer boutique translation services to other Internet criminals.

For quite a few years now there have been folks working behind the scenes translating the content supplied by foreign criminals into the messages arriving in your inbox. I’m not talking about those pidgin-English things you receive and rapidly reject, but rather the ones you’re probably missing based upon a first-pass grammar and spell check. Translation services are rather lucrative for those involved. If you happen to be a fluent English speaker/writer based in Russia, you can make a couple hundred dollars for each phishing email template you convert or social engineering message you construct. For some CaaS operators a percentage of any fraudulently gained funds may be part of the deal – tying the payment to their translation capability and the success of the attacker’s campaign.

Translating the written language is one thing; it is quite another if you have to speak it. As such, there are a number of CaaS operators that specialize in what could be best described as translation call centers. A common name for these kinds of criminal services is “ProRing” – basically “professional ringing” services, tuned to the requirements of criminals (not just online ones either!).

Supporting a small number of languages, ProRing services are often utilized by cyber-criminals in a variety of ways:

* Account change confirmation for stolen and hijacked accounts

* Money mule coordination and bank account management

* Package tracking and delivery

* Vishing message construction

* Spear phishing “helpdesk” impersonation

* Social engineering

Figure 2: ProRing service supporting multiple languages

The larger, more established ProRing providers tend to support the most common languages encountered in Western countries (i.e. English, German, French and Spanish), although other languages may be included – depending upon staffing arrangements and access to external contractors (e.g. Dutch, Serbian, Hebrew, etc.). Several providers also offer male and female speakers.

Rates vary considerably between ProRing providers, but are generally in the realm of $10-$15 per call (made/received), and will increase in price if the speaker does not possess a foreign accent.

The phone numbers being used for the calls will often use callerID spoofing and/or local POP exchanges to hide the international nature of the call. However, it is important to note that many of these ProRing CaaS operators are themselves international and may not necessarily need to obscure their phone number.

Figure 3: ProRing CaaS provider with disclaimers

As with many CaaS providers, ProRing services often come complete with disclaimers and service-level agreements (SLA), which may require financial retainers for participation in longer-running attack campaigns.

So, the next time you’re inspecting your morning email or cycling through those voice-mail messages, you may want to remember that this rapidly evolving cyber-crime ecosystem has your number (literally). Professional ProRing service providers are out there making sure that the next attack is more successful than the last.

Cyber-siege Strategy

The tactical view of cyber-warfare is that of hacking into systems, exfiltrating data and causing systems to self-destruct. It's all a bit Hollywood in many ways, or at least that's the perception of many not intimately involved in dealing with the threat.

I recently wanted to address the strategic concepts of cyber-warfare - in particular the non-destructive aspects of an attack. The first article covering the strategic objectives of modern cyber-war was published yesterday on eSecurityPlanet under the title "Siege Warfare in the Cyber Age".

In the article I point out the value of non-kinetic attacks and the restoration of device control at the end of hostilities (or regime change), and how future cyber-warfare can take on a siege-like approach.

Monday, August 29, 2011

Predicting Crime Hotspots

There’s a new sheriff in town and he’s riding the horse of “predictive policing”. Back in July the Santa Cruz Police Department began deploying police officers to places where crime is likely to occur in the future – making use of new predictive modeling programs that are designed to provide daily forecasts of crime hotspots – thereby allowing the Department to preempt more serious crimes before they occur. You can find a story describing how Santa Cruz is sending in the police before there’s a crime in The New York Times.

In essence, this is another physical-world application of machine learning and clustering technologies – applied to preempting a criminal problem. In the cyber-world we’ve been applying these techniques for a number of years with great success. In fact many of the most important advances in dealing with cybercrime revolve around the replacement of legacy IP reputation systems and domain filtering technologies with dynamic reputation systems – systems easily capable of scaling with both the threat and an ever-expanding Internet (e.g. IPv6).

Just last week Manos Antonakakis (a principal scientist at Damballa Labs) presented at the USENIX Security 2011 conference in San Francisco about a new generation of technology capable of identifying domain names being used for malicious purposes weeks, if not months, in advance of malware samples being intercepted, analyzed and “protected” against by legacy anti-virus approaches.

The patent-pending technology utilizes passive DNS observations within the upper DNS hierarchy, and the paper describing the first generation of research (and cybercrime proof-points) can be found in the paper “Detecting Malware Domains at the Upper DNS Hierarchy“. The system running here within Damballa Labs is affectionately known as “Kopis” and has proved its worth time and again preemptively identifying new botnets and cybercrime campaigns – keeping our Threat Analyst team busy with enumerating the real-world criminals behind the domain abuse.

The Kopis system extends many of the principles and research we learnt and formulated when developing the Notos technology – a next generation dynamic reputation system for DNS.

In several ways the Santa Cruz Police Department’s modeling system approximates an early generation of such a dynamic reputation system – utilizing a mix of long-term observations and historical information, combined with real-time crime updates, the output of which is a forecast capable of predicting daily crime hotspots.

Damballa Labs utilizes Notos and its derivative output evolutions in a number of ways. For example, we’re able to take any observed DNS record (e.g. domain name and resolved IP address) and provide a real-time score of its reputation – even if this is the first time anyone on the Internet has ever tried to resolve that particular domain name. In practice this means that we can predict (with a scale of confidence) that connecting to a device utilizing that particular domain name (or IP) is malicious (or good) and the nature of the threat it represents – all done through passive means, and without having to have observed the maliciousness directly associated with the device anytime in the past.

Systems like Notos make use of big data (i.e. colossal volumes of historical and streaming data) gathered from a global array of sensors. The mix of historical observations and real-time data feeds means that prediction models can be dynamic enough to keep pace with truly agile threats (and threat operators) – and can yield new approaches in unveiling advanced and sophisticated threats. For example, a possible query could be “provide me a list of domain names that are pointing to residential DSL IP addresses within Villainstan, that have never been looked up by any hosts within the country of Villainstan, that have only been looked up by hosts located within Fortune-100 companies in the USA, and where the number of Fortune-100 companies doing so is less than 5 over the last 12 months.” The result of the query would be a (long) list of domain names that are strong contenders for association with APT victims, which then drives specialist counter-intelligence analysts and law enforcement to uncover the nature of the threat.
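
Purely to make that example query concrete, here is roughly how it might be expressed against a hypothetical passive-DNS observation table (pandas is used only for illustration; the column names, and the data source itself, are invented for this sketch and don't reflect how Notos or Kopis are actually implemented):

    import pandas as pd

    # Hypothetical passive-DNS observation table; every column name is invented:
    #   qname                 - domain name that was looked up
    #   ip_country            - country the resolved IP sits in
    #   ip_is_residential_dsl - True if the resolved IP is residential DSL space
    #   client_country        - country of the host performing the lookup
    #   client_f100           - Fortune-100 organization of the looking-up host (or None)
    #   ts                    - observation timestamp
    def likely_apt_victim_domains(pdns: pd.DataFrame) -> list:
        window = pdns[pdns["ts"] >= pd.Timestamp.now() - pd.DateOffset(months=12)]
        hits = []
        for qname, g in window.groupby("qname"):
            points_to_dsl = ((g["ip_country"] == "Villainstan")
                             & g["ip_is_residential_dsl"]).any()
            looked_up_locally = (g["client_country"] == "Villainstan").any()
            only_f100_lookups = g["client_f100"].notna().all()
            distinct_f100 = g["client_f100"].dropna().nunique()
            if (points_to_dsl and not looked_up_locally
                    and only_f100_lookups and 0 < distinct_f100 < 5):
                hits.append(qname)
        return hits

The output is exactly the “(long) list” described above; in a real system the heavy lifting is in collecting and enriching the observations, not in the query itself.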

In the meantime I’ll be watching with keen interest the successes of the Santa Cruz Police Department and their new modeling programs. Here at Damballa we’ve had phenomenal success in using machine learning and advanced clustering techniques in unveiling and forecasting new threats.

Friday, August 26, 2011

Practical Packet Analysis Book Review

This week I had the opportunity to read Chris Sanders’ newly released book “Practical Packet Analysis” (second edition) – published by No Starch Press. While I’m not a frequent reader of technical computing books (they’re always a little too bulky for flights and carryon), I was looking for a book I could recommend and pass on to junior security consultants and threat analysts (as well as a few engineers).

Practical Packet Analysis proved to be a good read and I even managed to pick up a few tips on recent features within Wireshark that I’d not previously had a chance to experiment with; but am now looking forward to applying to real-world traffic.

While the book isn’t deeply technical (it’s not meant to be), it performs a very nice walk-through of the practical aspects of performing network analysis and investigating packet captures. All too often in the past I’ve encountered network analysis books that either skim over the real-world problems an analyst or engineer will encounter, or rapidly descend into the weeds of some obscure and contrived examples. Chris manages to navigate these waters in a clear and informative way. The practical analysis examples provide a breadth of understanding of not only the nuances and features of Wireshark, but also the common problems encountered by analysts tasked with troubleshooting their own networks – the sort of things they need to know ASAP if they’re going to be productive in a minimal amount of time.

A chapter I particularly appreciated for its inclusion centered on how and where you should tap a network in order to perform analysis. You wouldn’t believe how many times that chapter alone could have prevented much wasted effort – if only folks had had access to it (and read it).

On the whole, I’d recommend this book to junior network analysts, software developers and newly minted MCSE/CISSP/etc. – folks that just need to roll up their sleeves and get started troubleshooting network (and security) problems. My copy of the book has already been passed on to a third pair of hands for reading and brushing up on the practical application of Wireshark. Great work Chris!

Saturday, August 6, 2011

Not Endgames Again

With the Blackhat and Defcon conferences back to back, the melting pot that is Vegas has served its purpose in bringing together so many of the world’s leading security researchers, consultants, and opinions. It’s been a tough slog through long days and longer nights, but it’s been so worth it.

While many of the presentations this time round may not have been worthy of previous years’ conferences, the true value of the event really lies in the hallway discussions and logistical movements between the vendor parties – trading invites for favors, and negotiations over beers pre- and post-party. I know that many folks would agree with me when I say that more business deals are secured and contacts negotiated at the Galleria bar of Caesars Palace than all the other event locations combined.

This year there was a lot of discussion in the Galleria Bar relating to exploit development (a big change from the past decade’s worth of vulnerability disclosure debate) – mostly due to the media attention garnered by the HB Gary Federal and Endgame Systems (Endgames) disclosures/revelations over recent months.

Each evening I’d inevitably get pulled into (new) discussions as folks I hardly know (or had only just been introduced to) tried to pump me for insider information about Endgames – somehow assuming I’m involved with that company. Let’s be clear – I have nothing to do with the Endgames business! It’s important that people understand that. The fact that both Endgames and Damballa (where I work) are in the same building in Atlanta is a reflection of shared Georgia Tech heritage and talent recruitment - not to mention $$$ per-square-foot office space rental costs – and is not a conspiracy seeking new enlightenment. And No, I don’t work for Endgames (and never have).

By way of preempting the next recycled batch of grilling from security nuts, weirdos and conspiracy theorists, here are some facts…

  1. Back in 2005 I was enticed to leave NGS Software and London, and assume the role of Director of X-Force in Atlanta after Chris Rouland (the former Director of X-Force – and current CEO of Endgames) took on the role of CTO at Internet Security Systems, after Christopher Klaus (an ISS founder) vacated that particular position. As it happened, I took over responsibility for X-Force just after the Blackhat/Defcon events of 2005 – immediately after the Mike Lynn/Ciscogate affair (so that wasn’t anything to do with me). So, Yes, Chris and I have both held the same titles at ISS and No, Ciscogate was not my fault.
  2. While I was the Director of X-Force, the X-Force group (which consisted of R&D, threat research, detection/protection engineering teams and signature development teams, etc.) reported up through the VP of Engineering. The professional services teams (some of which were/are commonly tagged as “X-Force”) were regionally focused and organized, and so tended to report up through the regional sales organizations (i.e. not my responsibility). This is an important distinction, because ISS wasn’t unfamiliar with some of the professional services that would eventually transfer with the people that kicked off Endgames. So, No, I was not responsible for things labeled as “X-Force” within the professional services division in the US, and Yes, the professional services group(s) did have access at the time to all the latest vulnerabilities and 0-days uncovered by the X-Force R&D teams.
  3. When IBM acquired ISS in October 2006, there were a lot of changes. ISS became IBM ISS and an “Office of the CTO” was established. Given integration challenges and the hope that a center of excellence could be created within IBM to bring together all the great security research done throughout IBM globally – and the hope that the derivative technologies would make it into products within IBM ISS – the responsibilities for X-Force were to be divided and I took on the role of Chief Security Strategist – reporting in to the new “Office of the CTO” – working with Chris Rouland and another founder of Endgames. So, Yes, Chris and I (and several of the eventual founders of Endgames) worked together for a couple of years in the same “office” for IBM ISS.
  4. Some of the (PSS) services ISS had previously provided were not well suited to a company such as IBM and needed to be shut down, or were left to passively wilt while contract renewals weren’t pursued. Several of these services (derivatives and extensions) are directly related to how Endgames came to exist – after the ISS professionals familiar with their delivery, and with a belief in their commercial viability, struck out from IBM ISS to create Endgames and satisfy those customer needs. I was never part of that side of the IBM ISS business. For one thing, I’m a foreigner and didn’t have the appropriate security clearances to get involved. For another, I find some aspects of that particular business model unsavory. So, No, I never had a hand in that side of ISS/IBM ISS’ business.
  5. You can’t swing a stick in Atlanta without hitting an ex-ISSer. The number of security professionals that have passed through ISS over the last decade-and-a-half and gone on to establish and populate new security startups in Atlanta is amazing. This is why you’ll find so many ex-ISSers working at both Endgames and Damballa – and dozens of other security companies in the area! So, Yes, we all know and respect each other and tend to get on well. Endgames is in the same building, one floor below Damballa, and there are several bars within spitting distance of our respective offices.
  6. In the early days of Damballa (which is a startup that sprung out of Georgia Tech), Chris Rouland was on the company’s Technical Advisory Board. For its first few years of existence Damballa was focused on tracking botnets, enumerating the bot-infected victims, and providing that insight as commercial intelligence feeds. Shortly after my joining Damballa in 2009, Damballa stopped providing commercial threat intelligence feeds and focused on appliance-based threat detection solutions. Chris Rouland elected to leave the Damballa Technical Advisory Board shortly before Endgames launched their IPTrust brand/service. So, Yes, in the past there was a relationship between Damballa and Chris Rouland (after all, he created the original X-Force and has been a thought leader in the security community for quite some time) – just not what some people have assumed.
There’s probably a whole bunch of additional questions that folks were battering me with this week in Vegas related to Endgames (and HB Gary Federal by proxy) that I couldn’t be bothered answering then, and I’m not going to bother answering now.

There is no commercial relationship between Endgames and Damballa. Damballa and Endgames are separate commercial entities – doing completely different things in totally different ways, with different objectives, customers and employees. The histories of several folks working at both companies are entwined with the history of ISS and IBM ISS – but that’s it.

And so on to the last conspiracy theory questions; No, I know of no cases of ISS selling vulnerabilities to any foreign entities. And, Yes, I’m still an opponent of middle-man financial models relating to the buying and selling of 0-day vulnerabilities.

Wednesday, July 13, 2011

Threat Intelligence via Sinkholes

Over the last few months I've been seeing more and more folks pimping botnet victim intelligence feeds. Despite the obvious flaws in these feeds, subscriptions are going up - even though most folks don't really understand how to use the intelligence.

Just about all the data being sold is harvested from sinkholes - which happens to be a rather crap way of gathering that kind of information. There are all kinds of limitations to the way the intelligence can be employed - especially from a protection perspective.

By way of education, I've pulled together a post covering the problems with sinkhole-harvested data - from both technology and legal/ethical perspectives.

You can find the posting at the Damballa site - http://blog.damballa.com/?p=1342

Wednesday, April 20, 2011

Oak Ridge National Laboratory Falls for a Spear-Phishing Campaign

An interesting post today on Wired - Top Federal Lab Hacked in Spear-Phishing Attack - details the most recent successful attack against Oak Ridge National Labs.

A couple of the most interesting quotes from the story are:
“The attacker used an Internet Explorer zero-day vulnerability that Microsoft patched on April 12 to breach the lab’s network. The vulnerability, described as a critical remote-code execution vulnerability, allows an attacker to install malware on a user’s machine if he or she visits a malicious web site.”
and...
“The lab began to block the malicious emails soon after they began coming in, but it was already too late. On April 11, administrators discovered a server had been breached when data began leaving the network. Workers cleaned up the infected system, but early Friday evening “a number of other servers suddenly [went] active with the malware,” Zacharia said. The malware had apparently laid dormant for a week before it awoke on those systems. That’s when the lab blocked internet access.”
That's an interesting tactic, and one I haven't seen for a long time. Back in the 2003-2004 era I observed a similar kind of trigger approach being used for targeted attacks against the petrochemical industry (largely associated with organized crime teams that traced back to the Balkans).

Thursday, March 10, 2011

Optimal Methods for Spam and DDoS Offender Discovery

As botnet threats go, Spam and DDoS are probably the most widely known and discussed tactics employed by criminal operators. Despite being some of the last things that career botnet operators employ their compromised victims for, and despite offering the lowest monetization rates for the criminals, DDoS and Spam volume have continued to rise annually.

I was recently asked which techniques work best for dealing with DDoS and Spam participation from within large enterprises or residential DSL/Cable networks – Network Anomaly Detection Systems (NADS) or botnet command-and-control (CnC) enumeration techniques (such as those employed by Damballa)?

It’s not the kind of question that can be answered succinctly. Both approaches are designed to scale to very large networks – and as such are components of a robust protection strategy. In fact the technologies are rather complementary – although I do think that the CnC enumeration approach is more elegant and efficient in the grand scheme of things.

The NADS approach to Spam and DDoS participation detection is simple enough – you monitor netflow (a compact summary of network packet flow – usually to/from IP address, port, protocol, date/time and packet size information), determine a baseline for traffic levels, set alert thresholds for potential anomalies, and define responses when a threshold alert is received. In the context of a simple DDoS threat, you set up a threshold for the volume of HTTP traffic directed at a single destination by a single IP host and label that host as initiating a DDoS attack. If multiple hosts within the network being monitored also reach the HTTP threshold(s) against the same target IP address, you label them all as being part of a DDoS botnet. The same basic principles apply to Spam botnet detection.
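
As a toy illustration of that thresholding logic, the sketch below walks a batch of flow records and flags internal hosts pushing an unusual volume of HTTP traffic at the same destination. The record fields and the threshold value are invented for the example; a real NADS baselines these per network and per time of day.

    from collections import defaultdict

    # Toy NADS-style thresholding over flow records (all field names are invented).
    # Each flow record: {"src": ..., "dst": ..., "dport": ..., "bytes": ...}
    HTTP_BYTES_THRESHOLD = 500_000_000  # per host, per destination; purely illustrative

    def find_ddos_participants(flows):
        http_volume = defaultdict(int)          # (src, dst) -> bytes of HTTP traffic
        for f in flows:
            if f["dport"] == 80:
                http_volume[(f["src"], f["dst"])] += f["bytes"]

        offenders_by_target = defaultdict(set)  # dst -> internal hosts over threshold
        for (src, dst), volume in http_volume.items():
            if volume > HTTP_BYTES_THRESHOLD:
                offenders_by_target[dst].add(src)

        # Multiple internal hosts hammering the same target looks like DDoS participation.
        return {dst: srcs for dst, srcs in offenders_by_target.items() if len(srcs) > 1}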

An alternative and generally complementary approach to the problem is to automatically identify hosts within the monitored network that are already infected with malware and/or engaged in conversations with botnet CnC servers. This can be achieved in a variety of ways, but one of the simplest ways is to merely observe the DNS requests made by the hosts and the responses from the resolving DNS servers. Having identified suspicious DNS request profiles along with DNS responses that have high probabilities of association with criminal hosting infrastructure, it’s possible to quickly match victims with particular botnets – and label the new (or previously known) CnC fully qualified domain name. Any other hosts exhibiting similar DNS resolution characteristics are members of the same botnet. The beauty of this approach is that this method of detection and botnet enumeration (and labeling) can be done before the botnet victims actually participate in any subsequent Spam or DDoS campaigns.
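
The grouping step might look something like this sketch, assuming a feed of (internal host, queried domain, resolved IP) tuples and some stand-in reputation check; none of it reflects any particular product's implementation.

    from collections import defaultdict

    # Illustrative grouping of internal hosts by the suspicious CnC domains they resolve.
    # dns_events: iterable of (internal_host, queried_domain, resolved_ip) tuples.
    # is_suspicious(domain, ip): stand-in for whatever DNS reputation logic is available.
    def enumerate_botnets(dns_events, is_suspicious):
        members_by_cnc = defaultdict(set)  # CnC domain -> internal hosts resolving it
        for host, domain, ip in dns_events:
            if is_suspicious(domain, ip):
                members_by_cnc[domain].add(host)
        # Hosts sharing the same suspicious resolution profile get labeled as one botnet,
        # keyed by the CnC domain, before any spam or DDoS traffic is ever observed.
        return members_by_cnc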

When it comes to mitigating the threat, the historical way is to effectively block the attack traffic by either firewalling off specific ports or destination IP addresses, or walled gardening the malignant hosts. So, while the botnet host is spewing spam or DDoS traffic, it’s not being routed to its final (target) destination.

That approach may have been OK in the past if you were only dealing with IP-based threat responses and could stomach the voluminous traffic internally, but with more advanced CnC and botnet enumeration technologies you’re able to bring to bear some additional (and more versatile) mitigation techniques. Since you’re constantly identifying and tracking botnet membership and you know which CnC’s these victims are being controlled by, you could perform one or more of the following actions:

  1. As botnet members begin to participate in the DDoS attack or Spam campaign, traffic to and from the CnC server could be blocked. By doing so, no new commands are sent to the botnet victims and they typically cease their attacks. In addition, any other botnet members within the network who have not yet been tasked to participate in the attack will similarly not be able to receive instructions.
  2. Walled Gardens can be selectively initiated around the infected botnet population – blocking just the ports and protocols being used (or likely to be used) in the attack against remote targets – without applying the same blocking to all hosts or subscribers within the network. For example, a botnet may be tasked with DDoSing a popular financial services web portal using a HTTP-based payload. It would therefore be important to only block the attack traffic and allow legitimate traffic through. A walled garden approach could be used in this scenario without having to utilize Deep Packet Inspection (DPI) to differentiate between the attack and legitimate traffic.
  3. The ability to differentiate CnC server activity at the domain name level is important for botnets that utilize fast flux infrastructure to distribute command-and-control over large numbers of IP addresses. If recursive DNS services are provided by the organization to their enterprise hosts or subscribers, an alternative DNS response could be sent to the botnet victims – e.g. making botnet.badness.com.cc resolve to localhost (127.0.0.1), as sketched in the example after this list.
  4. If DPI or PCAP capabilities exist within the organization, they could be selectively deployed to catalog the criminal communications between the botnet members and the CnC server. This detailed evidence of the attack (including the commands being sent by the CnC) can be used for takedown or prosecution purposes.
  5. If the botnet malware agent is relatively unsophisticated, or if the CnC server itself is vulnerable to third-party takeover (e.g. a hacked server over which the legitimate owner regains control and can then issue commands to the botnet, or a botnet CnC portal whose code contains remotely exploitable vulnerabilities), it may be possible to issue commands “on behalf” of the criminal operator instructing all the botnet members to stop their attack and to automatically uninstall the malware agent.
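
By way of illustration for item 3, here is a minimal sketch of the selective DNS answer rewriting idea. The domain list and the resolver hook are hypothetical; in practice this would usually be done with resolver policy features such as BIND's Response Policy Zones rather than hand-rolled code.

    # Sketch of the DNS rewriting idea from item 3: if a client asks for a known CnC
    # domain, hand back localhost instead of the real answer. Domain names are made up.
    CNC_DOMAINS = {"botnet.badness.com.cc", "update.evil-example.net"}

    def answer_for(qname, resolve_upstream):
        name = qname.rstrip(".").lower()
        if name in CNC_DOMAINS or any(name.endswith("." + d) for d in CNC_DOMAINS):
            return "127.0.0.1"          # sinkhole the CnC lookup for infected members
        return resolve_upstream(qname)  # everyone else gets normal resolution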

There are of course many other imaginative ways to use the knowledge of the botnet CnC and its members in preemptive protection strategies too.

I think that NADS-based botnet detection (or more precisely botnet attack traffic detection) is useful for identifying triggers for remediation action – but I think that botnet CnC enumeration techniques can provide greater flexibility in long-term threat management approaches.

GeoIP Irrelevance

GeoIP has traditionally served as a first pass filter for prioritizing the analysis of inbound threats. Over the last few years the value of GeoIP for this purpose has noticeably depreciated and it’s only going to get worse. It’s all relative of course; “worse” doesn’t mean useless, just less valuable in a security context.

At its heart, GeoIP is essentially a mapping between an IP address and some location on a map – and that location may be as specific as a street and postcode, or as broad as a country’s name.

It’s important to note that the various Internet authorities don’t actually administer these IP distribution maps. Unfortunately, there isn’t anything prohibiting (or forcing) IP addresses from being linked to a particular geographical location beyond the registration of netblocks (ranges of contiguous IP addresses) to various entities and where they ultimately choose to host their equipment.

The correlation between IP address and geographical location is left to various organizations (mostly commercial) that have invested in systems making use of a mix of data mining, beaconing and solicitation to obtain actual location information – and this information is bundled up and sold in various consumable formats.

The accuracy of the GeoIP information has always been “variable”. For IPs associated with large residential ISPs operating in Western countries – the data is pretty accurate since much of that information has actually been supplied by the subscribers themselves (one way or another – whether they meant to disclose it or not). For IPs associated with large international organizations – the location data is more often than not meaningless – since it often only reflects the address of the organization’s global headquarters rather than the IPs being used in their various offices and data centers. I’ve found that the more obscure an organization is and the larger their netblock of assigned IP addresses, the less likely the GeoIP information will be accurate.

Those artifacts of GeoIP have always been present, but why are things getting worse? There are effectively 3 key aspects as I see it:
  1. You’ve probably heard the news (repeatedly over the last 5 years) that IPv4 addresses are running out and just last month the last /8s were allocated. What this means is that there’s growing pressure to optimize, divide and reassign existing netblock allocations. The result of this is that IP addresses are changing hands – between ISPs, organizations, hosting facilities and even countries – at a pace faster than traditional GeoIP service providers can track accurately. This obviously has a catastrophic effect on IP reputation systems too – but I’ll address that issue in a later blog.
  2. The growth of cloud computing, on-demand service provisioning and global balancing of content delivery networks has meant that larger swathes of IP addresses are incorporated into umbrella corporate locations – typically their main data center location. Meanwhile, the organizations utilizing these services may be located anywhere around the world. For example, an organized crime syndicate in Thailand could launch a spear-phishing campaign against Cambodian businesses – sending emails from the US-based Amazon EC2 cloud, and hosting the fraud server within the UK-based ElasticHosts cloud.
  3. There are more service providers offering services that can be easily leveraged for criminal purposes and further obfuscate the true source of an attack – often intentionally (e.g. bullet-proof hosting providers and “privacy protection” services). The trend towards federated development and provisioning of cybercrime attacks means that the GeoIP information resolves poorly to the generic hosting providers – whose services can be acquired from anywhere around the world. Often the GeoIP data is simply incorrect – as the service providers have altered or tampered with key registration and hosting details.

That all said, GeoIP information is still an incredibly useful first-pass filter for dealing with and prioritizing threat responses.

How can organizations use GeoIP information to supplement their security response?

  1. Most businesses aren’t global and even the global ones don’t necessarily have all offices continuously communicating with all regions of the planet. Create a list of countries or regions that are generally deemed “hostile” and automatically escalate actions based upon observed attacks from that list. As unsavory as it sounds, most organizations can easily compile such a list when pressed – and many will find that simply blocking or dropping traffic to/from those countries will be greatly beneficial. For example, a US-based chain of frozen yogurt stores probably doesn’t need to browse web sites hosted in Somalia and is unlikely to want VPN access attempts initiated from Cyprus. (A simple sketch of this kind of filtering follows this list.)
  2. While the bad guys can certainly launch their attacks from “friendly” countries (and even locally) via purchased services or compromised hosts, a sizable percentage of threats encountered on a daily-basis for most organizations do little to hide their source. Therefore, distinguishing between portal login attempts (and failures) initiated from IP addresses based in Beijing China and Atlanta USA can be fruitful in optimizing threat responses.
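
A trivial sketch of the escalation logic from items 1 and 2 follows, with a placeholder geo_country() lookup standing in for whatever GeoIP database the organization already licenses; the country codes, event fields and actions are illustrative only.

    # Country-list escalation from items 1 and 2. geo_country() is a stand-in for a
    # real GeoIP lookup; HOSTILE_COUNTRIES is whatever list the organization settles on.
    HOSTILE_COUNTRIES = {"SO", "KP"}  # illustrative ISO country codes

    def triage(event, geo_country):
        country = geo_country(event["src_ip"])  # e.g. "US", "CN", or None if unknown
        if country in HOSTILE_COUNTRIES:
            event["action"] = "block_and_escalate"
        elif event.get("type") == "portal_login_failure" and country not in (None, "US"):
            event["action"] = "escalate"  # failed logins from abroad get a closer look
        else:
            event["action"] = "log"
        return event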

Of course all bets are off for more sophisticated and targeted threats. But a good deal of work can be saved by using GeoIP relationship data to filter out many criminal and persistent threats.

Nuclear Winter PCAP Repositories

Recently I've been thinking about the catchall approach to security - in particular the absolute-last-stop method of just recording everything on your network and mining it for security events - kind of like surviving a nuclear winter. Here are some additional thoughts...

The other week I spoke at the DoD Cyber Crime Conference here in Atlanta and had a number of questions asked of me relating to the growing number of vendors offering “store it all” network monitoring appliances. That whole approach to network monitoring isn’t an area of security I’ve traditionally given much credence to – not because of the practical limitations of implementing it, nor the inefficiencies and latency of the techniques – but because it’s an inelegant approach to what I think amounts to an incorrectly asked question.

Obviously, given the high concentration of defense and law enforcement attendees that such a conference attracts, there’s an increased emphasis on products that aid evidence gathering and data forensics. The “store it all” angle effectively encompasses devices that passively monitor an organization’s network traffic and store it all (every bit and PCAP) on a bunch of disks, tapes or network appliances so that, at some time in the near future, should someone ever feel the need to (or be compelled to), it would be conceptually possible to mine all the stored traffic and forensically unravel a particularly compelling event.

Sounds fantastic! The prospect of having this level of detailed forensic information handy – ready to be tapped at a moment’s notice – is likely verging on orgasmic for many of the “lean forward” incident response folks I’ve encountered over the years.

The “store it all” network monitoring approach is a pretty exhaustive answer to the question “How can I see what happened within my network if I missed it the first time?” But shouldn’t the question be more along the lines of “How can I detect the threat and stop it before the damage is done?”

A “store it all” approach to security is like the ultimate safeguard – no matter what happens, even if my 20 levels of defense-in-depth fail, or someone incorrectly configures system and network logging features (causing events to not be recorded), or if multiple layers of internal threat detection and response systems misbehave, I’d still have a colossal data dump that can eventually be mined. Believe me when I say that I can see some level of comfort in adopting that approach. But the inefficiencies of such a strategy make my eye twitch.

Let’s look at some scoping numbers for consideration. Imagine a medium-sized business with a couple hundred employees. Assume for the moment that all those folks, along with several dozen servers, are located in the same building. A typical desktop system has a 1Gbps network interface nowadays, and the networking “backbone” for a network of 250 devices is likely to have a low-end operating capacity of 10Gbps – but let’s assume that the network is only 50% utilized throughout the day. After a little number crunching, if you were to be capturing all that network activity and seeking to store it, you’d be amassing 54TB of data every day – so, perhaps you don’t want to capture everything after all?

How about reducing the scale of the problem and focusing upon just the data going to and from the Internet via a single egress point? Let’s assume that the organization only has a 10Mbps link to their ISP that’s averaging 75% utilization throughout the day. After a little number crunching, you’ll arrive at a wholesome 81GB of data per day. That’s much more manageable and, since a $50k “store it all” appliance will typically hold a couple of Terabytes of data without too many problems, you’d be able to retain a little over three weeks of network visibility.
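
For anyone who wants to check the arithmetic behind the 54TB and 81GB figures, here it is spelled out (decimal units, 86,400 seconds in a day):

    SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

    def bytes_per_day(link_bps, utilization):
        # bits per second * utilization, divided by 8 for bytes, times seconds per day
        return link_bps * utilization / 8 * SECONDS_PER_DAY

    internal = bytes_per_day(10e9, 0.50)  # 10Gbps backbone at 50% -> ~54 TB/day
    egress   = bytes_per_day(10e6, 0.75)  # 10Mbps Internet link at 75% -> ~81 GB/day
    print(f"{internal / 1e12:.0f} TB/day internal, {egress / 1e9:.0f} GB/day at the egress")

At roughly 81GB per day, a couple of terabytes of appliance storage works out to the “little over three weeks” of retention mentioned above.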

How does this help your security though? Storing the data isn’t helping on a protection front (neither preemptive nor reactive), and it’s not going to help identify any additional threats you may have missed unless you’re also investing in the tools and human resources to sift through all the data.

To use an analogy, you’re a farmer and you’ve just invested in a colossal hay barn, you’ve acquired the equipment to harvest and bundle the hay, and you’re mowing fields that are capable of growing more hay than you could ever seek to perpetually store. Then someone informs you that one of their cows died because it swallowed a nail that probably came from your hay – so you’d better run through all those hay bales stored in your barn and search for any other nails that could kill someone else’s cow. The fact that the cow that died ate from a hay bale that’s no longer stored in your (full) barn is unfortunate I guess. But anyway, you’re in a reactive situation and you’ll remain in a reactive phase no matter how big your barn eventually becomes.

If you’ve got a suspicion that metal objects (nails, needles, coins, etc.) are likely to be bad juju, shouldn’t you be seeking them out before you’ve gone to all the work of filling your barn with hay bales? Wouldn’t it make more sense to perhaps use a magnet and detect those metal objects at the time you’re cutting the hay – before you’re putting it in a bale, and before you put those bales in your barn? Even if you had no forethought that metal objects in your hay could eventually cause a problem, do you persist with a strategy of periodically hunting for the classic “needle in a haystack” in your barn despite now knowing of the threat?

Getting back to the world of IT security and threat detection (and mitigation)… I’ve found that there are greater efficiencies in identifying threats as the network data is streaming by – rather than reactive post-event data-mining approaches.

I guess I’ll hear some folks ask “what about the stuff they might miss?” There are very few organizations that I can think of able to employ the skills and resources needed to analyze the “store it all” network traffic at a level even remotely comparable to what a security product vendor already includes in their commercial detection offerings – and those vendors are typically doing their analysis in a streaming fashion (and usually with something more sophisticated than magnets).

My advice to organizations looking at adopting “store it all” network monitoring appliances is the following:

  1. If you already have all of your protection and detection bases completely covered, maybe deploying these appliances makes sense – provided you employ the dedicated security analysts and incident response folks to make use of the data.
  2. Do you know what you’re trying to protect? “Store it all” approaches are designed to fill in the gaps of your other threat monitoring and detection systems. Is the threat going to be present at the network egress point, or will you need to store traffic from other (higher-volume) network segments? If so, be cognizant of how far back you can roll your eventual analysis.
  3. If you’re into hoarding data for the purpose of forensics and incident response, a more efficient and cost effective approach may be to turn on (and optimize) your logging capabilities. Host logging combined with network logging will yield a very rich data set (often richer than simply storing all network traffic) which can be mined much more efficiently.
  4. If host-based logging isn’t possible or is proving to be too unwieldy, and you find yourself having to maintain a high paranoia state throughout the organization, you may want to consider implementing a flow-based security approach and invest in a network anomaly detection system. That way you’ll get near real-time alerting for bespoke threat categories – rather than labor-intensive reactive data-mining.
  5. If you have money to burn, buy the technology and begin storing all the PCAP data you can. Although I’d probably opt for a Ferrari purchase myself…

Wednesday, February 23, 2011

Threatology

Just a recap on some thinking covering threats and the folks who study them...

One of the key principles to understanding the threat is having the ability to monitor it. Within an enterprise environment security teams instrument the network in the form of protection technologies and stream alerts back to management consoles or aggregate multiple alert streams into centralized SIEMs (or equivalent). Without sounding too deprecating, as difficult as it is to monitor threats within an enterprise, it’s nothing like monitoring Internet-bound threats.

I know that plenty of organizations profess to monitoring threats as they propagate across the Internet – often providing threat feeds to caring organizations (typically for a fee), or incorporating the processed threat data into tools and technologies behind the scenes. The problem is that much of this monitoring is based upon point sampling and is heavily biased to the organization’s geographic presence – and that’s before we get into the technical aspects of the monitoring systems in play.

In very basic terms you could think of it a bit like radio. Geographical distance and topology affect our ability to listen to a particular radio channel. The type of radio set and the frequency range it is capable of intercepting (e.g. AM, FM and shortwave) dictate the overall “richness” and quality of what we’re listening to. The mix of just these few simple variables greatly affects our globe-spanning listening pleasure. Even then, given a top-of-the-range radio placed on the highest mountain with the clearest “line of sight” in the world, reception capability is still limited and it probably isn’t going to interpret digital terrestrial TV signals.

Understanding the threats that plague the Internet and infiltrate the enterprise network is more than just instrumentation and regular mechanical sampling. To grasp the threat you need to understand the limitations of your threat visibility, constantly upgrade and extend the monitoring systems, and finally augment that visibility with data analysis systems capable of managing, correlating and analyzing huge volumes of streaming data. Even then there’s still a high degree of “art” to interpreting the nature of an Internet-spanning threat.

To my mind the methods, skills and acumen to understanding and tracking Internet threats are eerily similar to meteorology. Perhaps I’m biased – I specialized in Atmospheric Physics at University after all – but those skills and experiences I gained in meteorology can increasingly be applied to studying Internet threats. In particular, those of forecasting and dealing with abrupt changes of chaotic systems.

Let me propose the concept of Threatology – the study and analysis of Internet threats – and the Threatologists who study and understand it. Much of threatology is still an art – but that’s OK. Sure, there are millions of sensors scattered around the Internet (in the form of IDS sensors, AV sensors, web crawlers, spam traps, etc.) feeding data back to the threatologists for analysis – just as there are rain gauges, barometers, thermometers, anemometers and Doppler radar, etc. feeding data to meteorologists – but the real work goes into feeding the big modeling systems designed to digest the streaming data and forecasting what’ll happen next.

Today’s threatologists are still battling the intricacies and limitations of the sensors they’ve deployed (or have access to) around the Internet. Take for example the data feeds gained from the tens-of-millions of deployed desktop anti-virus products out there that phone-home with the latest things their subscribers have been infected with. An analogy would be the millions of amateur meteorologists submitting their latest rain gauge data back to the national meteorology department. Intricacies such as make and manufacturer of the gauge (affecting what’s actually being measured), physical location (e.g. under a tree or patio, or in the middle of a one-acre yard), geographical location (95% located in suburbia, 3% in farms, etc.), cleaning regime (the sensor’s full of autumn leaves or mud) and technical skill of the amateur operator – greatly limit the usefulness of this “invaluable” data source.

Over the last five decades meteorologists have employed ever-more advanced weather modeling systems that take in all this sensor data, apply historical trends and prediction routines, and manage to provide fairly accurate forecasts a few days out into the future. Threatologists meanwhile only have a couple of years playing with their own threatology modeling systems – and there’s a long way to go. There’s a lot to be learned from meteorology and the tools that have been developed thus far. Sure, there are many differences in the specifics of the data and nature of the threat – but the dynamic and chaotic characteristics (using the mathematical definition) of the threat are things that have already been “solved” in meteorology.

Welcome to the era of threatology and the professional threatologists.

Reinventing the Sandpit

Sometimes it feels that the IT security world loves innovation as much as it loves to reinvent the wheel – particularly when it comes to wrapping sheets of tin around a previously established security technology and labeling it as advancement. The last few weeks have been no exception in the run up to the annual RSA conference in San Francisco and the recent “innovation” going on in dealing with next generation malware (or AV+ as some folks refer to it) as multiple vendors launch new appliances to augment their product portfolio.

The latest security technology to undergo the transformation to tin is of course the automated analysis of suspicious binaries using various sandboxing techniques. For those of you not completely familiar with sandboxing, a sandbox is effectively a small self-contained version of a computer environment offering a minimal suite of services and capabilities. As the name implies, a sandbox serves as a safe environment for running various applications that may be destructive in other circumstances – yet can be rapidly built up and torn down as necessary.

In an enterprise security context, sandboxes are regularly encountered in two operational security implementations – safe browser sandboxes (designed to wrap around the web browser and protect the operating system from any maliciousness that may occur while the user is browsing the web and prevent attacks from contaminating the base operating system) and gateway binary introspection (i.e. the automatic duplication or interception of suspicious binary files which are then executed within a sandbox that mimics a common operating system configuration for the purpose of identifying and classifying any malicious binaries they come across).

The sandbox approach to malware identification is often referred to as signature-less and offers many advantages over classic anti-virus technologies, but it also suffers from its own unique set of limitations and inconveniences – most of which have to do with the way in which malware can discover that it is being executed within a sandboxed environment and thus act benignly, and with limitations to the faithfulness with which the sandbox imitates a genuine targeted system (e.g. installed applications, application versions, Internet connectivity, etc.). In general though, sandbox approaches to automated malware inspection and classification are more sophisticated and accurate than signature-based anti-virus approaches.

Despite what you may have heard in the flurry of newly released AV+ solutions, automated malware sandbox approaches aren’t precisely new – in fact they’ve had over a decade of operational and, dare I say it, “hostile” use. For example, Damballa has been operating sandboxing technology in the cloud pretty much since the inception of the company. We’ve chosen to use multiple sandbox technologies (along with bare-metal systems, manual analysis, etc.) to automatically process the mountains of new malware captured every day to mechanically extract their network characteristics, automatically cluster new malware families, and provide attribution to multiple criminal organizations.

Note that, from a product perspective, Damballa doesn’t run malware sandboxing technology from within a customer’s environment – there’s little to be gained from doing so, and the risks greatly outweigh the possible gain. Instead, the automated analysis of suspicious and vetted binaries using cloud-based malware enumeration technologies (which includes very sophisticated sandbox approaches amongst other specialized malware dissection engines) has proven to be more accurate, efficient and secure.

Over the years, many different malware analysis sandbox technologies have been developed. For example (not a complete list):

  • Norman Sandbox (2001) – In 2001 Norman presented its sandbox technology for the first time at the Virus Bulletin conference in Prague, and offered a commercial sandbox version in 2003.
  • CWSandbox (2007) – Originally created by researchers from the University of Mannheim. Available commercially from GFI Software (formerly Sunbelt Software) and for free/academic use via http://mwanalysis.org
  • Sandboxie (2006)
  • Anubis (2006)
  • Joebox (2007)
  • Azure (2008)
  • BitBlaze (2008)
  • ThreatExpert (2008)
  • Ether (2009)
Each sandbox technology tends to be implemented in a different way – usually optimized and tuned for specific classes of malware (or aspects of malware) – and typically utilizes either an emulator or virtual-machine approach. Emulators tend to be much smaller and faster in analyzing specific classes of malware, but suffer from their greatly limited range of supported (i.e. emulated) operating system APIs. Virtual machine approaches tend to be much more flexible, but are larger and slower.

Over the last decade, virtual machine (VM) based approaches have risen to the fore for automated sandbox approaches to malware investigation. The VM approach allows multiple guest OS images to be loaded simultaneously in order to run the malware within a self-contained and disposable environment. Interestingly enough, as a side note, did you know that the concept of running multiple, different operating systems on a single computer system harkens back to the 1970s, following research by IBM and the availability of the IBM VM/370 system? Talk about coming full circle with “what’s old is new again” in security.

For sandboxing technologies, a combination of API hooking and/or API virtualization is often used to analyze and classify the malware. A term you will often see is “instruction tracing”, which refers to the observations recorded by the sandbox technology that are eventually used to derive the nature of the binary sample under investigation. This instruction tracing lies at the heart of sandbox-based approaches to automated malware analysis – and is the Achilles heel exploited by evasive malware.
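
To give a rough feel for what “deriving the nature of the binary” from a trace can look like, here is a deliberately simplified, hypothetical sketch: it takes an ordered list of API names recorded by a tracer and maps them to behavioral labels. The API names and rules are illustrative only; production sandboxes record far richer traces (arguments, return values, file and registry artifacts, network activity) and apply far more nuanced logic.

```python
# Hypothetical sketch: turning an API-call trace into behavioral labels.
# The patterns below are toy examples, not a real sandbox's rule set.
SUSPICIOUS_PATTERNS = {
    "keylogging":        {"SetWindowsHookExA", "GetAsyncKeyState"},
    "process_injection": {"OpenProcess", "WriteProcessMemory", "CreateRemoteThread"},
    "persistence":       {"RegSetValueExA"},
}

def classify(trace):
    """trace: ordered list of API names observed while the sample ran."""
    seen = set(trace)
    return [label for label, apis in SUSPICIOUS_PATTERNS.items()
            if apis.issubset(seen)]

example_trace = ["OpenProcess", "VirtualAllocEx", "WriteProcessMemory",
                 "CreateRemoteThread", "RegSetValueExA"]
print(classify(example_trace))   # ['process_injection', 'persistence']
```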

Instruction tracing is typically implemented in one or more of the following ways:

  • User-mode agent – a software component is installed within the guest operating system and reports all user-mode activity to the trace handler (think of this as something like a keylogger).
  • Kernel-mode patching – the kernel of the guest operating system is modified to accommodate the tracing requirements (think of this as something like a rootkit).
  • Virtual machine monitoring – the virtual machine itself is modified and instrumented to observe the activities of the guest operating system.
  • System emulation – a hardware emulator is modified to hook the appropriate memory, disk I/O and peripheral functions, etc., and report activities (think of this as a hall-of-mirrors approach). Emulation approaches are great for more difficult operating systems (e.g. Android, SCADA systems, etc.).

Unfortunately, each of these sandboxing techniques exhibits system characteristics that can be detected by the malware being analyzed and, depending upon the nature of the malware, used programmatically to avoid detection.

Despite all these limitations, the sandbox approach to malware analysis has historically proven useful in analyzing the bulk of everyday malware.
In more recent years the approach has become less reliable as malware developers have refined their sandbox detection methods and evolved more subtle evasion techniques. Many of these detection methods are actually independent of the sandboxing technique being used – for example, the multitude of network-based discovery and evasion techniques discussed in my previous whitepaper “Automated In-Network Malware Analysis”.
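
One commonly described network-based check (a hypothetical sketch of my own, not something lifted from the whitepaper) is to resolve a host name that should not exist. Sandboxes that simulate Internet connectivity often answer every DNS query so the sample will keep running, and that over-helpfulness gives the analysis environment away.

```python
# Hypothetical sketch: detect faked DNS resolution inside a sandbox.
# A real network returns NXDOMAIN for the bogus name; a sandbox that
# answers everything "resolves" it and exposes itself.
import socket
import uuid

def dns_is_faked():
    bogus = "%s.example-nonexistent.invalid" % uuid.uuid4().hex
    try:
        socket.gethostbyname(bogus)   # should fail on a real network
        return True                   # it resolved, so answers are being faked
    except socket.gaierror:
        return False                  # NXDOMAIN as expected

if __name__ == "__main__":
    print(dns_is_faked())
```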

The sandbox approach to automated malware identification and classification needs to be backed up with more advanced and complementary malware detection technologies. Organizations facing the brunt of targeted attacks and advanced persistent threats should make sure that they have access to sandbox analysis engines within their back office for the bulk processing of malware samples (running multiple configurations of the standard desktop OS builds, or gold images, deployed within the organization), and include a mix of bare-metal and honey-pot systems to handle the more insidious binary files. Even then, executing malware within your own organization’s network or physical location is risky business for the reasons I covered in an earlier blog on the topic – you’re “damned if you do, and damned if you don’t”.
If you’re going to go to all the effort of installing and maintaining malware analysis sandboxes within your own organization, my advice is to look beyond the latest installment of tin-wrapped hype and take a closer look at the more established sandbox technologies out there. There’s plenty of choice – and many are free.

Post-emptive Detection

In the week before RSA I managed to pull together a blog on the Damballa site covering several of the problems with approaches that focus upon storing “all” the data and (eventually) data mining it in the quest for security alerts – aka “Store it all in my barn”. Here's what I had to say...

The other week I spoke at the DoD Cyber Crime Conference here in Atlanta and had a number of questions asked of me relating to the growing number of vendors offering “store it all” network monitoring appliances. That whole approach to network monitoring isn’t an area of security I’ve traditionally given much credence to – not because of the practical limitations of implementing it, nor the inefficiencies and latency of the techniques – but because it’s an inelegant approach to what I think amounts to an incorrectly asked question.

Obviously, given the high concentration of defense and law enforcement attendees that such a conference attracts, there’s an increased emphasis on products that aid evidence gathering and data forensics. The “store it all” angle effectively encompasses devices that passively monitor an organization’s network traffic and store it all (every bit and PCAP) on a bunch of disks, tapes or network appliances so that, at some time in the near future, should someone ever feel the need or be compelled to do so, it would be conceptually possible to mine all the stored traffic and forensically unravel a particularly compelling event.

Sounds fantastic! The prospect of having this level of detailed forensic information handy – ready to be tapped at a moment’s notice – is likely verging on orgasmic for many of the “lean forward” incident response folks I’ve encountered over the years.

The “store it all” network monitoring approach is a pretty exhaustive answer to the question “How can I see what happened within my network if I missed it the first time?” But shouldn’t the question be more along the lines of “How can I detect the threat and stop it before the damage is done?”

A “store it all” approach to security is like the ultimate safeguard – no matter what happens, even if my 20 levels of defense-in-depth fail, or someone incorrectly configures system and network logging features (causing events to go unrecorded), or if multiple layers of internal threat detection and response systems misbehave, I’d still have a colossal data dump that can eventually be mined. Believe me when I say that I can see some level of comfort in adopting that approach. But the inefficiencies of such a strategy make my eye twitch.

Let’s look at some scoping numbers for consideration. Imagine a medium-sized business with a couple hundred employees. Assume for the moment that all those folks, along with several dozen servers, are located in the same building. A typical desktop system has a 1Gbps network interface nowadays, and the networking “backbone” for a network of 250 devices is likely to have a low-end operating capacity of 10Gbps – but let’s assume that the network is only 50% utilized throughout the day. After a little number crunching, if you were to capture all that network activity and seek to store it, you’d be amassing 54TB of data every day – so perhaps you don’t want to capture everything after all?

How about reducing the scale of the problem and focusing upon just the data going to and from the Internet via a single egress point? Let’s assume that the organization only has a 10Mbps link to their ISP that’s averaging 75% utilization throughout the day. After a little number crunching, you’ll arrive at a wholesome 81GB of data per day. That’s much more manageable and, since a $50k “store it all” appliance will typically hold a couple of terabytes of data without too many problems, you’d be able to retain a little over three weeks of network visibility.
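
For anyone who wants to check the arithmetic, here is a small sketch of the back-of-the-envelope calculation, assuming decimal units (1TB = 10^12 bytes), 86,400 seconds per day, and the sustained utilization figures quoted above.

```python
# Back-of-the-envelope storage estimates for full packet capture.
def tb_per_day(link_gbps, utilization):
    """Terabytes of traffic generated per day at the given sustained utilization."""
    bytes_per_second = link_gbps * utilization * 1e9 / 8
    return bytes_per_second * 86400 / 1e12

backbone = tb_per_day(10, 0.50)      # ~54.0 TB/day on the internal backbone
egress   = tb_per_day(0.010, 0.75)   # ~0.081 TB/day (81 GB) on the 10Mbps link
retention_days = 2.0 / egress        # ~24.7 days on a 2TB appliance

print(backbone, egress * 1000, retention_days)
```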

How does this help your security though? Storing the data isn’t helping on a protection front (neither preemptive nor reactive), and it’s not going to help identify any additional threats you may have missed unless you’re also investing in the tools and human resources to sift through all the data.

To use an analogy: you’re a farmer who has just invested in a colossal hay barn, acquired the equipment to harvest and bundle the hay, and is mowing fields capable of growing more hay than you could ever hope to store. Then someone informs you that one of their cows died because it swallowed a nail that probably came from your hay – so you’d better run through all those hay bales stored in your barn and search for any other nails that could kill someone else’s cow. The fact that the cow that died ate from a hay bale that’s no longer stored in your (full) barn is unfortunate, I guess. But anyway, you’re in a reactive situation, and you’ll remain in a reactive situation no matter how big your barn eventually becomes.

If you’ve got a suspicion that metal objects (nails, needles, coins, etc.) are likely to be bad juju, shouldn’t you be seeking them out before you’ve gone to all the work of filling your barn with hay bales? Wouldn’t it make more sense to use a magnet and detect those metal objects at the time you’re cutting the hay – before you put it in a bale, and before you put those bales in your barn? Even if you had no forethought that metal objects in your hay could eventually cause a problem, do you persist with a strategy of periodically hunting for the classic “needle in a haystack” in your barn despite now knowing of the threat?

Getting back to the world of IT security and threat detection (and mitigation)… I’ve found that there are greater efficiencies in identifying threats as the network data streams by, rather than relying on reactive, post-event data-mining approaches.

I guess I’ll hear some folks ask “what about the stuff they might miss?” There are very few organizations I can think of that are able to employ the skills and resources needed to analyze “store it all” network traffic at a level even remotely comparable to what a security product vendor already includes in their commercial detection offerings – and those vendors are typically doing their analysis in a streaming fashion (and usually with something more sophisticated than magnets).

My advice to organizations looking at adopting “store it all” network monitoring appliances is the following:

    1. If you already have all of your protection and detection bases completely covered, maybe deploying these appliances makes sense – provided you employ the dedicated security analysts and incident response folks to make use of the data.
    2. Do you know what you’re trying to protect? “Store it all” approaches are designed to fill in the gaps of your other threat monitoring and detection systems. Is the threat going to be present at the network egress point, or will you need to store traffic from other (higher-volume) network segments? If so, be cognizant of how far back you can roll your eventual analysis.
    3. If you’re into hoarding data for the purpose of forensics and incident response, a more efficient and cost-effective approach may be to turn on (and optimize) your logging capabilities. Host logging combined with network logging will yield a very rich data set (often richer than simply storing all network traffic) which can be mined much more efficiently.
    4. If host-based logging isn’t possible or is proving too unwieldy, and you find yourself having to maintain a high level of paranoia throughout the organization, you may want to consider implementing a flow-based security approach and investing in a network anomaly detection system (a toy sketch of the idea follows this list). That way you’ll get near real-time alerting for bespoke threat categories – rather than labor-intensive, reactive data-mining.
    5. If you have money to burn, buy the technology and begin storing all the PCAP data you can. Although I’d probably opt for a Ferrari purchase myself…
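
To make point 4 above a little more concrete, here is a deliberately naive, hypothetical sketch of the flow-based idea: keep a per-host baseline of outbound bytes per interval and raise an alert on a large deviation. The field names, threshold and statistics are illustrative only and not representative of how any particular commercial product works.

```python
# Hypothetical sketch: per-host outbound-volume anomaly detection on flow data.
from collections import defaultdict
from statistics import mean, stdev

history = defaultdict(list)   # src_ip -> recent per-interval outbound byte counts

def check_flow(src_ip, bytes_out, min_samples=30, sigma=4.0):
    """Return True if this interval's outbound volume is anomalously high."""
    samples = history[src_ip]
    alert = False
    if len(samples) >= min_samples:
        mu, sd = mean(samples), stdev(samples)
        if sd > 0 and (bytes_out - mu) / sd > sigma:
            alert = True      # e.g. sudden bulk transfer from this host
    samples.append(bytes_out)
    del samples[:-min_samples * 10]   # keep a bounded window of history
    return alert
```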