Wednesday, February 23, 2011

Reinventing the Sandpit

Sometimes it feels as if the IT security world loves innovation as much as it loves to reinvent the wheel – particularly when it comes to wrapping sheets of tin around a previously established security technology and labeling it as an advancement. The last few weeks have been no exception: in the run-up to the annual RSA conference in San Francisco, multiple vendors have launched new appliances to augment their product portfolios, billing them as “innovation” in dealing with next-generation malware (or AV+, as some folks refer to it).

The latest security technology to undergo the transformation to tin is of course the automated analysis of suspicious binaries using various sandboxing techniques. For those of you not completely familiar with sandboxing, a sandbox is effectively a small self-contained version of a computer environment offering a minimal suite of services and capabilities. As the name implies, a sandbox serves as a safe environment for running applications that may be destructive in other circumstances – yet it can be rapidly built up and torn down as necessary.

In an enterprise security context, sandboxes are regularly encountered in two operational security implementations – safe browser sandboxes (designed to wrap around the web browser and prevent any maliciousness encountered while the user is browsing the web from contaminating the base operating system) and gateway binary introspection (i.e. the automatic duplication or interception of suspicious binary files, which are then executed within a sandbox that mimics a common operating system configuration in order to identify and classify any malicious binaries).

The sandbox approach to malware identification is often referred to as signature-less and offers many advantages over classic anti-virus technologies, but it also suffers from its own unique set of limitations and inconveniences. Most have to do with the ways in which malware can discover that it is being executed within a sandboxed environment (and thus act benignly), and with limits on how faithfully the sandbox imitates a genuine targeted system (e.g. installed applications, application versions, Internet connectivity, etc.). In general though, sandbox approaches to automated malware inspection and classification are more sophisticated and accurate than signature-based anti-virus approaches.

Despite what you may have heard in the flurry of newly released AV+ solutions, automated malware sandbox approaches aren’t precisely new – in fact they’ve had over a decade of operational and, dare I say it, “hostile” use. For example, Damballa has been operating sandboxing technology in the cloud pretty much since the inception of the company. We’ve chosen to use multiple sandbox technologies (along with bare-metal systems, manual analysis, etc.) to automatically process the mountains of new malware captured every day to mechanically extract their network characteristics, automatically cluster new malware families, and provide attribution to multiple criminal organizations.
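
To make the idea of clustering malware by network characteristics a little more concrete, here is a minimal sketch in Python. It is not Damballa's method; the sample names, host names, the Jaccard similarity measure and the 0.5 threshold are all assumptions chosen purely for illustration. It groups samples whose sets of contacted hosts overlap enough.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (defined as 1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_by_network(samples, threshold=0.5):
    """Single-linkage clustering over sample -> contacted-host sets.

    Two samples land in the same cluster when their host sets have a
    Jaccard similarity of at least `threshold` (an illustrative value).
    """
    parent = {name: name for name in samples}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(samples, 2):
        if jaccard(samples[a], samples[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in samples:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())


# Hypothetical sandbox output: sample id -> hosts it tried to contact.
observed = {
    "sample_a": {"c2.example.com", "update.example.net"},
    "sample_b": {"c2.example.com", "update.example.net", "cdn.example.org"},
    "sample_c": {"other.example.org"},
}
clusters = cluster_by_network(observed)
```

In this toy data the first two samples share most of their hosts and end up together, while the third stands alone. A production system would draw on far richer network features than host names.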

Note that, from a product perspective, Damballa doesn’t run malware sandboxing technology from within a customer’s environment – there’s little to be gained from doing so, and the risks greatly outweigh the possible gain. Instead, the automated analysis of suspicious and vetted binaries using cloud-based malware enumeration technologies (which include very sophisticated sandbox approaches amongst other specialized malware dissection engines) has proven to be more accurate, efficient and secure.

Over the years, many different malware analysis sandbox technologies have been developed. For example (not a complete list):

  • Norman Sandbox (2001) – Norman presented its sandbox technology for the first time at the Virus Bulletin conference in Prague in 2001 and offered a commercial version in 2003.
  • CWSandbox (2007) – Originally created by researchers from the University of Mannheim. Available commercially from GFI Software (formerly Sunbelt Software) and for free/academic use via http://mwanalysis.org
  • Sandboxie (2006)
  • Anubis (2006)
  • Joebox (2007)
  • Azure (2008)
  • BitBlaze (2008)
  • ThreatExpert (2008)
  • Ether (2009)

Each sandbox technology tends to be implemented in a different way – usually optimized and tuned for specific classes of malware (or aspects of malware) – and typically uses either an emulator or a virtual-machine approach. Emulators tend to be much smaller and faster at analyzing specific classes of malware, but suffer from a greatly limited range of supported (i.e. emulated) operating system APIs. Virtual machine approaches tend to be much more flexible, but are larger and slower.

Over the last decade, virtual machine (VM) based approaches have risen to the fore for automated sandbox approaches to malware investigation. The VM approach allows multiple guest OS images to be loaded simultaneously in order to run the malware within a self-contained and disposable environment. As a side note, did you know that the concept of running multiple, different operating systems on a single computer system dates back to the 1970s, following research by IBM and the availability of the IBM VM/370 system? Talk about coming full circle with “what’s old is new again” in security.

For sandboxing technologies, API hooking, API virtualization, or a combination of the two is often used to analyze and classify the malware. A term you will often see is “instruction tracing” – which refers to the observations recorded by the sandbox technology that are eventually used to derive the nature of the binary sample under investigation. This instruction tracing lies at the heart of sandbox-based approaches to automated malware analysis – and is the Achilles’ heel exploited by evasive malware.
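
As a rough illustration of how recorded observations might be turned into a verdict about a sample, here is a minimal Python sketch. The `TraceEvent` structure, the API labels and the three behaviour rules are hypothetical and do not reflect any particular sandbox's trace format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    api: str     # name of the observed call (hypothetical labels)
    target: str  # what the call touched: file path, registry key, host, ...


# Hypothetical behaviour rules: label -> predicate over a single event.
RULES = {
    "persistence": lambda e: e.api == "RegSetValue" and "\\Run" in e.target,
    "network_beacon": lambda e: e.api == "connect",
    "self_copy": lambda e: e.api == "CopyFile" and e.target.lower().endswith(".exe"),
}


def summarize(trace):
    """Return the sorted behaviour labels triggered by at least one event."""
    return sorted(
        label for label, rule in RULES.items() if any(rule(e) for e in trace)
    )


# A made-up trace, as a sandbox's trace handler might record it.
trace = [
    TraceEvent("CreateFile", r"C:\Users\victim\AppData\a.tmp"),
    TraceEvent("RegSetValue", r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\updater"),
    TraceEvent("connect", "203.0.113.10:443"),
]
behaviours = summarize(trace)
```

The point is that the verdict can only be as good as the trace: if the sample notices it is being watched and withholds those calls, no rule ever fires.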

Instruction tracing is typically implemented in one or more of the following ways:

  • User-mode agent – A software component is installed within the guest operating system and reports all user-mode activity to the trace handler (think of this as something like a keylogger).
  • Kernel-mode patching – The kernel of the guest operating system is modified to accommodate tracing requirements (think of this as something like a rootkit).
  • Virtual machine monitoring – The virtual machine itself is modified and instrumented to observe the activities of the guest operating system.
  • System emulation – A hardware emulator is modified to hook the appropriate memory, disk I/O functions and peripherals (etc.) and report activities (think of this as a hall of mirrors approach). Emulation approaches are great for more difficult operating systems (e.g. Android, SCADA systems, etc.).

Unfortunately, each of these sandboxing techniques exhibits system characteristics that can be detected by the malware being analyzed and, depending upon the nature of the malware, can be used programmatically to avoid detection.
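
The user-mode agent approach, and the reason it leaves a detectable footprint, can be sketched in a few lines of Python. This is a toy analogue and not how a real sandbox hooks operating system APIs: the `hook` and `looks_hooked` helpers are hypothetical, and wrapping `os.getcwd` merely stands in for intercepting a system call.

```python
import functools
import os

trace_log = []  # stand-in for the sandbox's trace handler


def hook(module, name):
    """Swap module.name for a wrapper that records each call.

    Returns the original function so it can be restored later.
    """
    original = getattr(module, name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        trace_log.append((name, args))
        return original(*args, **kwargs)

    setattr(module, name, wrapper)
    return original


def looks_hooked(func):
    """Crude check a sample could run: built-in functions have no
    __code__ attribute, but a pure-Python wrapper around one does."""
    return hasattr(func, "__code__")


before = looks_hooked(os.getcwd)      # unhooked built-in
original_getcwd = hook(os, "getcwd")
os.getcwd()                           # this call is now recorded
after = looks_hooked(os.getcwd)       # the wrapper gives itself away
os.getcwd = original_getcwd           # restore the original
```

The same trade-off applies at every layer: whatever is inserted to observe the sample changes the environment in some way the sample can, in principle, measure.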

Despite all these limitations, the sandbox approach to malware analysis has historically proven to be useful in analyzing the bulk of everyday malware.
In more recent years the techniques have become less reliable as malware developers have refined their sandbox detection methods and evolved more subtle evasion techniques. Many of these detection techniques are actually independent of the sandboxing technique being used – for example, the multitude of network-based discovery and evasion techniques discussed in my previous whitepaper “Automated In-Network Malware Analysis”.

The sandbox approach to automated malware identification and classification needs to be backed up with more advanced and complementary malware detection technologies. Organizations facing the brunt of targeted attacks and advanced persistent threats should make sure that they have access to sandbox analysis engines within their back office for the bulk processing of malware samples, running multiple configurations of the standard desktop OS builds (or gold images) deployed within the organization, and should include a mix of bare-metal and honey-pot systems to handle the more insidious binary files. Even then, executing malware within your own organization’s network or physical location is risky business for the reasons I covered in an earlier blog on the topic – you’re “damned if you do, and damned if you don’t”.
If you’re going to go to all the effort of installing and maintaining malware analysis sandboxes within your own organization, my advice is to look beyond the latest installment of tin-wrapped hype and take a closer look at the more established sandbox technologies out there. There’s plenty of choice – and many are free.
