
Thursday, September 8, 2022

It’s Time for Security Analytics to Embrace the Age of Science Over Art

Security analytics has traditionally been approached with a “hunt and peck” mentality, which has made the process of uncovering and responding to cyberthreats more art than science. A human analyst has an idea of what they are looking for when they begin to hunt across the available data, performing that task based on their own experience. They’ve been taught to celebrate when they find something, and that the trickier and more obscure the discovery, the greater the celebration of their skills.


This is, I believe, an “art” because the results will always differ between analysts: too many outside factors, such as the day of the week, what they had for breakfast, or how their weekend went, can affect the individual doing the hunting. The situation has only been perpetuated by an industry that has for too long touted the value of this “art.”

We’re no longer working with a simple canvas

We’ve all heard it before and will continue to hear it — data volumes and the enterprise landscape have been growing exponentially, and that’s not going to stop. This was put into hyperdrive by the rapid adoption of cloud computing, which challenges organizations to collect and analyze complete data from multiple sources, including new cloud data as well as data from legacy, on-premises infrastructures. The result has been limited visibility that ultimately compromises overall security.

What we’re not hearing enough is that the long-held belief in the “art” of hunt-and-peck doesn’t scale to this challenge, and isn’t a reliable or repeatable process that can come close to meeting the needs of modern enterprise environments.

Managing haystacks of needles

We all know the saying “finding a needle in a haystack.” But in today’s threat landscape, given the data volumes with which analysts are burdened, it’s more like finding the sharpest needle in a haystack of needles. Following the decades-old mantra of “assume breach,” we need to turn our focus to the threats that matter most — the sharpest needles. This requires operationalizing the hunt, triage, investigation and response: removing humans from the role of “artistic” speed bump and instead empowering them with the science of protection embedded in security analytics.

Adopting the science of security analytics that leverages automation built on machine learning and AI enables repeatable, reliable streaming investigations of threats across all the data, at all times. Applying this method will reveal orders of magnitude more threats and incidents — across a broad spectrum of risk — occurring continuously within the enterprise. We’ve reached the tipping point where threat volumes have far exceeded what any number of human analysts could reasonably hunt/triage, let alone respond to. This means enterprise security teams must increasingly apply AI and ML to the management of the threats they find (i.e., managing those stacks of needles) as well as the mitigations and responses.

Reprieve begins with automation

Building processes that are autonomous is the critical element in embracing a scientific approach to protection. While past security solutions focused on automation, they were largely unsuccessful due to inflexibility and a reliance upon humans to choose the right automation steps, in advance, for every exception. This is not the role people should be playing when it comes to successfully implementing autonomous solutions, and it doesn’t do anything to lighten their load. Instead, autonomous solutions should deploy system “smartness” to fill in the blanks and know to ask for human guidance when it’s actually needed.

If we continue with the mantra of “assume breach,” and operationalize security as described above, we also must completely rethink the human-focused SOC solution of filtering alerts. With people having been swamped to the point of (and beyond) alert fatigue, the solution has been to drastically manage the funnel of events and alerts, thus reducing the aperture of enterprise threat visibility and response — none of which sounds like a solution to me.

It raises the question: Why bother collecting alerts and events in the first place if you’re only going to do something with 1% of the top 1% of the most critical alerts? My response: Filtering is the worst way to manage security.

Instead, let’s do this:

With modern AI and autonomous hunting and triaging solutions, the system can look at every event and alert as it streams by and correlate, question and enrich them all in real time — all the time. The more data collected the more accurate and useful the autonomous system becomes, improving its ability to identify the collective stories and present them to the business and the analysts. To take it a step further, the autonomous system can then, in most cases, perform autonomous responses to the threats being found.
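
To make that concrete, here is a minimal sketch of the streaming model in Python. The event fields, enrichment source, and scoring rule are all invented for illustration — real platforms enrich from threat intelligence, identity, and asset systems — but the shape of the loop is the point: nothing is filtered at ingest, and every event contributes to a scored, entity-level story.

    from collections import defaultdict

    # Hypothetical enrichment context; a real system would query threat
    # intelligence, asset inventories, identity providers, and so on.
    ASSET_CRITICALITY = {"db-01": 0.9, "laptop-17": 0.3}

    def enrich(event):
        """Attach context to every event instead of discarding 'low' ones."""
        event["criticality"] = ASSET_CRITICALITY.get(event["host"], 0.5)
        return event

    def stream_triage(events):
        """Correlate all events into per-host stories, scored continuously."""
        stories = defaultdict(list)
        for event in events:
            e = enrich(event)
            story = stories[e["host"]]
            story.append(e)
            # Naive scoring: summed severity weighted by asset criticality.
            score = sum(ev["severity"] for ev in story) * e["criticality"]
            if score > 5.0:  # illustrative threshold, not a recommendation
                yield e["host"], score, [ev["type"] for ev in story]

    events = [
        {"host": "db-01", "severity": 2.0, "type": "c2_beacon"},
        {"host": "db-01", "severity": 3.0, "type": "sql_port_probe"},
        {"host": "db-01", "severity": 4.0, "type": "bulk_egress"},
    ]
    for host, score, story in stream_triage(events):
        print(host, round(score, 1), story)  # db-01 8.1 [all three events]

No single event here crosses the threshold on its own; the correlated story does, which is precisely what alert filtering throws away.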

Human and machine harmony

Anytime automation in security is discussed, it raises the fear of automating away the analyst. But with a science-first approach, analysts aren’t going anywhere. The human analyst role is transforming, which will be a huge benefit to the people who work in SOCs. By adopting a scientific method for security analytics, the analyst will influence and guide the autonomous system to ensure it delivers business impact and value:

  • For exceptions, when the AI doesn’t have enough information or confidence to provide an autonomous response, it watches and learns how the human analyst handles the case, thereby building and establishing a scientific methodology.
  • At the cloud-SaaS level, those learnings may come from hundreds of enterprise SOC teams and thousands of expert security analysts, from which the AI systems can distill collective intelligence and put those learnings and methodology refinements back into the hands of the individual analyst.

The final result? The loop gets closed. The analyst is augmented.

The autonomous system deals with the daily grind, identifies the gaps that require human expertise, learns by watching how humans fill in the methodology gaps, and reapplies those learnings collectively. For instance, assume that a security team is capable of performing 100 manual investigations per day. An autonomous system could ask millions of forensic questions in a day. Time to resolution is shortened by augmenting the work the analyst does. The autonomous system performs repetitive, data-intensive work, it can quickly go back in time and ask an infinite number of questions, and the efficiency benefits just go on and on.

Leading with science will equip security analysts with actionable data across use cases ranging from threat detection, threat investigation, and threat hunting to ransomware investigation and incident response. It helps security teams work smarter and respond faster while boosting productivity and strengthening security.

-- Gunter Ollmann

First Published: Medium - September 8, 2022

Friday, January 13, 2017

Machine Learning Approaches to Anomaly and Behavioral Threat Detection

Anomaly detection approaches to threat detection have traditionally struggled to make good on the efficacy claims of vendors once deployed in real environments. Rarely have the vendors lied about their product’s capabilities – rather, the examples and stats they provide are typically for contrived and isolated attack instances, not representative of a deployment in a noisy and unsanitary environment.

Anomaly detection approaches have primarily fallen flat, and been cast in a negative-value context, due to alert overload and “false positives”. “False positives” deserves the quotation marks because, in almost every real-network deployment, the anomaly detection capability is working and alerting correctly – however, the anomalies being reported often have no security context and are unactionable.

Tuning is a critical component of extracting value from anomaly detection systems. While “base-lining” sounds rather dated, it is an important operational component of success. Most false positives and nuisance alerts are directly attributable to missing or poor base-lining procedures that would have tuned the system to the environment it is tasked with monitoring for anomalies.
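
As a toy illustration of what base-lining buys you (a minimal sketch with invented traffic numbers, not a production detector), a learned per-metric baseline of mean and standard deviation turns raw observations into comparable anomaly scores:

    import statistics

    def zscore(history, value):
        """Score a new observation against the learned baseline for a metric."""
        mean = statistics.mean(history)
        stdev = statistics.pstdev(history) or 1.0  # guard against zero spread
        return (value - mean) / stdev

    # Baseline window captured during tuning (e.g. bytes out per minute).
    baseline = [1200, 1100, 1250, 1180, 1220, 1150]

    for observed in (1210, 4800):
        score = zscore(baseline, observed)
        print(observed, round(score, 1),
              "anomalous" if abs(score) > 3 else "normal")

Without the baseline, neither number means anything; with it, 4800 is instantly tens of standard deviations out of character for this environment.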

Assuming an anomaly detection system has been successfully tuned to an environment, there is still a gap in actionability that needs to be closed. An anomaly is just an anomaly, after all. Closure of that gap is typically achieved by grouping, clustering, or associating multiple anomalies together into a labeled behavior. These behaviors in turn can be classified in terms of risk.
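
A sketch of that grouping step might look like the following – the anomaly types, behavior labels, risk tiers, and time window are all hypothetical – where co-occurring anomalies for the same entity are matched against a catalogue of labeled behaviors:

    from collections import defaultdict

    # Hypothetical behavior catalogue: sets of co-occurring anomaly types
    # mapped to a labeled behavior and a risk tier.
    BEHAVIORS = {
        frozenset({"odd_login_hour", "new_geo", "mfa_fatigue"}):
            ("credential-compromise", "high"),
        frozenset({"port_scan", "dns_spike"}):
            ("internal-recon", "medium"),
    }

    def label_behaviors(anomalies, window=300):
        """Group anomalies per entity within a time window, then match patterns."""
        groups = defaultdict(set)
        for a in anomalies:
            groups[(a["entity"], a["ts"] // window)].add(a["type"])
        for (entity, _), types in groups.items():
            for pattern, (label, risk) in BEHAVIORS.items():
                if pattern <= types:  # every anomaly in the pattern observed
                    yield entity, label, risk

    anomalies = [
        {"entity": "user-7", "ts": 100, "type": "odd_login_hour"},
        {"entity": "user-7", "ts": 150, "type": "new_geo"},
        {"entity": "user-7", "ts": 220, "type": "mfa_fatigue"},
    ]
    print(list(label_behaviors(anomalies)))
    # [('user-7', 'credential-compromise', 'high')]

Individually, each anomaly is noise; together, within one window and for one entity, they become a named, risk-ranked behavior an analyst can act on.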

While anomaly detection systems dissect network traffic, application hooks, and memory calls using statistical feature-identification methods, the advance to behavioral anomaly detection requires a broader mix of statistical features, metadata extraction, event correlation, and even more baseline tuning.

Because behavioral threat detection systems require training and labeled detection categories (i.e. threat alert types), they too suffer many of the same operational ill effects as anomaly detection systems. Tuned too tightly, they are less capable of detecting threats than an off-the-shelf intrusion detection system (NIDS or HIDS). Tuned too loosely, they generate unactionable alerts more consistent with a classic anomaly detection system.

The middle ground has historically been difficult to achieve. Which anomalies are the meaningful ones from a threat detection perspective?

Incorporating machine learning tooling into the anomaly and behavioral detection space appears to be highly successful in closing that gap.

What machine learning brings to the table is the ability to observe and collect all anomalies in real time, make associations to both known (i.e. trained and labeled) and unknown or unclassified behaviors, and provide “guesses” on actions based upon how an organization’s threat response or helpdesk (or DevOps, or incident response, or network operations) team has responded in the past.
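
One simple way to picture that last part – entirely illustrative, using set overlap rather than a trained model – is a memory of past incidents and the actions teams took, against which a new cluster of observations can be matched to produce a suggested action and a confidence “guess”:

    def jaccard(a, b):
        """Overlap between two sets of observed anomaly/behavior features."""
        return len(a & b) / len(a | b)

    # Memory of how the organization's teams responded to past detections
    # (labels and actions invented for illustration).
    HISTORY = [
        ({"c2_beacon", "sql_port_probe", "bulk_egress"},
         "malware-based database hack", "isolate host, rotate DB credentials"),
        ({"dns_spike", "port_scan"},
         "internal recon", "open helpdesk ticket, monitor"),
    ]

    def suggest_action(features):
        """Return the closest past incident's label, action, and confidence."""
        best = max(HISTORY, key=lambda h: jaccard(features, h[0]))
        return best[1], best[2], round(jaccard(features, best[0]), 2)

    print(suggest_action({"c2_beacon", "bulk_egress"}))
    # ('malware-based database hack',
    #  'isolate host, rotate DB credentials', 0.67)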

Such systems still require baselining, but are expected to dynamically reconstruct their baselines as they learn, over time, how the human operators respond to the “threats” they detect and alert upon.
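
One simple way to make a baseline dynamic – a sketch of the general idea, not of any particular product’s implementation – is an exponentially weighted moving average, which lets the baseline drift with the environment while still scoring sudden jumps:

    class DynamicBaseline:
        """EWMA baseline that adapts as the monitored environment changes."""

        def __init__(self, alpha=0.1):
            self.alpha = alpha  # how quickly the baseline adapts
            self.mean = None
            self.var = 0.0

        def score(self, value):
            """Return a deviation score, then fold the value into the baseline."""
            if self.mean is None:
                self.mean = value
                return 0.0
            deviation = abs(value - self.mean) / ((self.var ** 0.5) or 1.0)
            # Update mean and variance so the baseline tracks gradual change.
            diff = value - self.mean
            self.mean += self.alpha * diff
            self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
            return deviation

    b = DynamicBaseline()
    for value in (100, 102, 98, 101, 500):  # the last value is a sudden jump
        print(value, round(b.score(value), 1))
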
Machine learning approaches to anomaly and behavioral threat detection (ABTD) provide a number of benefits over older statistical-based approaches:

  • A dynamic baseline ensures that as new systems, applications, or operators are added to the environment they are “learned” without manual intervention or superfluous alerting.
  • More complex relationships between anomalies and behaviors can be observed and eventually classified, thereby extending the range of labeled threats that can be correctly classified, assigned risk scores, and prioritized for remediation by the correct human operator.
  • Observations of human responses to generated alerts can be harnessed to automatically reevaluate the risk and prioritization of detections and events (see the sketch after this list). For example, three behavioral alerts are generated for different aspects of an observed threat (e.g. external C&C activity, lateral SQL port probing, and high-speed data exfiltration). The human operator associates and remediates them together and applies the label “malware-based database hack”. The system then learns that clusters of similar behaviors and sequencing are likely to be classified and remediated the same way – so for future alerts it can assign a risk and probability to the newly labeled threat.
  • Outlier events can be understood in the context of typical network or host operations – even if no “threat” has been detected. Such capabilities prove valuable in monitoring the overall “health” of the environment being monitored. As helpdesk and operational (non-security) staff leverage the ABTD system, it also learns to classify and prioritize more complex sanitation events and issues (which may be impeding the performance of the observed systems or indicate a pending failure).
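
Here is a minimal sketch of that feedback loop from the third bullet above. The function names, behavior types, and matching threshold are all invented for illustration; a real system would learn statistical models rather than exact sets:

    learned_patterns = {}

    def analyst_labels(behaviors, label, risk):
        """Record how a human operator classified a cluster of behaviors."""
        learned_patterns[frozenset(behaviors)] = (label, risk)

    def classify(behaviors):
        """Pre-classify a new cluster if it resembles something seen before."""
        for pattern, (label, risk) in learned_patterns.items():
            overlap = len(pattern & behaviors) / len(pattern)
            if overlap >= 0.8:  # partial-match threshold, chosen arbitrarily
                return label, risk, overlap
        return "unclassified", "unknown", 0.0

    # The operator groups three behavioral alerts and names the cluster.
    analyst_labels(
        {"external_c2", "lateral_sql_probe", "fast_exfil"},
        label="malware-based database hack", risk="critical")

    print(classify({"external_c2", "lateral_sql_probe", "fast_exfil"}))
    # ('malware-based database hack', 'critical', 1.0)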

It is anticipated that this newest generation of machine learning approaches to anomaly and behavioral threat detection will not only reduce the noise associated with real-time observations of complex enterprise systems and networks, but also drive security to be further embedded and operationalized as part of standard support tasks – down to the helpdesk level.

-- Gunter Ollmann, Founder/Principal @ Ablative Security

(first published January 13th - "From Anomaly, to Behavior, and on to Learning Systems")