
Tuesday, October 15, 2024

Getting Your SOC SOARing Despite AI

It’s a fact: enterprise security operations centers (SOCs) that are most satisfied with their investments in Security Information and Event Management (SIEM) and Security Orchestration, Automation, and Response (SOAR) operate and maintain fewer than a dozen playbooks. This is something I’ve uncovered in recent years whilst building SIEM+SOAR and autonomous SOC solutions – and it perhaps runs counter to many security leaders’ visions for SOAR use and value.

SOAR technology is one of those much-touted security silver bullets that have tarnished over time and been subsumed into broader categories of threat detection and incident response (TDIR) solutions, yet it remains a distinct must-have capability for the modern SOC.

Why do satisfied SOCs run so few playbooks? After all, a core premise of SOAR was that you can automate any (perhaps all) security responses, and your SOC analysts would spend less time on ambient security noise – giving them more time to focus on higher-priority incidents. Surely “success” would include automating as many responses as possible?

Beyond the fact that “security is hard,” the reality is that threat detection and response is as dynamic as the organization you’re trying to protect. New systems, new policies, new business owners, new tools, and a sea of changing security products and API connector updates mean that playbooks must be dynamic and vigilantly maintained, or they become stale, broken, and ineffective.

Every SOC team has at least one playbook covering their phishing response. It’s one of the most common and frequently encountered threats within the enterprise, yet “phishing” covers an amazingly broad range of threats and possible responses, so a playbook-based response to the threat is programmatically very complex and brittle to environmental changes.

From a SOC perspective of automating and orchestrating a response, you would either build a lengthy single if/then/else-style playbook or craft individual playbooks for each permutation of the threat. Smart SOC operators quickly learn that the former is more maintainable and scalable than the latter. A consequence of this is that you need analysts with more experience to maintain and operate the playbook. Any analyst can knock up a playbook for a simple or infrequently encountered threat vector, but it takes business knowledge and vigilance to maintain each playbook’s efficacy beyond the short term.
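
To make that concrete, below is a minimal sketch of the consolidated single-playbook approach in Python. Every event field and response helper is a hypothetical stand-in for illustration, not any particular SOAR product’s API:

    def block_url(url):
        print(f"[response] blocking URL {url}")

    def quarantine_message(message_id):
        print(f"[response] quarantining message {message_id}")

    def force_password_reset(user):
        print(f"[response] forcing password reset for {user}")

    def escalate_to_analyst(event, tier):
        print(f"[response] escalating event {event.get('id')} to tier-{tier}")

    def phishing_playbook(event):
        """One consolidated playbook: branch per phishing sub-type."""
        if event.get("credential_harvest"):        # fake login page
            block_url(event["url"])
            force_password_reset(event["recipient"])
            return "credential-harvest branch executed"
        if event.get("malicious_attachment"):      # weaponized document
            quarantine_message(event["message_id"])
            return "attachment branch executed"
        if event.get("bec_indicators"):            # business email compromise
            escalate_to_analyst(event, tier=2)     # no safe automatic response
            return "escalated for human review"
        quarantine_message(event.get("message_id", "unknown"))   # safe default
        return "unknown sub-type quarantined pending triage"

One playbook, many branches: when the phishing landscape shifts, there is a single artifact to update and test rather than a dozen drifting copies.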

Surely AI and a sprinkle of LLM magic will save the day though, right?

I can’t count the number of security vendors and startups that have materialized over the last couple of years with AI and LLM SOAR capabilities, features, or solutions – all with pitches that suggest playbooks are dead, dying, replaced, accelerated, automatically maintained, dynamically created, managed, open sourced, etc., so the human SOC analyst does less work going forward. I remain hopeful that 10% of any of that eventually becomes true.

For the immediate future, SOC teams should continue to be wary of any AI stories that emphasize making it easier to create playbooks (or their product-specific equivalent of a playbook). More is NOT better. It’s too easy (and all too common) to fall down the rathole of creating a new playbook for an existing threat because it’s too hard to find and maintain an earlier iteration of that threat’s playbook. Instead, focus on a subset of the most common and time-consuming threats that your SOC already faces daily, and nail them using the smallest number of playbooks you can get away with.

With the Rolling Stones’ “(I Can’t Get No) Satisfaction” playing in the background, perhaps you’ll get some modicum of SIEM+SOAR satisfaction by keeping your playbook playlist under a dozen.

-- Gunter Ollmann

First Published: IOActive Blog - October 15, 2024

Tuesday, March 23, 2021

The Cusp of a Virtual Analyst Revolution

Security Analytics and Threat Investigation Are in the Midst of a Sea Change

Once live stomping around vendor-packed expo halls returns to security conferences, it is highly probable that “Virtual Analyst” will play a starring role in buzzword bingo. Today, the loosely defined term represents an aspiration for security vendors and managed service providers but may be perceived as a threat by internal day-to-day security operations and threat hunting teams.

For context, security analytics and threat investigation are in the midst of a sea change. Cloud log analytics platforms now enable efficient and timely analysis of ever-increasing swathes of enterprise logs, events, and alerts dating back years. Threat Intelligence platforms are deeply integrated into cloud SIEM solutions—enabling both reactive and proactive threat hunting and automated incident investigation—and are entwined with a growing stack of sophisticated AI and ML capabilities. Meanwhile, smart event correlation and alert fusion engines automatically triage the daily deluge of suspiciousness down to a manageable stack of high-priority incidents—replete with kill-chain reassembly and data enrichment.

In many environments the traditional tier-one security analyst responsibilities for triaging events (removing false positives and “don’t care” noise) and maintaining operational health of scale-limiting SOC systems (e.g., device connectors, log retention and storage parameters, ticket response management) have already been subsumed by modern SIEM solutions. Meanwhile, platform-native no-code/low-code-powered orchestration and automation capabilities, along with growing libraries of community-sourced investigation and response playbooks, have greatly accelerated incident response and efficacy for tier-two analysts—alleviating time-consuming repetitive tasks and increasing focus on new and novel incidents.

Arguably, the Virtual Analyst is already here—captured within the intelligent automation and efficiencies of modern cloud SIEM—and I believe the journey has just begun.

The near-future evolution of the Virtual Analyst is being driven by two competing and entwined motions—the growing need for real-time threat response, and the inaccessibility of deep security knowledge and expertise.

Real-time threat response has long been thought an achievable target for in-house security operations teams and has underpinned many historic CISO security purchasing decisions. As the enterprise attack surface has grown, adversaries (external and internal) have increased the breadth and pace of attack, and in response businesses continue to invest heavily in instrumenting their environments with an “assume breach” mindset—widening the visibility aperture and exponentially increasing the volume and timeliness of threat-relatable data. Advanced log analytics capabilities and AI-powered event fusion processes are identifying more incidents earlier along the kill-chain and consequently providing more opportunities to conditionally mitigate a budding threat or disrupt a sequence of suspicious events. 

To successfully capitalize on that shrinking window of opportunity, responses need to occur at super-human speeds. The speed bump introduced by requiring a human in that response loop will increasingly materialize as the difference between having been attacked versus being breached. In this context, the Virtual Analyst represents the super-human capabilities AND responsibilities for real-time threat identification AND trusted automated mitigation of a live incident.

Although that Virtual Analyst capability will be tightly bound to a product (e.g., Cloud SIEM, SOC-as-a-Service), the second Virtual Analyst motion centers around access to deep security expertise.

If a product-bound Virtual Analyst can be considered a quick-learning high-speed generalist, the second motion can be thought of as a flexible “on-call” specialist—augmenting the security operations team’s investigative and response capabilities as needed—and may be conceptually akin to the on-demand specialist services provided by traditional managed security service and incident response providers. 

The differentiated value of cloud-based Virtual Analyst solutions will lie in leveraging broader internet-spanning datasets for threat detection and attribution, and in powerful, rapid, ad hoc forensic-level incident investigation and response. For example, the in-house SOC team may engage the Virtual Analyst to augment an ongoing investigation by temporarily connecting it to their on-premises SIEM and receiving targeted direction for capturing and collecting incident-relevant non-SIEM data (e.g., PCAPs, VM images, storage snapshots, configuration files). That data is then uploaded and automatically investigated by the Virtual Analyst, and incorporated into real-time instruction on system recovery and attack mitigation.

It’s tempting to think that on-premises security analysts’ days are numbered. Virtual analyst advancements will indeed increase the speed, fidelity, and efficacy of threat detection and incident response within the enterprise—replacing almost all repeated and repeatable analyst tasks. But AI-powered virtual analyst solutions will do so with little knowledge or context about the business and its priorities. 

With the day-to-day noise and incident investigation drudgery removed, security operations teams may evolve into specialist business advisors—partnering with business teams, articulating technology risks, and providing contextual security guidance.

-- Gunter Ollmann

First Published: SecurityWeek - March 23, 2021

Thursday, September 17, 2020

Enterprise Threat Visibility Versus Real-World Operational Constraints

The phrase “assume breach” has been transformational to enterprise security investment and defensive strategy for a few years but may now be close to retirement. 

When the vast majority of information security expenditure was focused on impermeable perimeter defenses and reactive response to evidence-based compromise, it served as a valuable rallying cry for organizations to tool their enterprise for insider-threat detection, adopt zero-trust network segmentation, and pursue widespread deployment of multifactor authentication systems and conditional access controls.

Sizable investments in enterprise-wide visibility should have reversed the much older adage “a defender needs to be right all the time, while the attacker needs to be right only once” into something like “an attacker needs to be invisible all the time, while the defender needs them to slip up only once.” Unfortunately, security operations and threat-hunting teams have found that instead of automatically spotting needles in a haystack, they must now manage haystacks of needles—if they’re properly equipped. For under-resourced security teams (which appear to be the majority), advances in enterprise-wide visibility have in the best case added hundreds of daily alerts to their never-completed to-do lists.

As security budgets have morphed, a higher percentage of spend has been allocated to increasing visibility on the premise that more threats will be preemptively detected, blocked, and mitigated.

An appropriate analogy for the situation would be installing dozens of video cameras in and around your home with overlapping fields of view and relying on that as the primary alerting mechanism for preventing break-ins. The underlying assumption is that someone will be continually monitoring all those video feeds, will recognize the build-up and execution of the break-in, and can initiate a response to stop the thief.

The consequences of such a strategy (by way of continuing the analogy) are pretty obvious:

  1. Because 24/7 monitoring is expensive, automated detection is required. Automatic detection comes at the cost of high false-positive rates and baseline tuning; in home CCTV terms, ignoring the rabbits, golf balls, and delivery men that cross a field of vision, while desensitizing movement thresholds and setting up hot zones for alerting. Even rare false-positive events such as lightning strikes during a storm or the shadow of a passing airplane are unfortunately enough to fill an inbox or message tray and result in wariness, delays, and wasted investigative cycles. To counter the problem, use at least two disparate and independent detection technologies to detect and confirm the threat (for example, CCTV movement zones and a break-glass sensor; see the sketch after this list).
  2. Automatic detection without an automatic response limits value to post-break-in cleanup and triage—not prevention. Because of potential false positives, automatic responses also need to be reversible throughout the period of alert response. If CCTV movement and break-glass sensors are triggered, perhaps an automatic request for a patrol car visit is initiated. Meanwhile the original alert recipient can review footage and cancel the callout if it was clearly a false positive (e.g., the neighbor’s kids kicked a ball over the fence and broke a window).
  3. Balance between detection and prevention is critical and will change over time. 24/7 CCTV monitoring may serve as a key detection capability, but locking all external doors with deadbolts shouldn’t be neglected. Deadbolted doors won’t stop the future threat of a $50 miniature drone flying down the chimney and retrieving the spare front-door key lying on the kitchen table. Prevention investments tend to be threat reactive, while modern detection technologies tend to be increasingly successful in identifying behavioral anomalies.
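
As a toy illustration of points 1 and 2 above, the Python sketch below requires two disparate sensors to agree before acting and keeps the automatic response reversible during the human review window; the sensor inputs and dispatch functions are invented for the analogy:

    DISPATCHES = {}

    def request_patrol_visit():
        dispatch_id = len(DISPATCHES) + 1
        DISPATCHES[dispatch_id] = "en route"
        return dispatch_id

    def cancel_patrol_visit(dispatch_id):
        DISPATCHES[dispatch_id] = "cancelled"

    def respond(motion_alert, glass_break_alert, reviewer_cancels):
        # Point 1: two independent detection technologies must both fire.
        if not (motion_alert and glass_break_alert):
            return "logged only: single-sensor event, probable false positive"
        # Point 2: act immediately, but keep the action reversible while the
        # original alert recipient reviews the footage.
        dispatch_id = request_patrol_visit()
        if reviewer_cancels:    # e.g., the neighbor's kids' ball broke a window
            cancel_patrol_visit(dispatch_id)
            return "automatic response reversed during review"
        return "automatic response stands"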

“Assume breach” served its purpose in changing the ways organizations thought about and invested in their security technologies (and operational programs). As with many well-intentioned initiatives, the security pendulum may have swung a little too far and now needs rebalancing.

Although I think cloud-SIEM and the advanced machine intelligence platforms being wedded to it will eventually meet most organizations’ 24/7 visibility and detection needs, SecOps teams will continue to battle against both alert fatigue and posture fatigue. The phrase I’d like to see the industry focus on for the next five years is “automatically mitigated.”

-- Gunter Ollmann

First Published: SecurityWeek - September 17, 2020

Thursday, November 14, 2019

Securing Autonomous Vehicles Paves the Way for Smart Cities

As homes, workplaces, and cities digitally transform during our Fourth Industrial Revolution, many of those charged with securing this digital future can find it difficult to “level up” from the endpoints and focus on defining and solving the larger problem sets. It is easy to get bogged down in the myriad of smart and smart-enough devices that constitute “IoT” in isolation from the overall security scope of the smart city – losing both valuable context and constraints.

While “smart city” can mean different things to different people, for city planners and officials its definition and implementation problems are quite well understood. The vendors that come knocking on their doors promote point solutions – smart traffic control systems, 5G and ultra-high-bandwidth wireless communications, driverless vehicles, etc. – leaving the cities’ IT, operational technology (OT), and infosec teams to bring it all together.

An essential part of a security professional’s work is diving deep into the flaws and perils of individual products and clusters of technologies. But trying to “solve security” at a city level is an entirely different paradigm.

A substantial number of my peers and security researchers I’ve worked with over the past couple of decades have focused their energies on securing autonomous vehicles. The threats are varied – ranging from bypassing emission and speed controls to evading the next generation of city road taxes and insurance regulations to malicious remote control of someone else’s vehicle – yet mostly isolated to the vehicles themselves. From what I’m seeing and hearing, they’re doing a great job in securing these vehicles. Their security successes also advance traditional transit solutions, which helps smart cities keep pace with the transportation needs of a growing population. 

Given the continued urbanization of the human population, the growth and attraction of megacities (10 million plus inhabitants), and the strains on traditional transport systems, the thought of increasing personal-use autonomous vehicles in these heavily congested cities is outdated and arguably ludicrous. Today’s megacities are already battling traffic congestion with zoned charging, elimination of fossil fuels, and outright banning of private transport. Tomorrow’s megacities – jumping from 33 cities today with the largest holding 38 million people to over 100 with populations in excess of 88 million people by 2100 – need to completely rethink their transport systems and the security that goes with them.

Oddly enough, securing mass transit for megacities comes with some advantages. Mass transport systems that evolve from trains, trams, and subways have embedded within them design constraints that positively influence security. For example, driverless cars of today have to navigate and solve all kinds of road and traffic problems, while trams stick to pre-defined paths (i.e. rail networks) with greatly simplified routing and traffic signaling. Research papers covering adversarial AI in recent years have focused on attacking the deep learning and cognitive AI systems used by autonomous vehicles (e.g. adding stickers to a stop sign and making the driverless car think the sign says 45 mph), but these tactics would have negligible to no impact on reasonably scoped public transport systems.

It is reasonable to assume that the smart cities of the near future will consist of trillions of smart devices – each of them semi- or fully managed, providing alerts, logs, and telemetry of their operations. For those city leaders – particularly CIOs, COOs, CTOs, CISOs, and CSOs – the changes needed to manage, secure, certify, and govern all these devices and their output are mind-bogglingly huge.

Interestingly enough, the framework for managing data security for millions of chatty networked devices has largely been solved. Having become cloud-native, modern Security Information and Event Management (SIEM) technologies have proved to be remarkably successful in identifying anomalies, attacks, and misconfigurations.

The data handling capabilities and scalability of cloud-native SIEM may be just the right kind of toolkit to begin to solve smart city operations (and security) at the megacity level. In addition, with advanced AI being a core component of SIEM, the systems that identify and construct attack kill chains and mitigate threats through conditional access rules could instead be used and trained to identify surge transport requirements (due to concerts ending on a rainy day) and automatically reroute and optimize tram or bus capacity to deliver citizens safely (and dryly) to their destinations – as an example. 

Securing smart cities offers many opportunities to rethink our assumptions on security and “level up” the discussion to solve problems at the ecosystem level. Advancements in AI analytics and automated response technologies can handle the logs, alerts, and streaming telemetry that contribute to OT infrastructure security for megacities. In turn, this increase in data volume fine-tunes anomaly and behavioral-based detection systems to operate with higher efficiency and fidelity, which helps secure city-wide IT infrastructure.

-- Gunter Ollmann

First Published: SecurityWeek - November 14, 2019

Monday, July 22, 2019

Digital Transformation Makes the Case for Log Retention in Cloud SIEMs

As organizations pursue their digital transformation dreams, they’ll migrate from on-premises SIEM to cloud-based SIEM. In the process of doing so, CISOs are taking a closer look at their previous security incident and event log retention policies, and revisiting past assumptions and processes.

For organizations needing to maintain a smorgasbord of industry compliance and regulatory requirements, overall event log retention will range from one year through to seven. Many organizations find that a minimum of one year meets most mandated requirements but err on the side of retaining between three and four years – depending on what their legal counsel advises.

With public cloud, data retention spans many different options, services, and price points. Backups, blob storage, “hot” access, “cold” access, etc. – there are endless ways to store and access security events and logs. With cloud storage dropping in price year-on-year, it’s cheap and easy to just store everything forever – assuming there’s no rush or requirement to inspect the stored data. But hot data, more expensive than the cold option, gives defenders the quick access they need for real-time threat hunting. Keeping data hot for SIEM use is inevitably one of the more expensive data storage options. A balance needs to be struck between having instant access to SIEM for queries and active threat hunting, and long-term regulatory-driven storage of event and log data. Can an optimal storage balance be achieved?
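
The cost side of that question yields to simple arithmetic. The per-GB prices below are invented placeholders (real tiers vary by provider and region); the point is the shape of the trade-off, not the numbers:

    # Invented per-GB-per-month prices – substitute your provider's real tiers.
    HOT_PER_GB = 0.10      # searchable in the SIEM
    COLD_PER_GB = 0.01     # archived, slow to rehydrate

    def monthly_storage_cost(daily_gb, hot_days, total_retention_days):
        hot_gb = daily_gb * hot_days
        cold_gb = daily_gb * max(total_retention_days - hot_days, 0)
        return hot_gb * HOT_PER_GB + cold_gb * COLD_PER_GB

    # 50 GB/day of logs, four years total retention, one year kept hot:
    print(f"${monthly_storage_cost(50, 365, 4 * 365):,.0f}/month")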

Widely available public threat reports for the last couple of years provide a “mean time” to breach discovery ranging from 190 to 220 days and a breach containment window of between 60 and 100 days. Therefore, keeping 220 days of security event logs “hot” and available in a cloud SIEM would statistically only help with identifying half of an organization’s breaches. Obviously, a longer retention period makes sense – especially for organizations with less mature or less established security operations capabilities.
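
A back-of-envelope way to reason about the trade-off: take an assumed sample of breach-discovery times (the figures below are invented around that reported 190–220 day mean) and compute what share of breaches a given hot-retention window would fully cover.

    # Assumed days from intrusion to discovery – illustrative sample only.
    discovery_days = [30, 60, 90, 150, 190, 205, 220, 260, 310, 400]

    def hot_coverage(retention_days):
        """Share of breaches whose origin events are still hot in the SIEM."""
        covered = sum(1 for d in discovery_days if d <= retention_days)
        return covered / len(discovery_days)

    for window in (220, 365, 3 * 365):
        print(f"{window} days hot -> {hot_coverage(window):.0%} fully traceable")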

However, a sizable majority of SIEM-discoverable threats and correlated events are detectable in a much shorter timeframe – and rapidly detecting these breaches naturally makes it considerably more difficult for an adversary to maintain long-term persistence. For example, automatically piecing together the kill chain for an email phishing attack that led to a malware installation, which phoned home to a malicious C&C, which had then brute-forced administrative access to a high-value server, is almost trivial for cloud SIEM (assuming appropriate logging was enabled). Nowadays, such a scenario (or a permutation of it) likely accounts for nearly half of all enterprise network breaches.
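
A simplified sketch of that correlation, assuming the four stages arrive as normalized events sharing a host key (the stage labels and field names are illustrative, not a real SIEM schema):

    from collections import defaultdict

    STAGES = ["phishing_email", "malware_install", "c2_beacon", "brute_force_admin"]

    def reconstruct_kill_chains(events):
        """Group events by host; emit an incident when all stages line up."""
        by_host = defaultdict(dict)
        for ev in events:
            by_host[ev["host"]][ev["stage"]] = ev["timestamp"]
        incidents = []
        for host, seen in by_host.items():
            if all(stage in seen for stage in STAGES):
                times = [seen[s] for s in STAGES]
                if times == sorted(times):    # stages in chronological order
                    incidents.append({"host": host, "start": times[0],
                                      "chain": STAGES})
        return incidents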

My advice to organizations new to cloud SIEM is to begin with a rolling window of one year’s worth of event logs while measuring both the frequency of breaches and time to mitigate. All older event logs can be stored using cheaper cloud storage options and needn’t be immediately available for threat hunting.

Depending on the security operations teams’ capacity for mitigating the events raised by cloud SIEM, it may be financially beneficial to reduce the rolling window if the team is overwhelmed with unresolvable events. I’d be hesitant to reduce that rolling window. Instead, I would recommend CISOs with under-resourced teams find and engage a managed security services provider to fill that skills gap.

A question then arises as to the value of retaining multiple years of event logs. Is multi-year log retention purely a compliance tick-box?

While day-to-day cloud SIEM operations may focus on a one-year rolling window, it can be beneficial to organize a twice-annual threat hunt against several years of event logs using the latest available threat intelligence and indicator of compromise (IoC) information as seeds for investigation. These periodic events have two objectives: reduce your average monthly cloud SIEM operating costs (by temporarily loading and unloading the historic data) and allow teams to change mode and “deep dive” into a broader set of data while looking for “low and slow” compromises. If an older breach is detected, incrementally older event logs could be included in the quest to uncover the origin point of an intruder’s penetration or full spectrum of records accessed.
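
Sketched in Python, such a periodic hunt might look like the following; load_archive and unload_archive stand in for whatever cold-storage rehydration mechanism the cloud provider offers, and the archive contents are invented for illustration:

    # Stand-in archive: one bucket of normalized events per calendar year.
    ARCHIVE = {
        2016: [{"indicator": "evil.example.com", "host": "srv-12"}],
        2017: [{"indicator": "203.0.113.9", "host": "wks-41"}],
    }

    def load_archive(year):
        return ARCHIVE.get(year, [])    # rehydrate one year of cold storage

    def unload_archive(year):
        pass                            # release the temporary hot capacity

    def historical_hunt(archive_years, iocs):
        """Seed a deep-dive hunt of archived logs with the latest IoC feed."""
        findings = []
        for year in archive_years:
            events = load_archive(year)
            findings += [ev for ev in events if ev["indicator"] in iocs]
            unload_archive(year)        # keep the billable hot window short
        return findings

    print(historical_hunt([2016, 2017], {"evil.example.com"}))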

Caution over infinite event log retention may be warranted, however. If the breached organization only has a couple years of logs, versus being able to trace breach inception to, say, four years earlier, their public disclosure to customers may sound worse to some ears (including regulators). For example, disclosing “we can confirm customers over the last two years are affected” is a weaker disclosure than “customers since July 4th 2015 are affected”. Finding the sweet-spot in log retention needs to be a board-level decision.

Having moved to cloud SIEM, CISOs also need to decide what logs should be included and what log settings should be used.

Ideally, all event logs should be passed to the cloud SIEM. That is because the AI and log analytics systems powering threat detection and automated response thrive on data. Additionally, inclusion of logs from the broadest spectrum of enterprise devices and applications will help reduce detection times and remove potential false positives, which increases overall confidence in the system’s recommendations.

Most applications and networked appliances allow for different levels of logging, scaling from errors-only output through warnings and status messages to full debugging information. In general, the greater the detail in the event logs, the greater the value they bring to cloud SIEM. In this way, upgrading from “normal” to “verbose” log settings can offer several threat response advantages – particularly when it comes to handling misconfigurations and criticality determination.
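
Using Python’s standard logging levels as a stand-in for an appliance’s knobs, the gap between “normal” and “verbose” is a one-line setting, and the extra detail is exactly what cloud SIEM analytics feed on:

    import logging

    logging.basicConfig(level=logging.DEBUG)      # "verbose"; WARNING would be "normal"
    log = logging.getLogger("appliance")

    log.error("certificate validation failed")    # emitted at any setting
    log.warning("fallback cipher negotiated")     # lost at an errors-only setting
    log.debug("TLS handshake parameters: ...")    # the misconfiguration detail that
                                                  # verbose logging surfaces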

The symbiotic development of cloud SIEM and cloud AI innovation continues at an astounding pace. While cloud SIEM may be new for most organizations, its ability to harness the innate capabilities of public cloud is transforming security operations. Not only are threats being uncovered quicker and responses managed more efficiently, but continual advancements in the core AI make the technology more valuable while the costs of operating SIEM and storing data in the cloud continue to drop. This makes it possible for companies to make pragmatic use of the intelligent cloud by operating on a one-year window of hot data while getting value out of older, cold-stored data through twice-a-year threat hunts.

-- Gunter Ollmann

First Published: SecurityWeek - July 22, 2019

Tuesday, April 30, 2019

To Reach SIEM’s Promise, Take a Lesson From World War II

With two of the largest public cloud providers having launched their cloud Security Information and Event Management (SIEM) products, and an inevitability that the remainder of the top five cloud providers will launch their own permutations sometime this year, 2019 is clearly the year of the cloud SIEM.

For an on-premises technology that has been cursed with a couple decades of over-promising, under-achieving, and eye-watering cost escalation, modernizing SIEM into a cloud native security technology is a watershed moment for the InfoSec community.

The promise of finally being able to analyze all the logs, intelligence, and security data of an enterprise in real-time opens the door to many great and obvious things. We can let the SIEM vendors shout about all the obvious defensive value cloud SIEM brings. Instead, I’d like to focus on a less obvious but arguably more valuable long-term contribution that a fully capable cloud SIEM brings to enterprise defense.

Assuming an enterprise invests in bringing all their network logs, system events, flow telemetry, and security events and alerts together into the SIEM, businesses will finally be able to track threats as they propagate in an environment. Most importantly, they’ll be able to easily identify and map the “hotspots” of penetration and compromise, and remedy accordingly.

A unified view will also allow analysts and security professionals to pinpoint the spots where compromises remain hidden from peering eyes. As enterprises strive to deploy and manage an arsenal of threat detection, configuration management, and incident response tools in increasingly dynamic environments, visibility and coverage wax and wane with each employee addition, wireless router hook-up, application installation, or SaaS business connection. Those gaps, whether temporary or permanent, tend to attract an unfair share of compromise and harm.

In World War II, a gentleman by the name of Abraham Wald was a member of Columbia University’s Statistical Research Group (SRG). One problem SRG was tasked with was examining the distribution of damage to returning aircraft and advising on how to minimize bomber losses to enemy fire. A premise of the research was that the areas of the bombers that were most damaged and therefore most susceptible to flak should be redesigned and made more robust. Wald noted that such a study was biased toward the aircraft that survived their missions and that, if you were to assume damage was more uniformly distributed across all aircraft, those that returned had actually been hit in the less vulnerable parts. By mapping the damage done to the surviving aircraft, the “undamaged” areas represented the most vulnerable parts of the aircraft that didn’t survive to return.

Wald’s revelations and work were seminal in the early days of Operational Research – a discipline of applying advanced analytical methods to help make better decisions. I expect cloud SIEM and the integration of AI systems to usher Operational Research and its associated disciplines into the information security sector. Securing an enterprise is a highly complex and dynamic problem and, because Operational Research is focused on optimizing solutions for complex decision-making problems, it is well suited to finding solutions that balance the multi-faceted aspects of business continuity and risk.

As we’re in the early days of cloud SIEM, I’ve yet to see much in the area of employing native AI to address the cold-spots in enterprise threat visibility. The focus to date has been applying AI to threat hunting and to automating the reconstruction of the kill chain associated with an in-progress attack, supplementing that visualization with related threat intelligence and historical data artifacts.
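
Wald’s lesson translates into a first-pass check any SOC could run today: invert the question from “where are the alerts?” to “which instrumented segments have produced no telemetry at all?”. A minimal sketch, with both inventories invented for illustration:

    # Segments the asset inventory says are instrumented...
    deployed_segments = {"corp-lan", "dmz", "cloud-prod", "branch-wifi", "ot-floor"}
    # ...versus segments that actually produced events in the SIEM this week.
    segments_emitting = {"corp-lan", "dmz", "cloud-prod"}

    cold_spots = deployed_segments - segments_emitting
    print(f"no telemetry received – investigate coverage first: {sorted(cold_spots)}")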

Putting on a forecasting hat, I expect much of the immediate adoption and growth of cloud SIEM will be driven by the desire to realize the promises of on-premises SIEM – in particular, using supervised-learning systems to automate the detection and mitigation of the threats that have pestered security operations teams for twenty-plus years. Infusing SIEM natively on the cloud provider’s platform also creates end-to-end visibility into security-related events inside a business’s environment and pieces in valuable intelligence from the cloud provider’s own operations – thereby harnessing the “cloud effects” of collective intelligence and removing the classic requirement for a “patient zero” to initiate an informed response.

What I hope is, once engineering teams have matured those hunting and mitigation capabilities by weaving in AI decision systems and real-time data processing, the “science” of information security can finally come up for air and move forward.

Leveraging the inherent power and scale of public cloud for real-time analytics of enterprise security data at streaming rates means that we’re at the cusp of finally calculating the ROI of each security technology deployed inside an enterprise. That alone should have many CISOs and CFOs jumping for joy. With all the enterprise security data flowing to one place, the cloud SIEM also becomes the anchor for IT operations – such as tracking the “mean time between failures” (MTBF) of protected systems, providing robustness metrics for software assets and system updates, and surfacing the latent risks of the environments being monitored.

75 years may separate World War II from cloud SIEM, but we’re on the cusp of being able to apply the hard-earned learnings from Abraham Wald in our latest adversarial conflict – the cyberwar.

-- Gunter Ollmann

First Published: SecurityWeek - April 30, 2019

Thursday, March 8, 2018

NextGen SIEM Isn’t SIEM

Security Information and Event Management (SIEM) is feeling its age. Harkening back to a time when businesses were prepping for the dreaded Y2K and the cutting edge of security technology was bound to DMZs, bastion hosts, and network vulnerability scanning, SIEM has been along for the ride as both defenses and attackers have advanced over the intervening years. Nowadays, though, it feels less like a ride with SIEM and more like towing an anchor.

Despite the deepening trench gouged by the SIEM anchor slowing down threat response, most organizations persist in throwing more money and resources at it. I’m not sure whether it’s because of a sunk cost fallacy or the lack of a viable technological alternative, but they continue to diligently trudge on with their SIEM – complaining with every step. I’ve yet to encounter an organization that feels its SIEM is anywhere close to scratching its security itch.

The SIEM of Today
The SIEM of today hasn’t changed much over the last couple of decades, its foundation being the real-time collection and normalization of events from a broad scope of security event log sources and threat-alerting tools. Its primary objective was to manage and overcome the cacophony of alerts generated by the hundreds, thousands, or millions of sensors and logging devices scattered throughout an enterprise network – automatically generating higher-fidelity alerts using a variety of analytical approaches – and to display a more manageable volume of information via dashboards and reports.
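
The normalization at the core of that pipeline reduces to mapping each source’s record shape onto one common event schema. Below is a minimal sketch in Python; the source names and field names are invented for illustration, not any particular SIEM’s schema:

    def normalize(source, record):
        """Map a source-specific record onto a common event shape."""
        if source == "firewall":
            return {"ts": record["time"], "host": record["src_ip"],
                    "action": record["disposition"], "source": source}
        if source == "av_agent":
            return {"ts": record["detected_at"], "host": record["hostname"],
                    "action": "malware_" + record["verdict"], "source": source}
        # Unknown feeds are kept and flagged rather than silently dropped.
        return {"ts": record.get("time"), "host": record.get("host"),
                "action": "unparsed", "source": source, "raw": record}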

As the variety and scope of devices providing alerts and logs continue to increase (often exponentially), consolidated SIEM reporting has had to focus upon statistical analytics and trend displays to keep pace with the streaming data – increasingly focused on the overall health of the enterprise rather than on threat detection and event risk classification.

Whilst the collection of alerts and logs is conducted in real-time, the ability to aggregate disparate intelligence and alerts to identify attacks and breaches has fallen to offline historical analysis via searches and queries – giving birth to the Threat Hunter occupation in recent years.

Along the way, SIEM has become the beating heart of the Security Operations Center (SOC) – particularly over the last decade – and it is often difficult for organizations to disambiguate SIEM from SOC. Not unlike Frankenstein’s monster, additional capabilities have been grafted onto today’s operationalized SIEMs: advanced forensics and threat hunting capabilities now dovetail into SIEM event archive databases, a new generation of automation and orchestration tools has instantiated playbooks that process aggregated logs, and ticketing systems track responders’ efforts to resolve and mitigate threats.

SIEM Weakness
There is, however, a fundamental weakness in SIEM, and it has become increasingly apparent over the last half-decade as more advanced threat detection tools and methodologies have evolved – facilitated by the widespread adoption of machine learning (ML) technologies and machine intelligence (MI).

Legacy threat detection systems such as firewalls, intrusion detection systems (IDS), network anomaly detection systems, anti-virus agents, network vulnerability scanners, etc. have traditionally had a high propensity towards false positive and false negative detections. Compounding this, for many decades (and still a large cause for concern today) these technologies have been sold and marketed on their ability to alert in volume – i.e. an IDS that can identify and alert upon 10,000 malicious activities is too often positioned as “better” than one that only alerts upon 8,000 (regardless of alert fidelity). Alert aggregation and normalization is of course the bread and butter of SIEM.

In response, a newer generation of vendors have brought forth new detection products that improve and replace most legacy alerting technologies – focused upon not only finally resolving the false positive and false negative alert problem, but to move beyond alerting and into mitigation – using ML and MI to facilitate behavioral analytics, big data analytics, deep learning, expert system recognition, and automated response orchestration.

The growing problem is that these new threat detection and mitigation products don’t output alerts compatible with traditional SIEM processing architectures. Instead, they provide output such as evidence packages, logs of what was done to automatically mitigate or remediate a detected threat, and talk in terms of statistical risk probabilities and confidence values – having resolved a threat to a much higher fidelity than a SIEM could. In turn, “integration” with SIEM is difficult and all too often meaningless for these more advanced technologies.

A compounding failure with the new ML/MI powered threat detection and mitigation technologies lies with the fact that they are optimized for solving a particular class of threats – for example, insider threats, host-based malicious software, web application attacks, etc. – and have optimized their management and reporting facilities for that category. Without a strong SIEM integration hook there is no single pane of glass for SOC management; rather a half-dozen panes of glass, each with their own unique scoring equations and operational nuances.

Next Generation SIEM
If traditional SIEM has failed and is becoming more of a bugbear than ever, and the latest generation of ML and MI-based threat detection and mitigation systems aren’t on a trajectory to coalesce by themselves into a manageable enterprise suite (let alone a single pane of glass), what does the next generation (i.e. NextGen) SIEM look like?

Looking forward, next-generation SIEM isn’t SIEM; it’s an evolution of SOC – or, to license a more prescriptive turn of phrase, “SOC-in-a-box” (and inevitably “Cloud SOC”).

The NextGen SIEM lies in the natural evolution of today’s best hybrid-SOC solutions. The Frankenstein add-ins and bolt-ons that have extended the life of SIEM for a decade are the very fabric of what must ascend and replace it.

For the NextGen SIEM – SOC-in-a-box, Cloud SOC, or whatever buzzword the professional marketers eventually pronounce – to be successful, the core tenets of operation will necessarily include:
  • Real-time threat detection, classification, escalation, and response. Alerts, log entries, threat intelligence, device telemetry, and indicators of compromise (IOC), will be treated as evidence for ML-based classification engines that automatically categorize and label their discoveries, and optimize responses to both threats and system misconfigurations in real-time.
  • Automation is the beating heart of SOC-in-a-box. With no signs of data volumes falling, networks becoming less congested, or attackers slackening off, automation is the key to scaling to the business’s needs. Every aspect of SOC must be designed to be fully autonomous, self-learning, and elastic.
  • The vocabulary of security will move from “alerted” to “responded”. Alerts are merely one form of telemetry that, when combined with overlapping sources of evidence, lay the foundation for action. Businesses need to know which threats have been automatically responded to, and which are awaiting a remedy or response.
  • The tier-one human analyst role ceases to exist, and playbooks will be self-generated. The process of removing false positives and gathering corroborating evidence for true-positive alerts can be done much more efficiently and reliably using MI. In turn, threat responses by tier-two or tier-three analysts will be learned by the system – automatically constructing and improving playbooks with each repeated response.
  • Threats will be represented and managed in terms of business risk. As alerts become events, “criticality” will be influenced by age, duration, and threat level, and will sit adjacent to “confidence” scores that take into account the reliability of sources. Device auto-classification and responder monitoring will provide the framework for determining the relative value of business assets, and consequently the foundation for risk-based prioritization and management (a toy scoring sketch follows this list).
  • Threat hunting will transition to evidence review and preservation. Threat hunting grew from the failures of SIEM to correctly and automatically identify threats in real-time. The methodologies and analysis playbooks used by threat hunters will simply be part of what the MI-based system incorporates in real-time. Threat hunting experts will in turn focus on preservation of evidence in cases where attribution and prosecution become probable or desirable.
  • Hybrid networks become native. The business network – whether it exists in the cloud, on premise, at the edge, or in the hands of employees and customers – must be monitored, managed, and have threats responded to as a single entity. Hybrid networks are the norm and attackers will continue to test and evolve hybrid attacks to leverage any mitigation omission.
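
As a toy illustration of the business-risk bullet above, a scoring function might weigh criticality (driven by age, duration, and threat level) against source confidence and auto-classified asset value. All weights, caps, and scales below are invented for illustration:

    def risk_score(threat_level, age_hours, duration_hours,
                   source_reliability, asset_value):
        """Toy risk model: criticality x confidence x asset value."""
        # Criticality grows with age (capped at 72h) and duration (capped at 24h).
        criticality = threat_level * (1 + min(age_hours, 72) / 72) \
                                   * (1 + min(duration_hours, 24) / 24)
        confidence = source_reliability    # 0.0 (untrusted) .. 1.0 (gold source)
        return criticality * confidence * asset_value

    # An aging, high-confidence threat on a crown-jewel server:
    print(risk_score(threat_level=0.8, age_hours=48, duration_hours=12,
                     source_reliability=0.9, asset_value=1.5))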

Luckily, the NextGen SIEM is closer than we think. As SOC operations have increasingly adopted the cloud to leverage elastic compute and storage capabilities, hard-learned lessons in automation and system reliability from the growing DevOps movement have further defined the blueprint for SOC-in-a-box. Meanwhile, the current generation of ML-based and MI-defined threat detection products, combined with rapid evolution of intelligence graphing platforms, have helped prove most of the remaining building blocks.

These are not wholly additions to SIEM, and SIEM isn’t the skeleton of what will replace it.

The NextGen SIEM starts with the encapsulation of the best and most advanced SOC capabilities of today, incorporates its own behavioral and threat detection capabilities, and dynamically learns to defend the organization – finally reporting on what it has successfully resolved or mitigated.

-- Gunter Ollmann