
SOAR (security orchestration, automation, and response) is a set of software solutions and tools that simplify security operations in three key areas: threat and vulnerability management, incident response, and security operations automation. Security automation is the automatic handling of security tasks: performing activities such as scanning for vulnerabilities or checking logs without human involvement. Security orchestration describes how security tools are connected and how different systems are integrated. It is the connection layer that streamlines security processes and makes automation possible.

Organizations today face many challenges in attaining their strategic security objectives. First, finding talent takes time, so once organizations have hired the best candidates, they want them to focus on the most impactful work rather than get lost in recurring, time-consuming manual activities. Second, organizations typically rely on technologies that several departments or teams touch and collaborate on, yet the different components do not always integrate.

That is where security orchestration and automation come in. With an effective security orchestration, automation, and response (SOAR) solution, much more can be achieved in less time, while still leaving room for human decision-making when it is most critical. Instead of relying on point-to-point integrations across their technology stack, organizations can rely on a solution that lets them build processes and connect the right people and technology to attain their security objectives.

Benefits of SOAR

SOAR helps with building workflows and streamlining operations. One effective approach to the orchestration layer is to use a framework that comes with a plugin repository for the most widely used technologies and a set of pre-built workflows for specific use cases, so that you can quickly connect your application stack and simplify security and IT processes. You will probably also need to create orchestrations or workflows that are specific to your team, so pre-built models or building blocks that are simple to use will help accelerate the process.

SOAR can help to improve versatility, collaboration, and extensibility. A security orchestration, automation, and response approach can give you versatility and additional opportunities to collaborate. Whether you are adjusting workflows for your organization, developing and maintaining integrations, or building entirely new systems, it is essential to look for a vendor that is willing to partner with you. A lasting partnership with a community perspective will help you improve your security posture and reach your security orchestration and automation goals. Your partner should set you up for success and collaborate with you to accomplish your goals. They should consider the use cases you are trying to automate and help you find options you may not have considered, all backed by easy-to-understand documentation and support.

Splunk Phantom

Security teams work hard to identify, analyze, and mitigate the threats their organizations face. These teams also deal with an endless assembly line of point products and independent, static security controls with very little orchestration between them. Add the fact that most companies do not have enough security resources to analyze their regular volume of security alerts, and the result is a growing backlog of security incidents. Organizations aim to make better use of existing resources by adopting strategies that improve efficiency and scale, establishing a unified security fabric that is greater than the sum of its parts.

Splunk Phantom can maximize SOC efficiency with security orchestration, automation, and response (SOAR) capabilities that allow analysts to boost productivity and shorten incident response times. Splunk states that Phantom improves the scalability, performance, and speed of security automation, with the ability to handle 50,000 threat incidents per hour. Using Phantom, companies can improve security and manage risk better by integrating teams, systems, and resources. Security teams can automate tasks, orchestrate workflows, and support a broad range of security operations center (SOC) functions, including incident and case management, collaboration, and monitoring.

The diagram shows the end-to-end flow of security automation in Splunk Phantom.

Orchestration

Phantom is the connective tissue that helps existing security tools work together more effectively. By connecting and coordinating complex workflows across the SOC's team and tools, Phantom ensures that each part of the SOC's layered defense is actively involved in a unified defense strategy. A powerful abstraction layer lets teams concentrate on what they need to achieve, while the platform translates that intent into tool-specific actions.
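To make the idea of an abstraction layer concrete, here is a minimal sketch in plain Python. It is not Phantom's API: the handler functions, the tool names ("firewall-a", "firewall-b"), and the registry are all hypothetical, and a real integration would call each vendor's interface instead of returning a string. The point is only that the caller asks for a generic action and a lookup table decides which tool-specific code runs.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool-specific handlers (assumptions for illustration only).
# A real integration would call the vendor's API here.
def block_ip_on_firewall_a(ip: str) -> str:
    return f"firewall-a: deny rule added for {ip}"


def block_ip_on_firewall_b(ip: str) -> str:
    return f"firewall-b: address {ip} added to blocklist"


# Registry mapping (generic action, tool name) to the handler implementing it.
HANDLERS: Dict[Tuple[str, str], Callable[[str], str]] = {
    ("block ip", "firewall-a"): block_ip_on_firewall_a,
    ("block ip", "firewall-b"): block_ip_on_firewall_b,
}


def run_action(action: str, tool: str, target: str) -> str:
    """Translate a generic action into the matching tool-specific call."""
    try:
        handler = HANDLERS[(action, tool)]
    except KeyError:
        raise ValueError(f"no handler for {action!r} on {tool!r}") from None
    return handler(target)


def run_everywhere(action: str, target: str) -> List[str]:
    """Run one generic action on every registered tool that supports it."""
    return [
        handler(target)
        for (name, _tool), handler in HANDLERS.items()
        if name == action
    ]


if __name__ == "__main__":
    for line in run_everywhere("block ip", "203.0.113.7"):
        print(line)
```

Adding support for another tool in this sketch means registering one more handler; the code that requests "block ip" does not change.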

SOC Automation

Phantom helps teams work faster by executing automated actions across their security infrastructure in seconds, compared with the hours or more that the same work takes when performed manually. Teams can codify workflows into Phantom’s automated playbooks using the visual editor (no code required) or the integrated Python development environment. By offloading these repetitive tasks, teams can focus their attention on the most mission-critical decisions.

Incident Response

Phantom allows security teams to analyze and respond to threats more effectively. Using Phantom’s automated detection, investigation, and response capabilities, teams can carry out response actions at machine speed, reducing the time spent on malware detection and lowering their overall mean time to resolve (MTTR). With Phantom on Splunk Mobile, analysts can respond to security incidents from their mobile devices while on the go. Phantom’s incident and case management functionality can further streamline security operations. Data and activity related to incidents are readily available from one central repository. It’s easy to chat about an incident or case with other team members, and to delegate activities and responsibilities to the team member involved.
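Mean time to resolve is simply the average of (resolution time minus detection time) across incidents. The short sketch below shows the arithmetic in Python; the function name and the sample timestamps are invented for illustration and do not come from Phantom.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(incidents):
    """Average (resolved - detected) over a list of (detected, resolved) pairs."""
    if not incidents:
        raise ValueError("need at least one resolved incident")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Invented sample data: (detected, resolved) timestamps for three incidents.
incidents = [
    (datetime(2021, 1, 4, 9, 0), datetime(2021, 1, 4, 11, 0)),    # 2 h
    (datetime(2021, 1, 5, 14, 0), datetime(2021, 1, 5, 14, 30)),  # 30 min
    (datetime(2021, 1, 6, 8, 0), datetime(2021, 1, 6, 9, 30)),    # 1 h 30 min
]

mttr = mean_time_to_resolve(incidents)
print(mttr)  # 1:20:00
```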

The main components of Splunk Phantom each play a role in delivering end-to-end security automation.

Component: Description
App: An app extends the Splunk Phantom platform by providing access to third-party security technologies. These integrations let Splunk Phantom see and run the actions that the third-party technologies provide. Some apps also provide a visual component, such as widgets, that can be used to render the data the app produces.
Asset: An asset is a specific instance of an app. Each asset represents a physical or virtual device, such as a server, gateway, router, or firewall, inside your organization. For example, you have a Palo Alto Networks (PAN) firewall app that connects the firewall to Splunk Phantom. You then configure an asset with the connection details for a specific firewall. If you have multiple firewalls in your environment, you can configure one asset per firewall.
Container: A container is a security event that is fed into Splunk Phantom. Containers have the default label Events. Labels are used to group related containers; for example, containers from the same product can all have the same label. You can then run a playbook against all containers that share a label. You can create custom labels in Splunk Phantom as required.
Case: A case is a special kind of container that can hold other containers. For example, if you have several closely related containers for a security event, you can promote one of those containers to a case and then add the other related containers to it. This helps you organize the investigation rather than investigate each container separately.
Artifact: An artifact is a piece of information added to a container, such as a file hash, IP address, or email header.
Indicator, or Indicator of Compromise (IOC): An indicator is a piece of data, such as an IP address, hostname, or file hash, that populates the Common Event Format (CEF) fields in an artifact. Indicators are the smallest unit of data that Splunk Phantom can act upon.
Playbook: A playbook defines a series of automation tasks that act on new data entering Splunk Phantom. For example, you can configure a playbook to run automatically against all new containers that have a specific label, or you can configure a playbook to run as part of the workflow in a workbook.
Workbook: A workbook is a template that lists the standard tasks analysts can follow when evaluating containers or cases.
Action: An action is a high-level primitive used throughout the Splunk Phantom platform, such as get process dump, block IP, suspend VM, or terminate process. Actions are run in playbooks or manually from the Splunk Phantom web interface. Actions are made available to Splunk Phantom by apps.
Owner: An owner is responsible for managing assets in the organization. Owners receive approvals, which are requests to carry out a specific action on a particular asset. Approvals are sent to asset owners with a Service Level Agreement (SLA) dictating the expected response time. Approvals are sent first to the primary asset owners. If the SLA is breached, the approval is redirected to the secondary asset owners. SLAs can be set on events, phases, and tasks.
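The relationship between containers, labels, artifacts, and playbooks can be illustrated with a small model in plain Python. This is a sketch of the concepts only and does not use Phantom's playbook API: the `Container` and `Artifact` classes, the `playbook` decorator, and the `ingest` function are all hypothetical names, and the lowercase label string "events" is an assumption standing in for the default Events label.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Artifact:
    name: str
    cef: Dict[str, str]  # CEF-style fields, e.g. {"sourceAddress": "198.51.100.4"}


@dataclass
class Container:
    name: str
    label: str = "events"  # stand-in for the default label
    artifacts: List[Artifact] = field(default_factory=list)


# Hypothetical registry: label -> playbooks that run on containers with that label.
PLAYBOOKS: Dict[str, List[Callable[[Container], List[str]]]] = {}


def playbook(label: str):
    """Register a function as a playbook for containers carrying `label`."""
    def register(func):
        PLAYBOOKS.setdefault(label, []).append(func)
        return func
    return register


@playbook("events")
def list_source_addresses(container: Container) -> List[str]:
    """Collect the sourceAddress indicator from every artifact that has one."""
    return [
        artifact.cef["sourceAddress"]
        for artifact in container.artifacts
        if "sourceAddress" in artifact.cef
    ]


def ingest(container: Container) -> List[str]:
    """Run every playbook registered for the container's label."""
    results: List[str] = []
    for run in PLAYBOOKS.get(container.label, []):
        results.extend(run(container))
    return results


event = Container(
    name="suspicious login",
    artifacts=[
        Artifact("login attempt", {"sourceAddress": "198.51.100.4"}),
        Artifact("email header", {"fromEmail": "sender@example.com"}),
    ],
)
print(ingest(event))  # ['198.51.100.4']
```

A container with a label that has no registered playbook passes through `ingest` untouched, which mirrors the idea that playbooks are scoped by label.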

Apache Airflow

Apache Airflow is an open-source data workflow management and orchestration tool that was initially developed at Airbnb in 2014. Example data workflows it covers include:

  • Automating the training, testing, and deployment of a machine learning model
  • Ingesting data from several REST APIs
  • Coordinating extract, transform, and load (ETL) or extract, load, and transform (ELT) processes across a business data lake

One of Airflow’s key strengths is its flexibility: it can be used for many different data workflow scenarios. Thanks to this flexibility and its rich feature set, it has gained considerable traction over the years and has been battle-tested at many companies, from startups to Fortune 500 enterprises. Spotify, Twitter, Walmart, Slack, Robinhood, Reddit, PayPal, Lyft, and, of course, Airbnb are some examples.

Apache Airflow was declared a Top-Level Project by the Apache Software Foundation. It has since gained tremendous popularity in the data community, well beyond hardcore data engineers. Today, Airflow is used to solve a range of problems in data ingestion, preparation, and consumption. A key issue addressed by Airflow is the integration of data between various systems such as behavioral analytics systems, CRMs, data warehouses, data lakes, and the BI tools used for deeper analysis and AI. In addition, Airflow can orchestrate complex ML workflows. Airflow is designed as a configuration-as-code system and can be heavily customized with plug-ins. Airflow workflows are built as Directed Acyclic Graphs (DAGs) of tasks.

(Note: Airflow entered the Apache Incubator in 2016 and graduated to a Top-Level Project in January 2019, so it is no longer in incubator status.)

A DAG is a structure of nodes and connectors (also called “edges”) in which every connector has a direction and no path along the connectors ever leads back to the node it started from. Trees are one example of a DAG. Airflow workflows consist of tasks whose output is the input of another task, so an ETL process is a form of DAG as well: the output of each step is used as the input of the next, and you cannot loop back to a previous step. Defining workflows in code makes maintenance, testing, and versioning simpler.

Airflow is not a data streaming platform. Tasks represent the movement of data, but they do not move the data themselves, so Airflow is not a tool for interactive ETL. An Airflow workflow is a Python script that defines a DAG object, which can then be used to drive the ETL process. Jinja templating provides built-in parameters and macros for these scripts (Jinja is a Python templating language modeled after Django templates).
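The DAG property can be demonstrated without Airflow itself. The sketch below uses Python's standard-library `graphlib` module (Python 3.9+), not Airflow's API, to order a three-step ETL workflow by its dependencies and to show that adding a loop back to an earlier step is rejected. The task names are illustrative.

```python
from graphlib import CycleError, TopologicalSorter

# Each key is a task; its value is the set of tasks that must finish first.
etl = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def execution_order(dependencies):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


print(execution_order(etl))  # ['extract', 'transform', 'load']

# Making "extract" depend on "load" creates a loop, so this is no longer a DAG.
looped = {**etl, "extract": {"load"}}
try:
    execution_order(looped)
except CycleError:
    print("not a DAG: the tasks depend on each other in a loop")
```

A workflow engine performs the same kind of ordering when it decides which task can run next, which is why a workflow with a cycle cannot be scheduled.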

Apache Airflow is a general-purpose data toolbox that supports custom plugins. Such plugins can add functionality, interact with various data storage platforms (e.g. Amazon Redshift, MySQL), and manage more complex data and metadata interactions. Airflow integrates with Amazon Web Services (AWS) and Google Cloud Platform (GCP), including BigQuery. Hooks and operators are available for AWS and GCP, and additional integrations may become available as Airflow matures.

Benefits and features

Now we know where Airflow is used. Plenty of other systems offer similar functionality, but there are several reasons why Airflow stands out.

Community Support

Airflow was started at Airbnb in 2014 and open-sourced in 2015. Since then the Airflow community has grown: it has over 1,000 contributors, and the number is increasing at a healthy rate.

Extensibility and Flexibility

Apache Airflow is extensible, which allows it to fit custom use cases. The ability to add custom hooks, operators, and other plug-ins lets users implement new use cases quickly rather than relying solely on the built-in Airflow operators. Built by many data engineers, Airflow addresses a wide range of data engineering use cases. While Airflow is not perfect, the community is working on several important features to improve the platform’s functionality and stability.
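The extension pattern behind custom operators is subclass-and-override: a base class fixes the interface, and each custom operator supplies its own `execute` logic. The sketch below shows that shape in plain Python. The `Operator` base class here is a hypothetical stand-in, not Airflow's own class, and `RowCountOperator`, `fake_fetch`, and the `run_date` context key are invented for illustration.

```python
from abc import ABC, abstractmethod


class Operator(ABC):
    """Hypothetical stand-in for a workflow engine's operator base class."""

    def __init__(self, task_id):
        self.task_id = task_id

    @abstractmethod
    def execute(self, context):
        """Do the task's work; `context` carries run-time values."""


class RowCountOperator(Operator):
    """Custom operator that counts the rows returned by a fetch function."""

    def __init__(self, task_id, fetch_rows):
        super().__init__(task_id)
        self.fetch_rows = fetch_rows

    def execute(self, context):
        rows = self.fetch_rows(context["run_date"])
        return {
            "task_id": self.task_id,
            "run_date": context["run_date"],
            "rows": len(rows),
        }


def fake_fetch(run_date):
    """Invented data source; a real one would query a database or an API."""
    return [("a", 1), ("b", 2), ("c", 3)]


task = RowCountOperator("count_orders", fake_fetch)
result = task.execute({"run_date": "2021-01-01"})
print(result)  # {'task_id': 'count_orders', 'run_date': '2021-01-01', 'rows': 3}
```

Because the engine only relies on the shared interface, a new kind of task can be added without changing the code that schedules and runs tasks.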

Dynamic Pipeline Generation

Airflow pipelines are configuration as code (Python), which enables dynamic pipeline generation: you can write code that creates pipeline instances dynamically. This matters because data processing is rarely linear and static. Airflow models work as declared dependencies rather than as a step-by-step sequence. Step-by-step definitions work for a handful of steps but break down quickly as the number of steps grows; Airflow simplifies this modeling by deriving the flow from the declared dependencies. Another benefit of pipelines as code is that they are transparent and easy to modify. Airflow is simpler than many alternatives and provides greater detail and accountability for changes over time, which helps when rolling a solution forward or back. Although not many users take advantage of Airflow in this way, Airflow can evolve with you as your data practice matures.
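As a rough illustration of dynamic generation from declared dependencies, the sketch below builds a pipeline in a loop from a list of data sources and lets the dependencies determine the run order. It uses Python's standard-library `graphlib` (Python 3.9+) rather than Airflow, and the source names, the task naming scheme, and the `build_pipeline` helper are assumptions made for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical list of data sources; adding a name adds two tasks to the pipeline.
SOURCES = ["orders", "customers", "payments"]


def build_pipeline(sources):
    """Return {task: set of upstream tasks} with one extract/load pair per source."""
    dependencies = {"start": set(), "report": set()}
    for name in sources:
        extract, load = f"extract_{name}", f"load_{name}"
        dependencies[extract] = {"start"}   # every extract waits for "start"
        dependencies[load] = {extract}      # each load waits for its own extract
        dependencies["report"].add(load)    # the report waits for every load
    return dependencies


pipeline = build_pipeline(SOURCES)
order = list(TopologicalSorter(pipeline).static_order())

print(len(pipeline), "tasks")
print("first:", order[0], "| last:", order[-1])
```

Nothing in the sketch spells out the sequence by hand. Only the dependencies are declared, so adding a fourth source changes the pipeline without touching the ordering logic.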
