incubator-cvs mailing list archives

From Apache Wiki <>
Subject [Incubator Wiki] Update of "MetronProposal" by OwenOmalley
Date Mon, 30 Nov 2015 16:37:03 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change notification.

The "MetronProposal" page has been changed by OwenOmalley:

New page:
= Apache Metron Proposal =

== Abstract ==

The Metron project is an open source project dedicated to providing an extensible and scalable
advanced security analytics tool. It has strong foundations in the Apache Hadoop ecosystem.

== Proposal ==

Metron integrates a variety of open source big data technologies in order to offer a centralized
tool for security monitoring and analysis. Metron provides capabilities for log aggregation,
full packet capture indexing, storage, advanced behavioral analytics and data enrichment,
while applying the most current threat-intelligence information to security telemetry within
a single platform.

Metron can be divided into four areas:

  1. '''A mechanism to capture, store, and normalize any type of security telemetry at extremely
high rates.''' Because security telemetry is constantly being generated, it requires a method
for ingesting the data at high speeds and pushing it to various processing units for advanced
computation and analytics.
  1. '''Real time processing and application of enrichments''' such as threat intelligence,
geolocation, and DNS information to telemetry being collected. The immediate application of
this information to incoming telemetry provides the context and situational awareness, as
well as the “who” and “where” information that is critical for investigation.
  1. '''Efficient information storage''' based on how the information will be used:
    a. Logs and telemetry are stored such that they can be efficiently mined and analyzed
for concise security visibility
    a. The ability to extract and reconstruct full packets helps an analyst answer questions
such as who the true attacker was, what data was leaked, and where that data was sent
    a. Long-term storage not only increases visibility over time, but also enables advanced
analytics such as machine learning techniques to be used to create models on the information.
Incoming data can then be scored against these stored models for advanced anomaly detection.
  1. '''An interface that gives a security investigator a centralized view of data and alerts
passed through the system.''' Metron’s interface presents alert summaries with threat intelligence
and enrichment data specific to that alert on a single page. Furthermore, advanced search
capabilities and full packet extraction tools are available to the analyst for investigation
without the need to pivot into additional tools.
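
As an illustration of the enrichment step described in the second area, here is a minimal sketch. It assumes a telemetry message is a dictionary with a `src_ip` field and that the geolocation and threat intelligence data are small in-memory lookups. The names `enrich`, `GEO_TABLE`, and `THREAT_IPS` are hypothetical and are not taken from the OpenSOC code base.

```python
from typing import Any, Dict

# Hypothetical lookup data (documentation-range IP addresses). A real
# deployment would load these from external enrichment sources and
# threat intelligence feeds rather than hard-coding them.
GEO_TABLE: Dict[str, str] = {"203.0.113.7": "AU", "198.51.100.23": "US"}
THREAT_IPS = {"203.0.113.7"}


def enrich(message: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the message with geolocation and threat-intel context."""
    enriched = dict(message)  # leave the original telemetry untouched
    src_ip = message.get("src_ip")
    # "Where": country code looked up from the source address.
    enriched["geo_country"] = GEO_TABLE.get(src_ip, "unknown")
    # Threat intelligence: flag the message if the source is a known bad address.
    enriched["is_threat"] = src_ip in THREAT_IPS
    return enriched


if __name__ == "__main__":
    print(enrich({"src_ip": "203.0.113.7", "bytes": 512}))
```

In a streaming deployment a function like this would be applied to every message as it arrives, so that the context is attached before the message is stored or shown to an analyst.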

Big data is a natural fit for powerful security analytics. The Metron framework integrates
a number of elements from the Hadoop ecosystem to provide a scalable platform for security
analytics, incorporating such functionality as full-packet capture, stream processing, batch
processing, real-time search, and telemetry aggregation. With Metron, our goal is to tie big
data into security analytics and drive towards an extensible centralized platform to effectively
enable rapid detection and rapid response for advanced security threats.

== Background ==

OpenSOC was developed by Cisco over the last two years and published on GitHub
under the ALv2. However, development was mostly closed and has largely stopped. As evidence
of the inactivity, users have complained that pull requests go unanswered for long periods.
Finally, no public releases of OpenSOC have been made. From an Apache point of view, the current
community is not viable.

However, some of the developers of the project have left Cisco and have found interest from
several others that would like to work together to form an active and open community at Apache
starting from the current OpenSOC code base. A message to the current support group proposing
moving to Apache got a single positive response.

Because Cisco is not currently interested in being involved, the project expects to change
its name. The project would like to use Metron, although we will perform a podling name
search to check for conflicts. Metron, meaning measure, is half of the Greek root for the
word 'telemetry.' Metron is also a DC Comics character who “... wanders in search of greater
knowledge beyond his own”.

== Rationale ==
Metron strives to move the state of the art in security analytics forward.  We want to move
away from the proprietary nature of legacy security point tools and develop an open platform
where people can contribute and share datasets, machine learning models, telemetry parsers,
sources of telemetry enrichment, and threat intelligence feeds.  Cyber security is too large
of a problem for a single corporation to tackle on its own and the current tooling is too
fragmented and proprietary for us to be able to rally around a single tool or vendor.  

In addition to being open and facilitating advancement in security analytics, Metron has several
advantages over a conventional Security Information and Event Management (SIEM) system.

  * Metron uses an entirely open source stack under the hood and runs on commodity hardware. This
means Metron is much cheaper to run than the competition. In security, cost plays a major
role because the cost of a countermeasure for monitoring and reacting to a threat should
not exceed the cost of what is being protected. By driving down the cost of security, the
economics work for more assets to be monitored, which means more secure data centers.
  * Metron, being in the open, allows additional vetting and scrutiny by the open source community
for all of its components.  This is a better model for a security-oriented tool than doing
it closed source.  All the problems should be flushed out and fixed in the open. The closed
source competition does not have this kind of rigor, is motivated by marketing and sales,
and thus, does not inspire confidence when it comes to security.
  * Being Hadoop-based, Metron can process unprecedented volumes of streaming data via Apache
Storm. When an organization is hit with malware or malicious behavior, this most commonly
happens as part of a global malware campaign, signatures for which are known and available
from third party threat intelligence feeds. Having the ability to take in all the feeds and
reference them against every telemetry message processed by Metron in real time not only
facilitates detection of such campaigns, but also changes the economics for the “bad guys”.
If attackers have to customize their malware for each target, these global attacks become
a lot more expensive and less viable for them.
  * Metron strives to shift conventional SOC workflows away from being rules-driven to a more
data-driven approach that incorporates machine learning and a higher degree of automation
and autonomous detection. The modern threat landscape is too dynamic to be manageable via
static rules alone, which is what conventional SIEMs rely on. Rule bases tend to bloat and,
if improperly maintained, turn into sources of false positive alerts.
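
To make the contrast between a static rule and a data-driven score concrete, here is a minimal sketch. It uses a simple z-score against a historical baseline as a stand-in for the machine learning models mentioned above; it is not the project's method. The function names and the sample values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def fit_baseline(history: Sequence[float]) -> Tuple[float, float]:
    """Summarize stored telemetry (data at rest) as a mean and standard deviation."""
    return mean(history), stdev(history)  # stdev needs at least two values


def anomaly_score(value: float, baseline: Tuple[float, float]) -> float:
    """Score an incoming value by its distance from the baseline, in standard deviations."""
    mu, sigma = baseline
    if sigma == 0:
        # Every historical value was identical: any deviation is maximally unusual.
        return 0.0 if value == mu else float("inf")
    return abs(value - mu) / sigma


if __name__ == "__main__":
    # Hypothetical history: connections per minute from one host.
    baseline = fit_baseline([10, 12, 11, 9, 13])
    for observed in (11, 50):
        print(observed, round(anomaly_score(observed, baseline), 2))
```

Unlike a fixed threshold, the baseline is derived from each source's own history, so it can be refitted as behavior changes instead of being maintained by hand.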

The ability to analyze and model large volumes of data at rest and then being able to push
up the output of that into a stream processor is essential in disrupting the 

== Current Status ==

As stated in the background section, the current community isn’t healthy, which is why we
are proposing moving to Apache Incubator. In this section, we will describe the current state
of the OpenSOC project.

=== Meritocracy ===
The OpenSOC development is controlled by Cisco and pull requests are being ignored. The development
list is private and requests to join are rejected because there is no activity on it. The
goal of moving to Apache is to form a meritocracy where a variety of individuals, regardless
of their current employer, come together and work together. We understand that diversity,
open development, and open governance are critical to being a successful Apache project.

=== Community ===
The OpenSOC project is not responding to pull requests or making releases. The easiest solution
would be to create a variety of forks of the project on GitHub, but that would further fracture
the community and prevent it from reaching critical mass. Our preferred solution is to build
a single large, diverse, and open community at Apache.

=== Core Developers ===
The core developers of Metron are James Sirota, Charles Porter, and Mark Bittmann. None of
them have experience running an open source project, but they are eager to learn.

=== Alignment ===
The ASF is a natural host for Metron given that it is already the home of Hadoop, HBase, Hive,
Storm, Kafka, Spark and other emerging big data projects. Metron leverages many Apache
open-source products. We are very interested in a place to develop our community and integrations
with the other Apache big data projects.

== Known Risks ==

=== Orphaned Products ===

The current product developers are all salaried developers at a small number of companies
and thus there is a risk of becoming an orphaned product. However, the companies view Metron
as very important to their product offering and plan to ramp up their work in the space. The
project is unique in the product space and thus has strong potential to become a sustainable

=== Inexperience with Open Source ===
The vast majority of the developers are inexperienced with open source development and the
Apache Way. One of the major hurdles to graduation from the Apache Incubator will be demonstrating
that they have learned the Apache Way and are applying it to how the project is managed. Vinod
Kumar Vavilapalli is an Apache Member and plans on actively working as a committer in the
project. They also have the other mentors to help them learn as they progress.

=== Homogenous Developers ===
The developers are employed by four diverse companies (B23, Hortonworks, Mantech, and Rackspace).
They are distributed across the United States. We hope to attract additional diversity as
an Apache project.

=== Reliance on Salaried Developers ===
Metron is currently being developed exclusively by salaried developers, but the goal of coming
to Apache is to form a community of users and developers that is much more diverse including
non-salaried developers.

=== Relationships with Other Apache Products ===
Metron has a strong relationship with, and dependency on, Apache Flume, Hadoop, HBase, Hive, Kafka,
Spark, and Storm. Being part of Apache’s Incubation community could help with closer collaboration
among these projects as well as others.

We note that although there is a superficial resemblance to Apache Eagle, which does security
analysis of Hadoop audit events, the projects are significantly different. In particular,
Metron is focused on analyzing network packet traffic and thus has a very different scope
and scale of events than Eagle.

=== An Excessive Fascination with the Apache Brand ===

While the Apache brand is important, we are much more interested in finding a home for the
project that encourages open development and open governance. We want to form the new community
using the Apache Way with its strong focus on meritocracy, organizational independence, and
open development.

== Documentation ==
The current information on the OpenSOC project is here:
A slide deck presenting background material is here:

== Initial Source ==
The initial code is on github:

== External Dependencies ==
Metron has the following external dependencies:
  * Apache Flume
  * Apache Hadoop
  * Apache HBase
  * Apache Hive
  * Apache Kafka
  * Apache Spark
  * Apache Storm
  * ElasticSearch
  * MySQL

The project understands that it will need to support alternatives for MySQL that are licensed
under an ALv2-compatible license.

== Cryptography ==
Metron will eventually support encryption on the wire, but this is not one of the initial
goals, and we do not expect Metron to be a controlled export item due to the use of encryption.
Metron supports but does not require the Kerberos authentication mechanism to access secured
Hadoop services.

== Required Resources ==

=== Mailing List ===

  * metron-private for private PMC discussions
  * metron-dev for developers
  * metron-commits for all commits
  * metron-users for all users

=== Version Control ===
Git is the preferred source control system.

=== Issue Tracking ===


=== Other Resources ===
The existing code already has unit tests so we will make use of existing Apache continuous
testing infrastructure. The resulting load should not be very large.

== Initial Committers ==
  * Jim Baker < jim.baker at rackspace dot com >
  * Mark Bittmann < mark at b23 dot io >
  * Sheetal Dolas < sheetal at hortonworks dot com >
  * Discovery Gerdes < discovery.gerdes at rackspace dot com >
  * Andrew Hartnett < andrew.hartnett at rackspace dot com >
  * Dave Hirko < dave at b23 dot io >
  * Paul Kehrer < paul.kehrer at rackspace dot com >
  * Brad Kolarov < brad at b23 dot io >
  * Kiran Komaravolu < kkomaravolu at hortonworks dot com >
  * Ryan Merriman < rmerriman at hortonworks dot com >
  * Michael Perez < michael.perez at hortonworks dot com >
  * Charles Porter < Charles.Porter at mcs dot mantech dot com >
  * Sean Schulte < sean.schulte at rackspace dot com >
  * James Sirota < jsirota at hortonworks dot com >
  * Casey Stella < cstella at hortonworks dot com >
  * Bryan Taylor < bryan.taylor at rackspace dot com >
  * Ray Urciuoli < Ray.Urciuoli at mcs dot mantech dot com >
  * Vinod Kumar Vavilapalli < vinodkv at apache dot org >
  * George Vetticaden < gvetticaden at hortonworks dot com >
  * Oskar Zabik < oskar.zabik at rackspace dot com >

== Affiliations ==
The initial committers are employees of:
  * Jim Baker - Rackspace
  * Mark Bittmann - B23
  * Sheetal Dolas - Hortonworks
  * Discovery Gerdes - Rackspace
  * Andrew Hartnett - Rackspace
  * Dave Hirko - B23
  * Paul Kehrer - Rackspace
  * Brad Kolarov - B23
  * Kiran Komaravolu - Hortonworks
  * Ryan Merriman - Hortonworks
  * Michael Perez - Hortonworks
  * Charles Porter - Mantech
  * Sean Schulte - Rackspace
  * James Sirota - Hortonworks
  * Casey Stella - Hortonworks
  * Bryan Taylor - Rackspace
  * Ray Urciuoli - Mantech
  * Vinod Kumar Vavilapalli - Hortonworks
  * George Vetticaden - Hortonworks
  * Oskar Zabik - Rackspace

== Sponsors ==

=== Champion ===
  * Owen O’Malley - Apache IPMC member

=== Nominated Mentors ===
  * Chris Mattmann < mattmann at apache dot org > - Apache IPMC member, NASA
  * Owen O’Malley < omalley at apache dot org > - Apache IPMC member, Hortonworks
  * Billie Rinaldi < billie at apache dot org > - Apache IPMC member, Hortonworks
  * Vinod Kumar Vavilapalli < vinodkv at apache dot org > - Apache IPMC member, Hortonworks

=== Sponsoring Entity ===
We are requesting the Incubator to sponsor this project.
