incubator-cvs mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Incubator Wiki] Update of "EagleProposal" by ArunManoharan
Date Mon, 19 Oct 2015 07:02:48 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change notification.

The "EagleProposal" page has been changed by ArunManoharan:
https://wiki.apache.org/incubator/EagleProposal?action=diff&rev1=6&rev2=7

  
  === Eagle Architecture ===
  
- === Data Collection and Storage: ===
+ ==== Data Collection and Storage: ====
  
  Eagle provides a programming API for integrating any data source into the Eagle policy
evaluation framework. For example, Eagle HDFS audit monitoring collects data from Kafka,
which is populated by a NameNode log4j appender or by a Logstash agent. Eagle Hive monitoring
collects Hive query logs from running jobs through the YARN API; this collection is designed
to be scalable and fault-tolerant.
  Eagle uses HBase to store metadata and metrics data, and also supports relational
databases through a configuration change.
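
The idea of plugging an arbitrary data source into policy evaluation can be sketched as below. This is only an illustration in Python, not Eagle's API: `DataSource`, `InMemoryAuditSource`, `collect`, and the `key=value` record layout are all hypothetical names and assumptions, and the in-memory list stands in for a Kafka topic.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Iterator, List


class DataSource(ABC):
    """Hypothetical extension point: anything that can yield event records."""

    @abstractmethod
    def events(self) -> Iterator[Dict[str, str]]:
        ...


class InMemoryAuditSource(DataSource):
    """Stand-in for a Kafka-backed HDFS audit source; reads from a list."""

    def __init__(self, lines: Iterable[str]) -> None:
        self._lines = list(lines)

    def events(self) -> Iterator[Dict[str, str]]:
        for line in self._lines:
            # Assumed layout: space-separated key=value pairs.
            yield dict(pair.split("=", 1) for pair in line.split())


def collect(sources: Iterable[DataSource]) -> List[Dict[str, str]]:
    """Drain every registered source into one list of events."""
    return [event for source in sources for event in source.events()]
```

A new source would subclass `DataSource` and be passed to `collect` alongside the existing ones, so the evaluation side never needs to know where the events came from.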
  
- === Data Processing and Policy Engine: ===
+ ==== Data Processing and Policy Engine: ====
  
  '''Processing Engine:''' Eagle provides a stream processing API that is an abstraction over
Apache Storm and can be extended to other streaming engines. This abstraction lets
developers assemble data transformations, filters, external data joins, etc. without being
bound to a specific streaming platform. The Eagle streaming API lets developers integrate
business logic with the Eagle policy engine; internally, the Eagle framework compiles the
business logic execution DAG into the program primitives of the underlying stream
infrastructure, e.g. Apache Storm. For example, Eagle HDFS monitoring transforms each audit
log entry from the NameNode into an object and joins it with sensitivity metadata and security
zone metadata, which are generated by external programs or configured by the user. Eagle Hive
monitoring filters running jobs to get the Hive query string, parses the query string into an
object, and then joins it with sensitivity metadata.
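
The transform, join, and filter steps described above can be sketched as a small platform-independent pipeline. This is a minimal Python illustration under stated assumptions, not Eagle's streaming API: `Pipeline`, `AuditEvent`, `parse`, `join_sensitivity`, and the `key=value` log layout are hypothetical, and the steps run in-process rather than being compiled to Storm primitives.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional, Tuple


@dataclass
class AuditEvent:
    user: str
    cmd: str
    path: str
    sensitivity: Optional[str] = None


def parse(line: str) -> AuditEvent:
    """Transform a raw audit line (assumed key=value pairs) into an object."""
    fields = dict(pair.split("=", 1) for pair in line.split())
    return AuditEvent(user=fields["user"], cmd=fields["cmd"], path=fields["src"])


def join_sensitivity(metadata: Dict[str, str]) -> Callable[[AuditEvent], AuditEvent]:
    """Build a step that attaches the sensitivity label for the event's path."""
    def step(event: AuditEvent) -> AuditEvent:
        event.sensitivity = metadata.get(event.path)
        return event
    return step


class Pipeline:
    """Hypothetical engine-neutral chain of map and filter steps."""

    def __init__(self) -> None:
        self._steps: List[Tuple[str, Callable[[Any], Any]]] = []

    def map(self, fn: Callable[[Any], Any]) -> "Pipeline":
        self._steps.append(("map", fn))
        return self

    def filter(self, pred: Callable[[Any], bool]) -> "Pipeline":
        self._steps.append(("filter", pred))
        return self

    def run(self, records: Iterable[Any]) -> Iterator[Any]:
        for record in records:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    record = fn(record)
                elif not fn(record):
                    keep = False  # a filter rejected this record
                    break
            if keep:
                yield record
```

A pipeline such as `Pipeline().map(parse).map(join_sensitivity(labels)).filter(lambda e: e.sensitivity is not None)` keeps only events that touch labeled paths. Because the steps are plain callables, a different backend could in principle translate the same chain into its own primitives.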
  
@@ -49, +49 @@

  
  We at eBay want to make sure that sensitive data and data platforms are completely protected
from security breaches, so we partnered closely with our Information Security team to
understand the requirements for Eagle to monitor sensitive data access on Hadoop:
  
- * Ability to identify and stop security threats in real time
+  1. Ability to identify and stop security threats in real time
- * Scale for big data (Support PB scale and Billions of events)
+  2. Scale for big data (Support PB scale and Billions of events)
- * Ability to create data access policies 
+  3. Ability to create data access policies 
- * Support multiple data sources like HDFS, HBase, Hive
+  4. Support multiple data sources like HDFS, HBase, Hive
- * Visualize alerts in real time
+  5. Visualize alerts in real time
- * Ability to block malicious access in real time
+  6. Ability to block malicious access in real time
  
  We did not find any data access monitoring solution available today that provides
the features and functionality we need to monitor data access in the Hadoop ecosystem
at our scale. Hence, with an excellent team of world-class developers and several users, we
have been able to bring Eagle into production as well as open source it.
  


