From h..@apache.org
Subject [52/84] eagle git commit: Merge site source code from https://github.com/geteagle/eaglemonitoring.github.io
Date Mon, 03 Apr 2017 11:55:00 GMT
http://git-wip-us.apache.org/repos/asf/eagle/blob/0ecb7c1c/eagle-site/tutorial-topologymanagement.md
----------------------------------------------------------------------
diff --git a/eagle-site/tutorial-topologymanagement.md b/eagle-site/tutorial-topologymanagement.md
new file mode 100644
index 0000000..238ba3d
--- /dev/null
+++ b/eagle-site/tutorial-topologymanagement.md
@@ -0,0 +1,143 @@
+---
+layout: doc
+title:  "Topology Management"
+permalink: /docs/tutorial/topologymanagement.html
+---
+*Since Apache Eagle 0.4.0-incubating. Apache Eagle will be called Eagle in the following.*
+
+> Application manager aims to manage applications on the Eagle UI. Users can easily start/stop topologies remotely or locally without any shell commands. At the same time, it is capable of syncing the latest status of topologies on the execution platform (e.g., a Storm[^STORM] cluster).
+
+This tutorial will go through all parts of the application manager and then give an example of how to use it. 
+
+### Design
+The application manager consists of a daemon scheduler and an execution module. The scheduler periodically loads user operations (start/stop) from the database, and the execution module executes these operations, as sketched below. For more details, please refer to [here](https://cwiki.apache.org/confluence/display/EAG/Application+Management).
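+
+A minimal sketch of that poll-and-execute cycle is below; the names (`load_pending_operations`, `executor`) are hypothetical, not Eagle's actual classes:
+
+    # Illustrative sketch only -- not Eagle's implementation.
+    import time
+
+    POLL_INTERVAL_SECS = 1  # mirrors appCommandLoaderIntervalSecs (see below)
+
+    def load_pending_operations(db):
+        """Hypothetical query for start/stop commands persisted via the Eagle UI."""
+        return db.fetch_pending_operations()
+
+    def run_scheduler(db, executor):
+        while True:
+            # Execute each user-issued command, then sleep until the next poll
+            for op in load_pending_operations(db):
+                if op.action == "START":
+                    executor.start_topology(op.topology)
+                elif op.action == "STOP":
+                    executor.stop_topology(op.topology)
+            time.sleep(POLL_INTERVAL_SECS)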
+
+### Configurations
+The configuration file `eagle-scheduler.conf` defines scheduler parameters, execution platform settings, and part of the default topology configuration.
+
+* **Scheduler properties**
+
+    <style>
+        table, td, th {
+            border-collapse: collapse;
+            border: 1px solid gray;
+            padding: 10px;
+        }
+    </style>
+    
+    
+    Property Name | Default  | Description  
+    ------------- | :-------------:   | -----------  
+    appCommandLoaderEnabled | false | whether topology management is enabled  
+    appCommandLoaderIntervalSecs | 1  | interval (in seconds) at which the scheduler loads commands  
+    appHealthCheckIntervalSecs | 5  | interval (in seconds) of topology health checks, which sync topology execution status from the Storm cluster to Eagle  
+   
+
+
+* **Execution platform properties**
+
+    Property Name | Default  | Description  
+    ------------- | :-------------   | -----------  
+    envContextConfig.env | storm | execution environment; only Storm is supported  
+    envContextConfig.url | http://sandbox.hortonworks.com:8744 | Storm UI URL  
+    envContextConfig.nimbusHost | sandbox.hortonworks.com | Storm Nimbus host  
+    envContextConfig.nimbusThriftPort | 6627  | Storm Nimbus Thrift port  
+    envContextConfig.jarFile | TODO  | path to the Storm fat jar  
+
+* **Topology default properties**
+    
+    Some default topology properties are defined here. 
+   
+  
+### Playbook
+
+1. Edit `eagle-scheduler.conf`, and start the Eagle service
+
+        # enable application manager       
+        appCommandLoaderEnabled = true
+        
+        # provide jar path
+        envContextConfig.jarFile =
+        
+        # storm nimbus
+        envContextConfig.url = "http://sandbox.hortonworks.com:8744"
+        envContextConfig.nimbusHost = "sandbox.hortonworks.com"
+        
+        
+        
+   
+    For more configurations, please refer back to [Application Configuration](/docs/configuration.html). <br />
+    After the configuration is ready, start the Eagle service: `bin/eagle-service.sh start`. 
+   
+2. Go to the admin page 
+   ![admin-page](/images/appManager/admin-page.png)
+   ![topology-monitor](/images/appManager/topology-monitor.png)
+    
+3. Go to the management page, and create a topology description. There are three required fields:
+    * name: topology name
+    * type: topology type [CLASS, DYNAMIC]
+    * execution entry: either a class that implements the interface TopologyExecutable, or an Eagle [DSL](https://github.com/apache/eagle/blob/master/eagle-assembly/src/main/conf/sandbox-hadoopjmx-pipeline.conf)-based topology definition
+   ![topology-description](/images/appManager/topology-description.png)
+   
+4. Go back to the monitoring page, and choose the site/application to deploy the topology 
+   ![topology-execution](/images/appManager/topology-execution.png)
+   
+5. Go to the site page, and add topology configurations. 
+   
+   **NOTICE**: topology configurations defined here REQUIRE an extra prefix `app.`
+   
+   Below are some example configurations for [site=sandbox, application=hbaseSecurityLog]:
+   
+
+  
+        classification.hbase.zookeeper.property.clientPort=2181
+        classification.hbase.zookeeper.quorum=sandbox.hortonworks.com
+        
+        app.envContextConfig.env=storm
+        app.envContextConfig.mode=cluster
+        
+        app.dataSourceConfig.topic=sandbox_hbase_security_log
+        app.dataSourceConfig.zkConnection=sandbox.hortonworks.com:2181
+        app.dataSourceConfig.zkConnectionTimeoutMS=15000
+        app.dataSourceConfig.brokerZkPath=/brokers
+        app.dataSourceConfig.fetchSize=1048586
+        app.dataSourceConfig.transactionZKServers=sandbox.hortonworks.com
+        app.dataSourceConfig.transactionZKPort=2181
+        app.dataSourceConfig.transactionZKRoot=/consumers
+        app.dataSourceConfig.consumerGroupId=eagle.hbasesecurity.consumer
+        app.dataSourceConfig.transactionStateUpdateMS=2000
+        app.dataSourceConfig.deserializerClass=org.apache.eagle.security.hbase.parse.HbaseAuditLogKafkaDeserializer
+        
+        app.eagleProps.site=sandbox
+        app.eagleProps.application=hbaseSecurityLog
+        app.eagleProps.dataJoinPollIntervalSec=30
+        app.eagleProps.mailHost=some.mail.server
+        app.eagleProps.mailSmtpPort=25
+        app.eagleProps.mailDebug=true
+        app.eagleProps.eagleService.host=localhost
+        app.eagleProps.eagleService.port=9099
+        app.eagleProps.eagleService.username=admin
+        app.eagleProps.eagleService.password=secret
+    
+
+   ![topology-configuration-1](/images/appManager/topology-configuration-1.png)
+   ![topology-configuration-2](/images/appManager/topology-configuration-2.png)
+   
+6. Go to the monitoring page, and start topologies
+   ![start-topology-1](/images/appManager/start-topology-1.png)
+   ![start-topology-2](/images/appManager/start-topology-2.png)
+   
+7. Stop topologies on the monitoring page
+   ![stop-topology-1](/images/appManager/stop-topology-1.png)
+   ![stop-topology-2](/images/appManager/stop-topology-2.png)
+   ![stop-topology-3](/images/appManager/stop-topology-3.png)
+
+
+
+
+---
+
+#### *Footnotes*
+
+[^STORM]:*All mentions of "storm" on this page represent Apache Storm.*
+

http://git-wip-us.apache.org/repos/asf/eagle/blob/0ecb7c1c/eagle-site/tutorial-userprofile.md
----------------------------------------------------------------------
diff --git a/eagle-site/tutorial-userprofile.md b/eagle-site/tutorial-userprofile.md
new file mode 100644
index 0000000..e3553b2
--- /dev/null
+++ b/eagle-site/tutorial-userprofile.md
@@ -0,0 +1,65 @@
+---
+layout: doc
+title:  "User Profile Tutorial"
+permalink: /docs/tutorial/userprofile.html
+---
+This document introduces how to start online processing of user profiles. It assumes Apache Eagle has been installed and the [Eagle service](http://sandbox.hortonworks.com:9099/eagle-service) is started.
+
+### User Profile Offline Training
+
+* **Step 1**: Start Apache Spark if it is not already running
+![Start Spark](/images/docs/start-spark.png)
+
+* **Step 2**: Start the offline scheduler
+
+	* Option 1: command line
+
+	      $ cd <eagle-home>
+	      $ bin/eagle-userprofile-scheduler.sh --site sandbox start
+
+	* Option 2: start via Apache Ambari
+	![Click "ops"](/images/docs/offline-userprofile.png)
+
+* **Step 3**: generate a model
+
+	![Click "ops"](/images/docs/userProfile1.png)
+	![Click "Update Now"](/images/docs/userProfile2.png)
+	![Click "Confirm"](/images/docs/userProfile3.png)
+	![Check](/images/docs/userProfile4.png)
+
+### User Profile Online Detection
+
+Two options to start the topology are provided.
+
+* **Option 1**: command line
+
+	Submit the userProfiles topology if it is not already running on the [topology UI](http://sandbox.hortonworks.com:8744):
+
+      $ bin/eagle-topology.sh --main org.apache.eagle.security.userprofile.UserProfileDetectionMain --config conf/sandbox-userprofile-topology.conf start
+
+* **Option 2**: Apache Ambari
+	
+	![Online userProfiles](/images/docs/online-userprofile.png)
+
+### Evaluate User Profile in Sandbox
+
+1. Prepare sample data for ML training and validation
+* a. Download the following sample data to be used for training:
+	* [`user1.hdfs-audit.2015-10-11-00.txt`](/data/user1.hdfs-audit.2015-10-11-00.txt) 
+	* [`user1.hdfs-audit.2015-10-11-01.txt`](/data/user1.hdfs-audit.2015-10-11-01.txt)
+* b. Download the [`userprofile-validate.txt`](/data/userprofile-validate.txt) file, which contains data points that you can use to test the models
+
+2. Copy the files (downloaded in the previous step) into a location in the sandbox, for example: `/usr/hdp/current/eagle/lib/userprofile/data/`
+3. Modify `<Eagle-home>/conf/sandbox-userprofile-scheduler.conf`:
+update `training-audit-path` to point to the training data sample (the path you used in Step 1.a), and
+update `detection-audit-path` to point to the validation data (the path you used in Step 1.b), as in the example below
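+
+   A hypothetical snippet (the paths are illustrative; use the location from the previous step):
+
+		training-audit-path = "/usr/hdp/current/eagle/lib/userprofile/data/"
+		detection-audit-path = "/usr/hdp/current/eagle/lib/userprofile/data/userprofile-validate.txt"
+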
+4. Run the ML training program from the Eagle UI
+5. Produce Apache Kafka data using the contents of the validate file (Step 1.b).
+Run this command (assuming the Eagle configuration uses the Kafka topic `sandbox_hdfs_audit_log`):
+
+		./kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic sandbox_hdfs_audit_log
+
+6. Paste a few lines of data from the validate file into kafka-console-producer (or use the script below), then check [http://localhost:9099/eagle-service/#/dam/alertList](http://localhost:9099/eagle-service/#/dam/alertList) for generated alerts 
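+
+As an alternative to pasting lines manually, a short script can replay the whole validate file into the topic. This is only a sketch, assuming the `kafka-python` package is installed; adjust the broker, topic, and file path to your setup:
+
+		# Hypothetical helper -- replays the validate file into the Kafka topic.
+		from kafka import KafkaProducer
+
+		producer = KafkaProducer(bootstrap_servers="sandbox.hortonworks.com:6667")
+		with open("userprofile-validate.txt") as f:
+		    for line in f:
+		        producer.send("sandbox_hdfs_audit_log", line.strip().encode("utf-8"))
+		producer.flush()  # ensure all records are delivered before exiting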

http://git-wip-us.apache.org/repos/asf/eagle/blob/0ecb7c1c/eagle-site/usecases.md
----------------------------------------------------------------------
diff --git a/eagle-site/usecases.md b/eagle-site/usecases.md
new file mode 100644
index 0000000..4172ad8
--- /dev/null
+++ b/eagle-site/usecases.md
@@ -0,0 +1,48 @@
+---
+layout: doc
+title:  "Use Cases"
+permalink: /docs/usecases.html
+---
+
+### Data Activity Monitoring
+
+* Data activity represents how users explore the data provided by big data platforms. Analyzing data activity and alerting on insecure access are fundamental requirements for securing enterprise data. As data volume increases exponentially with Hadoop[^HADOOP], Hive[^HIVE], and Spark[^SPARK] technology, understanding data activities for every user becomes extremely hard, let alone alerting on a single malicious event in real time among petabytes of streaming data per day.
+
+* Securing enterprise data starts with understanding data activities for every user. Apache Eagle (called Eagle in the following) has integrated with many popular big data platforms, e.g. Hadoop, Hive, Spark, Cassandra[^CASSANDRA], etc. With Eagle, users can browse the data hierarchy, mark sensitive data, and then create comprehensive policies to alert on insecure data access.
+
+### Job Performance Analytics
+
+* Running map/reduce jobs is the most popular way to analyze data in a Hadoop system. Analyzing job performance and providing tuning suggestions are critical for Hadoop system stability, job SLAs, resource usage, etc. 
+
+* Eagle analyzes job performance with two complementary approaches. First, Eagle periodically takes snapshots of all running jobs through the YARN API; second, Eagle continuously reads job lifecycle events immediately after each job completes. With these two approaches, Eagle can analyze a single job's trend, data skew problems, failure reasons, etc. More interestingly, Eagle can analyze a whole Hadoop cluster's performance by taking all jobs into account.
+
+### Node Anomaly Detection
+
+* One of the practical benefits of analyzing map/reduce jobs is node anomaly detection. A big data platform like Hadoop may involve thousands of nodes supporting multi-tenant jobs. One bad node may not crash the whole cluster thanks to its fault-tolerant design, but it may affect specific jobs, cause a lot of rescheduling and job delay, and hurt the stability of the whole cluster.
+
+* Eagle provides an out-of-the-box algorithm that compares the task failure ratio of each node in a large cluster, as sketched below. If one node continually fails to run tasks, it may have underlying issues, e.g. one of its disks is full or failing. In a nutshell, if one node behaves very differently from all the other nodes within one large cluster, that node is anomalous and we should take action.
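+
+The following sketch illustrates the idea only; the exact statistical test and threshold are assumptions, since Eagle's algorithm is not spelled out here:
+
+    # Illustrative: flag nodes whose task failure ratio deviates far from the
+    # cluster-wide norm. Not Eagle's actual implementation.
+    import statistics
+
+    def anomalous_nodes(failure_ratio_by_node, z_threshold=3.0):
+        ratios = list(failure_ratio_by_node.values())
+        mean = statistics.mean(ratios)
+        stdev = statistics.pstdev(ratios) or 1e-9  # guard against zero spread
+        return [node for node, ratio in failure_ratio_by_node.items()
+                if (ratio - mean) / stdev > z_threshold]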
+
+### Cluster Performance Analytics
+
+* It is critical to understand why a cluster performs badly. Is it because of some badly behaved jobs recently onboarded, a huge number of tiny files, or degrading namenode performance?
+
+* Eagle calculates, in real time, the per-minute resource usage of individual jobs, e.g. CPU, memory, HDFS IO bytes, HDFS IO numOps, etc., and also collects namenode JMX metrics. Correlating them will easily help system administrators find the root cause of cluster slowness.
+
+### Cluster Resource Usage Trend
+
+* YARN manages resource allocation through queues in a large Hadoop cluster. Cluster resource usage is exactly reflected by overall queue usage.
+
+* Eagle collects queue statistics in real time and provides insight into cluster resource usage.
+
+
+
+---
+
+#### *Footnotes*
+
+[^HADOOP]:*All mentions of "hadoop" on this page represent Apache Hadoop.*
+[^HIVE]:*All mentions of "hive" on this page represent Apache Hive.*
+[^SPARK]:*All mentions of "spark" on this page represent Apache Spark.*
+[^CASSANDRA]:*All mentions of "cassandra" on this page represent Apache Cassandra.*
+
+

http://git-wip-us.apache.org/repos/asf/eagle/blob/0ecb7c1c/eagle-site/user-profile-ml.md
----------------------------------------------------------------------
diff --git a/eagle-site/user-profile-ml.md b/eagle-site/user-profile-ml.md
new file mode 100644
index 0000000..5c0a6b8
--- /dev/null
+++ b/eagle-site/user-profile-ml.md
@@ -0,0 +1,22 @@
+---
+layout: doc
+title:  "User Profile Machine Learning" 
+permalink: /docs/user-profile-ml.html
+---
+
+Apache Eagle (called Eagle in the following) provides capabilities to define user activity patterns, or user profiles, for Apache Hadoop users based on their behavior in the platform. The idea is to provide anomaly detection capability without setting hard thresholds in the system. The user profiles generated by our system are modeled using machine-learning algorithms and used for detection of anomalous user activities, where a user's activity pattern differs from their pattern history. Currently Eagle uses two algorithms for anomaly detection: Eigen-Value Decomposition and Density Estimation. The algorithms read data from HDFS audit logs, slice and dice the data, and generate models for each user in the system. Once models are generated, Eagle uses the Apache Storm framework for near-real-time anomaly detection to determine whether a user's current activities are suspicious with respect to their model. The block diagram below shows the current pipeline for user profile training and online detection.
+
+![](/images/docs/userprofile-arch.png)
+
+Eagle online anomaly detection uses the Eagle policy framework, and the user profile is defined as one of the policies in the system. The user profile policy is evaluated by a machine-learning evaluator extended from the Eagle policy evaluator. The policy definition includes the features that are needed for anomaly detection (the same ones used for training purposes).
+
+A scheduler runs an Apache Spark based offline training program (to generate user profiles or models) at a configurable time interval; currently, the training program generates new models once every month.
+
+The following are some details on the algorithms.
+
+* **Density Estimation**: In this algorithm, the idea is to evaluate, for each user, a probability density function from the observed training data sample. We mean-normalize the training dataset for each feature; normalization puts all features on the same scale. We use a Gaussian distribution function as the method for computing probability density, and we assume features are conditionally independent of one another, so the final Gaussian probability density can be computed by factorizing each feature's probability density. During the online detection phase, we compute the probability of a user's activity. If the probability of the user performing the activity is below a threshold (determined by the training program, using a method called the Matthews Correlation Coefficient), we signal anomaly alerts.
+* **Eigen-Value Decomposition**: Our goal in user profile generation is to find interesting behavioral patterns for users. One way to achieve that goal is to consider a combination of features and see how each one influences the others. When the data volume is large, which is generally the case for us, abnormal patterns among features may go unnoticed due to the huge number of normal patterns. Since normal behavioral patterns tend to lie within a very low-dimensional subspace, we can reduce the dimensionality of the dataset to better understand the user behavior pattern; this also reduces any noise in the training dataset. Based on the amount of variance of the data we maintain for a user, usually 95% in our case, we seek the number of principal components k that represents 95% of the variance. We consider the first k principal components the normal subspace for the user; the remaining (n-k) principal components are considered the abnormal subspace.
+
+During online anomaly detection, if the user behavior lies near the normal subspace, we consider the behavior normal. On the other hand, if the user behavior lies near the abnormal subspace, we raise an alarm, as we believe usual user behavior should generally fall within the normal subspace. We use the Euclidean distance method to compute whether a user's current activity is nearer the normal or the abnormal subspace. A simplified sketch of both detection methods follows.
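+
+Below is a simplified, illustrative sketch of the two methods using numpy. It is not Eagle's implementation (which runs on Apache Spark and Storm); the thresholds are placeholders, and only the 95% variance level comes from the description above:
+
+    import numpy as np
+
+    def density_score(X_train, x):
+        """Factorized Gaussian density; features assumed conditionally independent."""
+        mu = X_train.mean(axis=0)
+        sigma2 = X_train.var(axis=0) + 1e-9   # guard against zero variance
+        p = np.exp(-((x - mu) ** 2) / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
+        return p.prod()  # flag anomaly if below the trained threshold (e.g. via MCC)
+
+    def normal_subspace(X_train, variance_kept=0.95):
+        """First k principal components capturing 95% of the variance."""
+        X = X_train - X_train.mean(axis=0)
+        _, s, Vt = np.linalg.svd(X, full_matrices=False)
+        var_ratio = (s ** 2) / (s ** 2).sum()
+        k = int(np.searchsorted(np.cumsum(var_ratio), variance_kept)) + 1
+        return Vt[:k]  # rows span the normal subspace; the rest is abnormal
+
+    def is_anomalous(X_train, x, Vt_normal, threshold):
+        """Euclidean distance of x from the normal subspace; large => anomaly."""
+        centered = x - X_train.mean(axis=0)
+        reconstruction = Vt_normal.T @ (Vt_normal @ centered)
+        return np.linalg.norm(centered - reconstruction) > threshold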
+
+![](/images/docs/userprofile-model.png)

