metron-commits mailing list archives

From ceste...@apache.org
Subject svn commit: r20299 [5/18] - in /release/metron/0.4.0: ./ site-book/ site-book/css/ site-book/images/ site-book/images/logos/ site-book/images/profiles/ site-book/img/ site-book/js/ site-book/metron-analytics/ site-book/metron-analytics/metron-maas-serv...
Date Wed, 05 Jul 2017 06:56:42 GMT
Added: release/metron/0.4.0/site-book/metron-analytics/metron-profiler-client/index.html
==============================================================================
--- release/metron/0.4.0/site-book/metron-analytics/metron-profiler-client/index.html (added)
+++ release/metron/0.4.0/site-book/metron-analytics/metron-profiler-client/index.html Wed Jul  5 06:56:42 2017
@@ -0,0 +1,800 @@
+<!DOCTYPE html>
+<!--
+ | Generated by Apache Maven Doxia at 2017-06-27
+ | Rendered using Apache Maven Fluido Skin 1.3.0
+-->
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <meta name="Date-Revision-yyyymmdd" content="20170627" />
+    <meta http-equiv="Content-Language" content="en" />
+    <title>Metron &#x2013; Metron Profiler Client</title>
+    <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" />
+    <link rel="stylesheet" href="../../css/site.css" />
+    <link rel="stylesheet" href="../../css/print.css" media="print" />
+
+      
+    <script type="text/javascript" src="../../js/apache-maven-fluido-1.3.0.min.js"></script>
+
+                          
+        
+<script type="text/javascript">$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );</script>
+          
+            </head>
+        <body class="topBarDisabled">
+          
+                
+                    
+    
+        <div class="container-fluid">
+          <div id="banner">
+        <div class="pull-left">
+                                    <a href="http://metron.apache.org/" id="bannerLeft">
+                                                                                                <img src="../../images/metron-logo.png"  alt="Apache Metron" width="148px" height="48px"/>
+                </a>
+                      </div>
+        <div class="pull-right">  </div>
+        <div class="clear"><hr/></div>
+      </div>
+
+      <div id="breadcrumbs">
+        <ul class="breadcrumb">
+                
+                    
+                              <li class="">
+                    <a href="http://www.apache.org" class="externalLink" title="Apache">
+        Apache</a>
+        </li>
+      <li class="divider ">/</li>
+            <li class="">
+                    <a href="http://metron.apache.org/" class="externalLink" title="Metron">
+        Metron</a>
+        </li>
+      <li class="divider ">/</li>
+            <li class="">
+                    <a href="../../index.html" title="Documentation">
+        Documentation</a>
+        </li>
+      <li class="divider ">/</li>
+        <li class="">Metron Profiler Client</li>
+        
+                
+                    
+                  <li id="publishDate" class="pull-right">Last Published: 2017-06-27</li> <li class="divider pull-right">|</li>
+              <li id="projectVersion" class="pull-right">Version: 0.4.0</li>
+            
+                            </ul>
+      </div>
+
+            
+      <div class="row-fluid">
+        <div id="leftColumn" class="span3">
+          <div class="well sidebar-nav">
+                
+                    
+                <ul class="nav nav-list">
+                    <li class="nav-header">User Documentation</li>
+                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
+      <li>
+    
+                          <a href="../../index.html" title="Metron">
+          <i class="icon-chevron-down"></i>
+        Metron</a>
+                    <ul class="nav nav-list">
+                      
+      <li>
+    
+                          <a href="../../Upgrading.html" title="Upgrading">
+          <i class="none"></i>
+        Upgrading</a>
+            </li>
+                                                                                                                                                                
+      <li>
+    
+                          <a href="../../metron-analytics/index.html" title="Analytics">
+          <i class="icon-chevron-down"></i>
+        Analytics</a>
+                    <ul class="nav nav-list">
+                      
+      <li>
+    
+                          <a href="../../metron-analytics/metron-maas-service/index.html" title="Maas-service">
+          <i class="none"></i>
+        Maas-service</a>
+            </li>
+                      
+      <li>
+    
+                          <a href="../../metron-analytics/metron-profiler/index.html" title="Profiler">
+          <i class="none"></i>
+        Profiler</a>
+            </li>
+                      
+      <li class="active">
+    
+            <a href="#"><i class="none"></i>Profiler-client</a>
+          </li>
+                                                                        
+      <li>
+    
+                          <a href="../../metron-analytics/metron-statistics/index.html" title="Statistics">
+          <i class="icon-chevron-right"></i>
+        Statistics</a>
+                  </li>
+              </ul>
+        </li>
+                                                                                                                                                                                                                                                                                                                                                                                    
+      <li>
+    
+                          <a href="../../metron-deployment/index.html" title="Deployment">
+          <i class="icon-chevron-right"></i>
+        Deployment</a>
+                  </li>
+                      
+      <li>
+    
+                          <a href="../../metron-docker/index.html" title="Docker">
+          <i class="none"></i>
+        Docker</a>
+            </li>
+                      
+      <li>
+    
+                          <a href="../../metron-interface/metron-config/index.html" title="Config">
+          <i class="none"></i>
+        Config</a>
+            </li>
+                      
+      <li>
+    
+                          <a href="../../metron-interface/metron-rest/index.html" title="Rest">
+          <i class="none"></i>
+        Rest</a>
+            </li>
+                                                                                                                                                                                                                                                
+      <li>
+    
+                          <a href="../../metron-platform/index.html" title="Platform">
+          <i class="icon-chevron-right"></i>
+        Platform</a>
+                  </li>
+                                                                                                            
+      <li>
+    
+                          <a href="../../metron-sensors/index.html" title="Sensors">
+          <i class="icon-chevron-right"></i>
+        Sensors</a>
+                  </li>
+              </ul>
+        </li>
+            </ul>
+                
+                    
+                
+          <hr class="divider" />
+
+           <div id="poweredBy">
+                            <div class="clear"></div>
+                            <div class="clear"></div>
+                            <div class="clear"></div>
+                             <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
+        <img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" />
+      </a>
+                  </div>
+          </div>
+        </div>
+        
+                
+        <div id="bodyColumn"  class="span9" >
+                                  
+            <h1>Metron Profiler Client</h1>
+<p><a name="Metron_Profiler_Client"></a></p>
+<p>This project provides a client API for accessing the profiles generated by the <a href="../metron-profiler/index.html">Metron Profiler</a>. This includes both a Java API and Stellar API for accessing the profile data. The primary use case is to extract profile data for use during model scoring.</p>
+<div class="section">
+<h2><a name="Stellar_Client_API"></a>Stellar Client API</h2>
+<div class="section">
+<h3><a name="PROFILE_GET"></a><tt>PROFILE_GET</tt></h3>
+<p>The <tt>PROFILE_GET</tt> command allows you to select all of the profile measurements written. This command takes the following arguments:</p>
+
+<div class="source">
+<div class="source">
+<pre>REQUIRED:
+    profile - The name of the profile
+    entity - The name of the entity
+    periods - The list of profile periods to grab.  These are ProfilePeriod objects.
+OPTIONAL:
+    groups_list - Optional, must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of
+            groupBy values used to filter the profile. Default is the empty list, meaning groupBy was not used when 
+            creating the profile.
+    config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
+</pre></div></div>
+<p>There is an older calling format where <tt>groups_list</tt> is specified as a sequence of group names, &#x201c;varargs&#x201d; style, instead of a List object. This format is still supported for backward compatibility, but it is deprecated, and it is disallowed if the optional <tt>config_overrides</tt> argument is used.</p>
+<p>The <tt>periods</tt> field is (likely) the output of another Stellar function which defines the times to include.</p>
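+<p>As an illustrative sketch of the two calling formats (the profile name, entity name, and group values are placeholders, not real data), the first call below passes <tt>groups_list</tt> as a List and the second uses the deprecated &#x201c;varargs&#x201d; style:</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('profile', 'entity', PROFILE_FIXED(5, 'HOURS'), ['3', '8080'])
+PROFILE_GET('profile', 'entity', PROFILE_FIXED(5, 'HOURS'), '3', '8080')
+</pre></div></div>
+<p>Both calls assume a profile that was created with two <tt>groupBy</tt> criteria. Only the List form can be combined with <tt>config_overrides</tt>.</p>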
+<div class="section">
+<h4><a name="Groups_list_argument"></a>Groups_list argument</h4>
+<p>The <tt>groups_list</tt> argument in the client must exactly correspond to the <a href="../metron-profiler/index.html#groupBy"><tt>groupBy</tt></a> configuration in the profile definition. If <tt>groupBy</tt> was not used in the profile, <tt>groups_list</tt> must be empty in the client. If <tt>groupBy</tt> was used in the profile, then the client <tt>groups_list</tt> is <b>not</b> optional; it must be the same length as the <tt>groupBy</tt> list, and specify exactly one selected group value for each <tt>groupBy</tt> criterion, in the same order. For example:</p>
+
+<div class="source">
+<div class="source">
+<pre>If in Profile, the groupBy criteria are:  [ &#x201c;DAY_OF_WEEK()&#x201d;, &#x201c;URL_TO_PORT()&#x201d; ]
+Then in PROFILE_GET, an allowed groups value would be:  [ &#x201c;3&#x201d;, &#x201c;8080&#x201d; ]
+which will select only records from Tuesdays with port number 8080.
+</pre></div></div></div>
+<div class="section">
+<h4><a name="Configuration_and_the_config_overrides_argument"></a>Configuration and the config_overrides argument</h4>
+<p>By default, the Profiler creates profiles with a period duration of 15 minutes. This means that data is accumulated, summarized and flushed every 15 minutes. The Client API must also have knowledge of this duration to correctly retrieve the profile data. If the Client is expecting 15 minute periods, it will not be able to read data generated by a Profiler that was configured for 1 hour periods, and will return zero results. </p>
+<p>Similarly, all six Client configuration parameters listed in the table below must match the Profiler configuration parameter settings from the time the profile was created. The period duration and other configuration parameters from the Profiler topology are stored on the local filesystem at <tt>$METRON_HOME/config/profiler.properties</tt>. The Stellar Client API can be configured correspondingly by setting the following properties in Metron&#x2019;s global configuration, on the local filesystem at <tt>$METRON_HOME/config/zookeeper/global.json</tt>, and then uploading it to Zookeeper (at <tt>/metron/topology/global</tt>) using <tt>zk_load_configs.sh</tt>: </p>
+
+<div class="source">
+<div class="source">
+<pre>    $ cd $METRON_HOME
+    $ bin/zk_load_configs.sh -m PUSH -i config/zookeeper/ -z node1:2181
+</pre></div></div>
+<p>Any of these six Client configuration parameters may be overridden at run time using the <tt>config_overrides</tt> Map argument in PROFILE_GET. The primary use case is when historical profiles have been created with a different Profiler configuration than is currently configured, and the analyst needing to access them does not want to change the global Client configuration so as not to disrupt the work of other analysts working with current profiles.</p>
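+<p>A minimal sketch of such an override, assuming the historical profile was written with 1 hour periods. The profile and entity names are placeholders, and passing the duration as a quoted string is an assumption rather than something this page specifies. The same two period settings are also given to <tt>PROFILE_FIXED</tt>, which accepts its own <tt>config_overrides</tt> argument, and the empty list indicates that <tt>groupBy</tt> was not used. The expression is split across lines here only for readability:</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('profile', 'entity',
+    PROFILE_FIXED(5, 'HOURS', {'profiler.client.period.duration' : '1', 'profiler.client.period.duration.units' : 'HOURS'}),
+    [],
+    {'profiler.client.period.duration' : '1', 'profiler.client.period.duration.units' : 'HOURS'})
+</pre></div></div>
+<p>Because the override is scoped to this one call, the global Client configuration in Zookeeper is left untouched for other analysts.</p>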
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>Key </th>
+      
+<th>Description </th>
+      
+<th>Required </th>
+      
+<th>Default </th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td>profiler.client.period.duration </td>
+      
+<td>The duration of each profile period. This value should be defined along with <tt>profiler.client.period.duration.units</tt>. </td>
+      
+<td>Optional </td>
+      
+<td>15 </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>profiler.client.period.duration.units </td>
+      
+<td>The units used to specify the profile period duration. This value should be defined along with <tt>profiler.client.period.duration</tt>. </td>
+      
+<td>Optional </td>
+      
+<td>MINUTES </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>profiler.client.hbase.table </td>
+      
+<td>The name of the HBase table used to store profile data. </td>
+      
+<td>Optional </td>
+      
+<td>profiler </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>profiler.client.hbase.column.family </td>
+      
+<td>The name of the HBase column family used to store profile data. </td>
+      
+<td>Optional </td>
+      
+<td>P </td>
+    </tr>
+    
+<tr class="b">
+      
+<td>profiler.client.salt.divisor </td>
+      
+<td>The salt divisor used to store profile data. </td>
+      
+<td>Optional </td>
+      
+<td>1000 </td>
+    </tr>
+    
+<tr class="a">
+      
+<td>hbase.provider.impl </td>
+      
+<td>The name of the HBaseTableProvider implementation class. </td>
+      
+<td>Optional </td>
+      
+<td> </td>
+    </tr>
+  </tbody>
+</table></div></div>
+<div class="section">
+<h3><a name="Profile_Selectors"></a>Profile Selectors</h3>
+<p>You will notice that the third argument for <tt>PROFILE_GET</tt> is a list of <tt>ProfilePeriod</tt> objects. This list is expected to be produced by another Stellar function. Two such functions are available.</p>
+<div class="section">
+<h4><a name="PROFILE_FIXED"></a><tt>PROFILE_FIXED</tt></h4>
+<p><tt>PROFILE_FIXED</tt> returns the profile periods associated with a fixed lookback starting from now, as a list of <tt>ProfilePeriod</tt> objects.</p>
+
+<div class="source">
+<div class="source">
+<pre>REQUIRED:
+    durationAgo - How long ago should values be retrieved from?
+    units - The units of 'durationAgo'.
+OPTIONAL:
+    config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
+
+e.g. To retrieve all the profile measurements for the last 5 hours:  PROFILE_GET('profile', 'entity', PROFILE_FIXED(5, 'HOURS'))
+</pre></div></div>
+<p>Note that the <tt>config_overrides</tt> parameter operates exactly as the <tt>config_overrides</tt> argument in <tt>PROFILE_GET</tt>. The only available parameters for override are:</p>
+
+<ul>
+  
+<li><tt>profiler.client.period.duration</tt></li>
+  
+<li><tt>profiler.client.period.duration.units</tt></li>
+</ul></div>
+<div class="section">
+<h4><a name="PROFILE_WINDOW"></a><tt>PROFILE_WINDOW</tt></h4>
+<p><tt>PROFILE_WINDOW</tt> is intended to provide a finer level of control over selecting windows for profiles:</p>
+
+<ul>
+  
+<li>Specify windows relative to the data timestamp (see the optional <tt>now</tt> parameter below)</li>
+  
+<li>Specify non-contiguous windows to better handle seasonal data (e.g. the last hour for every day for the last month)</li>
+  
+<li>Specify profile output excluding holidays</li>
+  
+<li>Specify only profile output on a specific day of the week</li>
+</ul>
+<p>It does this through a domain-specific language, mimicking natural language, that defines which windows are included and which are excluded.</p>
+
+<div class="source">
+<div class="source">
+<pre>REQUIRED:
+    windowSelector - The statement specifying the window to select.
+OPTIONAL:
+    now - Optional - The timestamp to use for now.
+    config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
+
+e.g. To retrieve all the measurements written for 'profile' and 'entity' for the last hour 
+on the same weekday excluding weekends and US holidays across the last 14 days: 
+PROFILE_GET('profile', 'entity', PROFILE_WINDOW('1 hour window every 24 hours starting from 14 days ago including the current day of the week excluding weekends, holidays:us'))
+</pre></div></div>
+<p>Note that the <tt>config_overrides</tt> parameter operates exactly as the <tt>config_overrides</tt> argument in <tt>PROFILE_GET</tt>. The only available parameters for override are:</p>
+
+<ul>
+  
+<li><tt>profiler.client.period.duration</tt></li>
+  
+<li><tt>profiler.client.period.duration.units</tt></li>
+</ul>
+<div class="section">
+<h5><a name="The_Profile_Selector_Language"></a>The Profile Selector Language</h5>
+<p>The domain specific language can be broken into a series of clauses, some optional:</p>
+
+<ul>
+  
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">Total Temporal Duration</span></a> - The total range of time in which windows may be specified</li>
+  
+<li><a href="#Temporal_Window_Width"><span style="color:red">Temporal Window Width</span></a> - How large each temporal window is</li>
+  
+<li><a href="#Skip_distance"><span style="color:green">Skip distance</span></a> (optional) - How far to skip between when one window starts and when the next begins</li>
+  
+<li><a href="#InclusionExclusion_specifiers"><span style="color:purple">Inclusion/Exclusion specifiers</span></a> (optional) - The set of specifiers to further filter the window</li>
+</ul>
+<p>One <i>must</i> specify either a total temporal duration or a temporal window width. The remaining clauses are optional. During the course of the following discussion, we will color code the clauses in the examples and link them to the relevant section for more detail.</p>
+<p>From a high level, the language fits the following three forms, which are composed of the clauses above:</p>
+
+<ul>
+  
+<li><a href="#Temporal_Window_Width"><span style="color:red">time_interval WINDOW?</span></a> <a href="#InclusionExclusion_specifiers"><span style="color:purple">(INCLUDING specifier_list)? (EXCLUDING specifier_list)?</span></a></li>
+  
+<li><a href="#Temporal_Window_Width"><span style="color:red">time_interval WINDOW?</span></a> <a href="#Skip_distance"><span style="color:green">EVERY time_interval</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">FROM time_interval (TO time_interval)?</span></a> <a href="#InclusionExclusion_specifiers"><span style="color:purple">(INCLUDING specifier_list)? (EXCLUDING specifier_list)?</span></a></li>
+  
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">FROM time_interval (TO time_interval)?</span></a></li>
+</ul>
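+<p>As a sketch, each of the three forms can be passed to <tt>PROFILE_WINDOW</tt> as shown below, one call per form and in the same order. The selector phrases are assembled from the clause examples on this page, and the profile and entity names are placeholders:</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('profile', 'entity', PROFILE_WINDOW('2 hours window excluding weekends'))
+PROFILE_GET('profile', 'entity', PROFILE_WINDOW('30 minute window every 1 hour from 2 hours ago to 30 minutes ago excluding weekends'))
+PROFILE_GET('profile', 'entity', PROFILE_WINDOW('from 1 hour ago until 30 minutes ago'))
+</pre></div></div>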
+<div class="section">
+<h6><a name="Total_Temporal_Duration"></a><span style="color:blue">Total Temporal Duration</span></h6>
+<p>Total temporal duration is specified by a phrase: <tt>FROM time_interval AGO TO time_interval AGO</tt>. This indicates the beginning and ending of a time interval. This is an inclusive duration.</p>
+
+<ul>
+  
+<li><tt>FROM</tt> - Can be the words &#x201c;from&#x201d; or &#x201c;starting from&#x201d;</li>
+  
+<li><tt>time_interval</tt> - A time amount followed by a unit (e.g. 1 hour). Fractional amounts are not supported. The unit may be &#x201c;minute&#x201d;, &#x201c;day&#x201d;, &#x201c;hour&#x201d; with any pluralization.</li>
+  
+<li><tt>TO</tt> - Can be the words &#x201c;until&#x201d; or &#x201c;to&#x201d;</li>
+  
+<li><tt>AGO</tt> - Optionally the word &#x201c;ago&#x201d;</li>
+</ul>
+<p>The <tt>TO time_interval AGO</tt> portion is optional. If unspecified then it is expected that the time interval ends now.</p>
+<p>Due to the vagaries of the English language, the from and the to portions, if both are specified, are interchangeable with regard to which one specifies the start and which specifies the end.</p>
+<p>In other words &#x201c;<a href="#Total_Temporal_Duration"><span style="color:blue">starting from 1 hour ago to 30 minutes ago</span></a>&#x201d; and &#x201c;<a href="#Total_Temporal_Duration"><span style="color:blue">starting from 30 minutes ago to 1 hour ago</span></a>&#x201d; specify the same temporal duration.</p>
+<p><b>Examples</b></p>
+
+<ul>
+  
+<li>A duration starting 1 hour ago and ending now
+  
+<ul>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">from 1 hour ago</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">from 1 hour</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">starting from 1 hour ago</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">starting from 1 hour</span></a></li>
+  </ul></li>
+  
+<li>A duration starting 1 hour ago and ending 30 minutes ago:
+  
+<ul>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">from 1 hour ago until 30 minutes ago</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">from 30 minutes ago until 1 hour ago</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">starting from 1 hour ago to 30 minutes ago</span></a></li>
+    
+<li><a href="#Total_Temporal_Duration"><span style="color:blue">starting from 1 hour to 30 minutes</span></a></li>
+  </ul></li>
+</ul></div>
+<div class="section">
+<h6><a name="Temporal_Window_Width"></a><span style="color:red">Temporal Window Width</span></h6>
+<p>Temporal window width is the specification of a window. A window may either repeat within the total temporal duration or fill the total temporal duration. This is an inclusive window. A window is specified by the phrase: <tt>time_interval WINDOW</tt></p>
+
+<ul>
+  
+<li><tt>time_interval</tt> - A time amount followed by a unit (e.g. 1 hour). Fractional amounts are not supported. The unit may be &#x201c;minute&#x201d;, &#x201c;day&#x201d;, &#x201c;hour&#x201d; with any pluralization.</li>
+  
+<li><tt>WINDOW</tt> - Optionally the word &#x201c;window&#x201d;</li>
+</ul>
+<p><b>Examples</b></p>
+
+<ul>
+  
+<li>A fixed window starting 2 hours ago and going until now
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">2 hour</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">2 hours</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">2 hours window</span></a></li>
+  </ul></li>
+  
+<li>A repeating 30 minute window starting 2 hours ago and repeating every hour until now. This would result in 2 30-minute wide windows: 2 hours ago and 1 hour ago
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 hour</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">starting from 2 hours ago</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute windows</span></a> <a href="#Skip_distance"><span style="color:green">every 1 hour</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 2 hours ago</span></a></li>
+  </ul></li>
+  
+<li>A repeating 30 minute window starting 2 hours ago and repeating every hour until 30 minutes ago. This would result in 2 30-minute wide windows: 2 hours ago and 1 hour ago
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 hour</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">starting from 2 hours ago until 30 minutes ago</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minutes window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 hour</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 2 hours ago to 30 minutes ago</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minutes window</span></a> <a href="#Skip_distance"><span style="color:green">for every 1 hour</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 30 minutes ago to 2 hours ago</span></a></li>
+  </ul></li>
+</ul></div>
+<div class="section">
+<h6><a name="Skip_distance"></a><span style="color:green">Skip distance</span></h6>
+<p>Skip distance is the amount of time between the start of one temporal window and the start of the next. It is, in effect, the window period.</p>
+<p>It is specified by the phrase <tt>EVERY time_interval</tt></p>
+
+<ul>
+  
+<li><tt>time_interval</tt> - A time amount followed by a unit (e.g. 1 hour). Fractional amounts are not supported. The unit may be &#x201c;minute&#x201d;, &#x201c;day&#x201d;, &#x201c;hour&#x201d; with any pluralization.</li>
+  
+<li><tt>EVERY</tt> - The word/phrase &#x201c;every&#x201d; or &#x201c;for every&#x201d;</li>
+</ul>
+<p><b>Examples</b></p>
+
+<ul>
+  
+<li>A repeating 30 minute window starting 2 hours ago and repeating every hour until now. This would result in 2 30-minute wide windows: 2 hours ago and 1 hour ago
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 hour</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">starting from 2 hours ago </span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minutes window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 hour</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 2 hours ago </span></a></li>
+  </ul></li>
+  
+<li>A repeating 30 minute window starting 2 hours ago and repeating every hour until 30 minutes ago. This would result in 2 30-minute wide windows: 2 hours ago and 1 hour ago
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 hour</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">starting from 2 hours ago until 30 minutes ago</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minutes window</span></a> <a href="#Skip_distance"><span style="color:green">every 1 hour</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 2 hours ago to 30 minutes ago</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minutes window</span></a> <a href="#Skip_distance"><span style="color:green">for every 1 hour</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 30 minutes ago to 2 hours ago</span></a></li>
+  </ul></li>
+</ul></div>
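+<p>To make the arithmetic concrete, the following is a worked timeline for the selector <tt>30 minute window every 1 hour from 2 hours ago</tt>. It assumes, purely for illustration, that the selector is evaluated at exactly 12:00 and that no inclusion or exclusion specifiers apply:</p>
+
+<div class="source">
+<div class="source">
+<pre>total temporal duration:  10:00 to 12:00   (from 2 hours ago, ending now)
+window 1:                 10:00 to 10:30   (starts 2 hours ago)
+window 2:                 11:00 to 11:30   (starts one skip distance later, 1 hour ago)
+</pre></div></div>
+<p>Adding <tt>until 30 minutes ago</tt> shortens the total duration to 10:00 through 11:30, which still contains both windows.</p>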
+<div class="section">
+<h6><a name="InclusionExclusion_specifiers"></a><span style="color:purple">Inclusion/Exclusion specifiers</span></h6>
+<p>Inclusion and Exclusion specifiers operate as filters on the set of windows. They operate on the window beginning timestamp.</p>
+<p>For inclusion specifiers, windows that are passed by <i>any</i> of the set of inclusion specifiers are included. Similarly, windows that are passed by <i>any</i> of the set of exclusion specifiers are excluded. Exclusion specifiers trump inclusion specifiers.</p>
+<p>Specifiers follow one of the following formats depending on if it is an inclusion or exclusion specifier:</p>
+
+<ul>
+  
+<li><tt>INCLUSION specifier, specifier, ...</tt>
+  
+<ul>
+    
+<li><tt>INCLUSION</tt> can be &#x201c;include&#x201d;, &#x201c;includes&#x201d; or &#x201c;including&#x201d;</li>
+  </ul></li>
+  
+<li><tt>EXCLUSION specifier, specifier, ...</tt>
+  
+<ul>
+    
+<li><tt>EXCLUSION</tt> can be &#x201c;exclude&#x201d;, &#x201c;excludes&#x201d; or &#x201c;excluding&#x201d;</li>
+  </ul></li>
+</ul>
+<p>The specifiers are a set of fixed specifiers available as part of the language:</p>
+
+<ul>
+  
+<li>Fixed day of week-based specifiers - includes or excludes if the window is on the specified day of the week
+  
+<ul>
+    
+<li>&#x201c;monday&#x201d; or &#x201c;mondays&#x201d;</li>
+    
+<li>&#x201c;tuesday&#x201d; or &#x201c;tuesdays&#x201d;</li>
+    
+<li>&#x201c;wednesday&#x201d; or &#x201c;wednesdays&#x201d;</li>
+    
+<li>&#x201c;thursday&#x201d; or &#x201c;thursdays&#x201d;</li>
+    
+<li>&#x201c;friday&#x201d; or &#x201c;fridays&#x201d;</li>
+    
+<li>&#x201c;saturday&#x201d; or &#x201c;saturdays&#x201d;</li>
+    
+<li>&#x201c;sunday&#x201d; or &#x201c;sundays&#x201d;</li>
+    
+<li>&#x201c;weekday&#x201d; or &#x201c;weekdays&#x201d;</li>
+    
+<li>&#x201c;weekend&#x201d; or &#x201c;weekends&#x201d;</li>
+  </ul></li>
+  
+<li>Relative day of week-based specifiers - includes or excludes based on the day of week relative to now
+  
+<ul>
+    
+<li>&#x201c;current day of the week&#x201d;</li>
+    
+<li>&#x201c;current day of week&#x201d;</li>
+    
+<li>&#x201c;this day of the week&#x201d;</li>
+    
+<li>&#x201c;this day of week&#x201d;</li>
+  </ul></li>
+  
+<li>Specified date - includes or excludes based on the specified date
+  
+<ul>
+    
+<li>&#x201c;date&#x201d; - Takes up to 2 arguments
+    
+<ul>
+      
+<li>The day in <tt>yyyy/MM/dd</tt> format if no second argument is provided</li>
+      
+<li>Optionally the format to specify the first argument in</li>
+      
+<li>Example: <tt>date:2017/12/25</tt> would include or exclude December 25, 2017</li>
+      
+<li>Example: <tt>date:20171225:yyyyMMdd</tt> would include or exclude December 25, 2017</li>
+    </ul></li>
+  </ul></li>
+  
+<li>Holidays - includes or excludes based on if the window starts during a holiday
+  
+<ul>
+    
+<li>&#x201c;holiday&#x201d; or &#x201c;holidays&#x201d;
+    
+<ul>
+      
+<li>Arguments form the jollyday hierarchy of holidays. e.g. &#x201c;us:nyc&#x201d; would be holidays for New York City, USA</li>
+      
+<li>If none is specified, it will choose based on locale.</li>
+      
+<li>Countries supported are those supported in <a class="externalLink" href="https://github.com/svendiedrichsen/jollyday/tree/master/src/main/resources/holidays">jollyday</a></li>
+      
+<li>Example: <tt>holiday:us:nyc</tt> would be the holidays of New York City, USA</li>
+      
+<li>Example: <tt>holiday:hu</tt> would be the holidays of Hungary</li>
+    </ul></li>
+  </ul></li>
+</ul>
+<p><b>WARNING: Daylight Saving Time effects</b></p>
+<p>While Coordinated Universal Time (UTC) is constant, many servers are set to local timezones that observe Daylight Saving Time (DST). This means that twice a year, on DST transition weekends, &#x201c;Sunday&#x201d; is either 23 or 25 hours long. However, durations specified as &#x201c;7 days ago&#x201d; are always interpreted as &#x201c;7*24 hours ago&#x201d;. This can lead to some surprising effects when using days of the week as inclusion or exclusion specifiers.</p>
+<p>For example, the profile window specified by the phrase &#x201c;30 minute window every 24 hours from 7 days ago&#x201d; will always have 7 thirty-minute intervals, and these will normally occur on 5 weekdays and 2 weekend days. However, if you invoke this window at 12:15am any day during the week following the start of DST, you will get these intervals (supposing you start early on a Wednesday morning):</p>
+
+<div class="source">
+<div class="source">
+<pre>Tuesday 12:15am-12:45am (yesterday)
+Monday 12:15am-12:45am
+Saturday 11:15pm-11:45pm (skipped Sunday!)
+Friday 11:15pm-11:45pm
+Thursday 11:15pm-11:45pm
+Wednesday 11:15pm-11:45pm
+Tuesday 11:15pm-11:45pm
+</pre></div></div>
+<p>Sunday got skipped over because it was only 23 hours long; that is, there were 24 hours between Saturday 11:15pm and Monday 12:15am. So if you specified &#x201c;excluding weekends&#x201d;, you would get 6 days&#x2019; intervals instead of the expected 5. There are multiple variations on this theme.</p>
+<p>Remember that the underlying time is kept in UTC, so the data is always correct. It is only when attempting to interpret UTC as local time, date, and day, that these confusions may occur. They may be eliminated by setting your server timezone to UTC, or otherwise disabling DST.</p>
+<p><b>Examples</b></p>
+<p>Assume these are executed at noon.</p>
+
+<ul>
+  
+<li>A 1 hour window on each of the past 8 occurrences of the &#x2018;current day of the week&#x2019;
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">1 hour window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 hours</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 56 days ago</span></a> <a href="#InclusionExclusion_specifiers"><span style="color:purple">including this day of the week</span></a></li>
+  </ul></li>
+  
+<li>A 1 hour window for the past 8 tuesdays
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">1 hour window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 hours</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 56 days ago</span></a> <a href="#InclusionExclusion_specifiers"><span style="color:purple">including tuesdays</span></a></li>
+  </ul></li>
+  
+<li>A 30 minute window every tuesday at noon starting 14 days ago until now
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 hours</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 14 days ago</span></a> <a href="#InclusionExclusion_specifiers"><span style="color:purple">including tuesdays</span></a></li>
+  </ul></li>
+  
+<li>A 30 minute window every day except holidays and weekends at noon starting 14 days ago until now
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 hours</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 14 days ago</span></a> <a href="#InclusionExclusion_specifiers"><span style="color:purple">excluding holidays:us, weekends</span></a></li>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 hours</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 14 days ago</span></a> <a href="#InclusionExclusion_specifiers"><span style="color:purple">including weekdays excluding holidays:us, weekends</span></a></li>
+  </ul></li>
+  
+<li>A 30 minute window at noon every day from 7 days ago including saturdays and excluding weekends. Because exclusions trump inclusions, the following will never yield any windows
+  
+<ul>
+    
+<li><a href="#Temporal_Window_Width"><span style="color:red">30 minute window</span></a> <a href="#Skip_distance"><span style="color:green">every 24 hours</span></a> <a href="#Total_Temporal_Duration"><span style="color:blue">from 7 days ago</span></a> <a href="#InclusionExclusion_specifiers"><span style="color:purple">including saturdays excluding weekends</span></a></li>
+  </ul></li>
+</ul></div></div></div></div>
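+<p>As a sketch of how selector statements like those above might be used in a query, the statement is passed to <tt>PROFILE_WINDOW</tt>, and the result is given to <tt>PROFILE_GET</tt> in place of <tt>PROFILE_FIXED</tt>. The profile name &#x2018;test&#x2019; and the entity &#x2018;10.0.0.1&#x2019; are placeholders chosen for illustration only.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('test', '10.0.0.1', PROFILE_WINDOW('30 minute window every 24 hours from 14 days ago including tuesdays'))
+</pre></div></div>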
+<div class="section">
+<h3><a name="Errors"></a>Errors</h3>
+<p>The most common result of incorrect <tt>PROFILE_GET</tt> arguments or Client configuration parameters is an empty result set, rather than an error. The Client cannot effectively validate the arguments, because the Profiler configuration parameters may be changed and the profile itself does not store them. The person doing the querying must carry forward the knowledge of the Profiler configuration parameters from the time of profile creation, and use corresponding <tt>PROFILE_GET</tt> arguments and Client configuration parameters when querying the data.</p></div>
+<div class="section">
+<h3><a name="Examples"></a>Examples</h3>
+<p>The following are usage examples that show how the Stellar API can be used to read profiles generated by the <a href="../metron-profiler/index.html">Metron Profiler</a>. This API would be used in conjunction with other Stellar functions like <a href="../../metron-platform/metron-common/index.html#MAAS_MODEL_APPLY"><tt>MAAS_MODEL_APPLY</tt></a> to perform model scoring on streaming data.</p>
+<p>These examples assume a profile has been defined called &#x2018;snort-alerts&#x2019; that tracks the number of Snort alerts associated with an IP address over time. The profile definition might look similar to the following.</p>
+
+<div class="source">
+<div class="source">
+<pre>{
+  &quot;profiles&quot;: [
+    {
+      &quot;profile&quot;: &quot;snort-alerts&quot;,
+      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
+      &quot;onlyif&quot;:  &quot;source.type == 'snort'&quot;,
+      &quot;update&quot;:  { &quot;s&quot;: &quot;STATS_ADD(s, 1)&quot; },
+      &quot;result&quot;:  &quot;STATS_MEAN(s)&quot;
+    }
+  ]
+}
+</pre></div></div>
+<p>During model scoring the entity being scored, in this case a particular IP address, will be known. The following examples show how this profile data might be retrieved.</p>
+<p>Retrieve all values of &#x2018;snort-alerts&#x2019; from &#x2018;10.0.0.1&#x2019; over the past 4 hours.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(4, 'HOURS'))
+</pre></div></div>
+<p>Retrieve all values of &#x2018;snort-alerts&#x2019; from &#x2018;10.0.0.1&#x2019; over the past 2 days.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(2, 'DAYS'))
+</pre></div></div>
+<p>If the profile had been defined to group the data by weekday versus weekend, then the following example would apply:</p>
+<p>Retrieve all values of &#x2018;snort-alerts&#x2019; from &#x2018;10.0.0.1&#x2019; that occurred on &#x2018;weekdays&#x2019; over the past 30 days.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(30, 'DAYS'), ['weekdays'] )
+</pre></div></div>
+<p>The client may need to use a configuration different from the current Client configuration settings. For example, perhaps you are on a cluster shared with other analysts, and need to access a profile that was constructed 2 months ago using different period duration, while they are accessing more recent profiles constructed with the currently configured period duration. For this situation, you may use the <tt>config_overrides</tt> argument:</p>
+<p>Retrieve all values of &#x2018;snort-alerts&#x2019; from &#x2018;10.0.0.1&#x2019; over the past 2 days, with no <tt>groupBy</tt>, and overriding the usual global client configuration parameters for period duration.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(2, 'DAYS', {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'}), [])
+</pre></div></div>
+<p>Retrieve all values of &#x2018;snort-alerts&#x2019; from &#x2018;10.0.0.1&#x2019; that occurred on &#x2018;weekdays&#x2019; over the past 30 days, overriding the usual global client configuration parameters for period duration.</p>
+
+<div class="source">
+<div class="source">
+<pre>PROFILE_GET('snort-alerts', '10.0.0.1', PROFILE_FIXED(30, 'DAYS', {'profiler.client.period.duration' : '2', 'profiler.client.period.duration.units' : 'MINUTES'}), ['weekdays'] )
+</pre></div></div></div></div>
+<div class="section">
+<h2><a name="Getting_Started"></a>Getting Started</h2>
+<p>These instructions step through the process of using the Stellar Client API on a live cluster. These instructions assume that the &#x2018;Getting Started&#x2019; instructions included with the <a href="../metron-profiler/index.html">Metron Profiler</a> have been followed. This will create a Profile called &#x2018;test&#x2019; whose data will be retrieved with the Stellar Client API.</p>
+<p>To validate that everything is working, log in to the server hosting Metron. We will use the Stellar Shell to replicate the execution environment of Stellar running in a Storm topology, like Metron&#x2019;s Parser or Enrichment topology. Replace &#x2018;node1:2181&#x2019; with the host and port of a Zookeeper server in your cluster.</p>
+
+<div class="source">
+<div class="source">
+<pre>[root@node1 0.3.1]# bin/stellar -z node1:2181
+Stellar, Go!
+Please note that functions are loading lazily in the background and will be unavailable until loaded fully.
+{es.clustername=metron, es.ip=node1, es.port=9300, es.date.format=yyyy.MM.dd.HH}
+
+[Stellar]&gt;&gt;&gt; ?PROFILE_GET
+Functions loaded, you may refer to functions now...
+PROFILE_GET
+Description: Retrieves a series of values from a stored profile.
+
+Arguments:
+	profile - The name of the profile.
+	entity - The name of the entity.
+	durationAgo - How long ago should values be retrieved from?
+	units - The units of 'durationAgo'.
+	groups_list - Optional, must correspond to the 'groupBy' list used in profile creation - List (in square brackets) of 
+            groupBy values used to filter the profile. Default is the empty list, meaning groupBy was not used when 
+            creating the profile.
+	config_overrides - Optional - Map (in curly braces) of name:value pairs, each overriding the global config parameter
+            of the same name. Default is the empty Map, meaning no overrides.
+
+Returns: The selected profile measurements.
+
+[Stellar]&gt;&gt;&gt; PROFILE_GET('test','192.168.138.158', 1, 'HOURS')
+[12078.0, 8921.0, 12131.0]
+</pre></div></div>
+<p>The client API call above has retrieved the past hour of the &#x2018;test&#x2019; profile for the entity &#x2018;192.168.138.158&#x2019;.</p></div>
+                  </div>
+            </div>
+          </div>
+
+    <hr/>
+
+    <footer>
+            <div class="container-fluid">
+              <div class="row span12">Copyright &copy;                    2017
+                        <a href="https://www.apache.org">The Apache Software Foundation</a>.
+            All Rights Reserved.      
+                    
+      </div>
+
+                          
+        
+                </div>
+    </footer>
+  </body>
+</html>

Added: release/metron/0.4.0/site-book/metron-analytics/metron-profiler/index.html
==============================================================================
--- release/metron/0.4.0/site-book/metron-analytics/metron-profiler/index.html (added)
+++ release/metron/0.4.0/site-book/metron-analytics/metron-profiler/index.html Wed Jul  5 06:56:42 2017
@@ -0,0 +1,845 @@
+<!DOCTYPE html>
+<!--
+ | Generated by Apache Maven Doxia at 2017-06-27
+ | Rendered using Apache Maven Fluido Skin 1.3.0
+-->
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <meta name="Date-Revision-yyyymmdd" content="20170627" />
+    <meta http-equiv="Content-Language" content="en" />
+    <title>Metron &#x2013; Metron Profiler</title>
+    <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" />
+    <link rel="stylesheet" href="../../css/site.css" />
+    <link rel="stylesheet" href="../../css/print.css" media="print" />
+
+      
+    <script type="text/javascript" src="../../js/apache-maven-fluido-1.3.0.min.js"></script>
+
+                          
+        
+<script type="text/javascript">$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );</script>
+          
+            </head>
+        <body class="topBarDisabled">
+          
+                
+                    
+    
+        <div class="container-fluid">
+          <div id="banner">
+        <div class="pull-left">
+                                    <a href="http://metron.apache.org/" id="bannerLeft">
+                                                                                                <img src="../../images/metron-logo.png"  alt="Apache Metron" width="148px" height="48px"/>
+                </a>
+                      </div>
+        <div class="pull-right">  </div>
+        <div class="clear"><hr/></div>
+      </div>
+
+      <div id="breadcrumbs">
+        <ul class="breadcrumb">
+                
+                    
+                              <li class="">
+                    <a href="http://www.apache.org" class="externalLink" title="Apache">
+        Apache</a>
+        </li>
+      <li class="divider ">/</li>
+            <li class="">
+                    <a href="http://metron.apache.org/" class="externalLink" title="Metron">
+        Metron</a>
+        </li>
+      <li class="divider ">/</li>
+            <li class="">
+                    <a href="../../index.html" title="Documentation">
+        Documentation</a>
+        </li>
+      <li class="divider ">/</li>
+        <li class="">Metron Profiler</li>
+        
+                
+                    
+                  <li id="publishDate" class="pull-right">Last Published: 2017-06-27</li> <li class="divider pull-right">|</li>
+              <li id="projectVersion" class="pull-right">Version: 0.4.0</li>
+            
+                            </ul>
+      </div>
+
+            
+      <div class="row-fluid">
+        <div id="leftColumn" class="span3">
+          <div class="well sidebar-nav">
+                
+                    
+                <ul class="nav nav-list">
+                    <li class="nav-header">User Documentation</li>
+                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
+      <li>
+    
+                          <a href="../../index.html" title="Metron">
+          <i class="icon-chevron-down"></i>
+        Metron</a>
+                    <ul class="nav nav-list">
+                      
+      <li>
+    
+                          <a href="../../Upgrading.html" title="Upgrading">
+          <i class="none"></i>
+        Upgrading</a>
+            </li>
+                                                                                                                                                                
+      <li>
+    
+                          <a href="../../metron-analytics/index.html" title="Analytics">
+          <i class="icon-chevron-down"></i>
+        Analytics</a>
+                    <ul class="nav nav-list">
+                      
+      <li>
+    
+                          <a href="../../metron-analytics/metron-maas-service/index.html" title="Maas-service">
+          <i class="none"></i>
+        Maas-service</a>
+            </li>
+                      
+      <li class="active">
+    
+            <a href="#"><i class="none"></i>Profiler</a>
+          </li>
+                      
+      <li>
+    
+                          <a href="../../metron-analytics/metron-profiler-client/index.html" title="Profiler-client">
+          <i class="none"></i>
+        Profiler-client</a>
+            </li>
+                                                                        
+      <li>
+    
+                          <a href="../../metron-analytics/metron-statistics/index.html" title="Statistics">
+          <i class="icon-chevron-right"></i>
+        Statistics</a>
+                  </li>
+              </ul>
+        </li>
+                                                                                                                                                                                                                                                                                                                                                                                    
+      <li>
+    
+                          <a href="../../metron-deployment/index.html" title="Deployment">
+          <i class="icon-chevron-right"></i>
+        Deployment</a>
+                  </li>
+                      
+      <li>
+    
+                          <a href="../../metron-docker/index.html" title="Docker">
+          <i class="none"></i>
+        Docker</a>
+            </li>
+                      
+      <li>
+    
+                          <a href="../../metron-interface/metron-config/index.html" title="Config">
+          <i class="none"></i>
+        Config</a>
+            </li>
+                      
+      <li>
+    
+                          <a href="../../metron-interface/metron-rest/index.html" title="Rest">
+          <i class="none"></i>
+        Rest</a>
+            </li>
+                                                                                                                                                                                                                                                
+      <li>
+    
+                          <a href="../../metron-platform/index.html" title="Platform">
+          <i class="icon-chevron-right"></i>
+        Platform</a>
+                  </li>
+                                                                                                            
+      <li>
+    
+                          <a href="../../metron-sensors/index.html" title="Sensors">
+          <i class="icon-chevron-right"></i>
+        Sensors</a>
+                  </li>
+              </ul>
+        </li>
+            </ul>
+                
+                    
+                
+          <hr class="divider" />
+
+           <div id="poweredBy">
+                            <div class="clear"></div>
+                            <div class="clear"></div>
+                            <div class="clear"></div>
+                             <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
+        <img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" />
+      </a>
+                  </div>
+          </div>
+        </div>
+        
+                
+        <div id="bodyColumn"  class="span9" >
+                                  
+            <h1>Metron Profiler</h1>
+<p><a name="Metron_Profiler"></a></p>
+<p>The Profiler is a feature extraction mechanism that can generate a profile describing the behavior of an entity. An entity might be a server, user, subnet or application. Once a profile has been generated defining what normal behavior looks like, models can be built that identify anomalous behavior. </p>
+<p>This is achieved by summarizing the streaming telemetry data consumed by Metron over sliding windows. A summary statistic is applied to the data received within a given window. Collecting this summary across many windows results in a time series that is useful for analysis.</p>
+<p>Any field contained within a message can be used to generate a profile. A profile can even be produced by combining fields that originate in different data sources. A user has considerable power to transform the data used in a profile by leveraging the Stellar language. A user need only configure the desired profiles and ensure that the Profiler topology is running.</p>
+
+<ul>
+  
+<li><a href="#Getting_Started">Getting Started</a></li>
+  
+<li><a href="#Creating_Profiles">Creating Profiles</a></li>
+  
+<li><a href="#Configuring_the_Profiler">Configuring the Profiler</a></li>
+  
+<li><a href="#Examples">Examples</a></li>
+  
+<li><a href="#Implementation">Implementation</a></li>
+</ul>
+<div class="section">
+<h2><a name="Getting_Started"></a>Getting Started</h2>
+<p>This section will describe the steps required to get your first profile running.</p>
+
+<ol style="list-style-type: decimal">
+  
+<li>
+<p>Stand up a Metron environment. For this example, we will use the &#x2018;Quick Dev&#x2019; environment. Follow the instructions included with <a href="../../metron-deployment/vagrant/quick-dev-platform/index.html">Quick Dev</a> or build your own.</p></li>
+  
+<li>
+<p>Create a table within HBase that will store the profile data. The table name and column family must match the <a href="#Configuring_the_Profiler">Profiler&#x2019;s configuration</a>.</p>
+  
+<div class="source">
+<div class="source">
+<pre>$ /usr/hdp/current/hbase-client/bin/hbase shell
+hbase(main):001:0&gt; create 'profiler', 'P'
+</pre></div></div></li>
+  
+<li>
+<p>Edit the configuration file located at <tt>$METRON_HOME/config/profiler.properties</tt>. Change the kafka.zk and kafka.broker values from &#x201c;node1&#x201d; to the appropriate host name. Keep the same port numbers:</p>
+  
+<div class="source">
+<div class="source">
+<pre>kafka.zk=node1:2181
+kafka.broker=node1:6667
+</pre></div></div></li>
+  
+<li>
+<p>Define the profile in a file located at <tt>$METRON_HOME/config/zookeeper/profiler.json</tt>. The following example JSON will create a profile that simply counts the number of messages per <tt>ip_src_addr</tt>, during each sampling interval.</p>
+  
+<div class="source">
+<div class="source">
+<pre>{
+  &quot;profiles&quot;: [
+    {
+      &quot;profile&quot;: &quot;test&quot;,
+      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
+      &quot;init&quot;:    { &quot;count&quot;: &quot;0&quot; },
+      &quot;update&quot;:  { &quot;count&quot;: &quot;count + 1&quot; },
+      &quot;result&quot;:  &quot;count&quot;
+    }
+  ]
+}
+</pre></div></div></li>
+  
+<li>
+<p>Upload the profile definition to Zookeeper. (As always, change &#x201c;node1&#x201d; to the actual hostname.)</p>
+  
+<div class="source">
+<div class="source">
+<pre>$ cd $METRON_HOME
+$ bin/zk_load_configs.sh -m PUSH -i config/zookeeper/ -z node1:2181
+</pre></div></div></li>
+  
+<li>
+<p>Start the Profiler topology.</p>
+  
+<div class="source">
+<div class="source">
+<pre>$ bin/start_profiler_topology.sh
+</pre></div></div></li>
+  
+<li>
+<p>Ensure that test messages are being sent to the Profiler&#x2019;s input topic in Kafka. The Profiler will consume messages from the <tt>inputTopic</tt> defined in the <a href="#Configuring_the_Profiler">Profiler&#x2019;s configuration</a>.</p></li>
+  
+<li>
+<p>Check the HBase table to validate that the Profiler is writing the profile. Remember that the Profiler flushes the profile every 15 minutes, so you will need to wait at least this long before profile data appears in HBase.</p>
+  
+<div class="source">
+<div class="source">
+<pre>$ /usr/hdp/current/hbase-client/bin/hbase shell
+hbase(main):001:0&gt; count 'profiler'
+</pre></div></div></li>
+  
+<li>
+<p>Use the Profiler Client to read the profile data. The example <tt>PROFILE_GET</tt> command below will read data written by the sample profile given above, provided 10.0.0.1 is one of the input values for <tt>ip_src_addr</tt>. More information on configuring and using the client can be found <a href="../metron-profiler-client/index.html">here</a>. It is assumed that the <tt>PROFILE_GET</tt> client is correctly configured before using it.</p>
+  
+<div class="source">
+<div class="source">
+<pre>$ bin/stellar -z node1:2181
+[Stellar]&gt;&gt;&gt; PROFILE_GET( &quot;test&quot;, &quot;10.0.0.1&quot;, PROFILE_FIXED(30, &quot;MINUTES&quot;))
+[451, 448]
+</pre></div></div></li>
+</ol></div>
+<div class="section">
+<h2><a name="Creating_Profiles"></a>Creating Profiles</h2>
+<p>The Profiler specification is a JSON-formatted set of elements, many of which can contain Stellar code. The specification for the Profiler topology is stored in Zookeeper at <tt>/metron/topology/profiler</tt>. These properties also exist in the local filesystem at <tt>$METRON_HOME/config/zookeeper/profiler.json</tt>. The values can be changed on disk and then uploaded to Zookeeper using <tt>$METRON_HOME/bin/zk_load_configs.sh</tt>. The specification contains the following elements. (For the impatient, skip ahead to the <a href="#Examples">Examples</a>.)</p>
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>Name </th>
+      
+<th>Required/Optional </th>
+      
+<th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td><a href="#profile">profile</a> </td>
+      
+<td>Required </td>
+      
+<td>Unique name identifying the profile.</td>
+    </tr>
+    
+<tr class="a">
+      
+<td><a href="#foreach">foreach</a> </td>
+      
+<td>Required </td>
+      
+<td>A separate profile is maintained &#x201c;for each&#x201d; of these.</td>
+    </tr>
+    
+<tr class="b">
+      
+<td><a href="#onlyif">onlyif</a> </td>
+      
+<td>Optional </td>
+      
+<td>Boolean expression that determines if a message should be applied to the profile.</td>
+    </tr>
+    
+<tr class="a">
+      
+<td><a href="#groupBy">groupBy</a> </td>
+      
+<td>Optional </td>
+      
+<td>One or more Stellar expressions used to group the profile measurements when persisted.</td>
+    </tr>
+    
+<tr class="b">
+      
+<td><a href="#init">init</a> </td>
+      
+<td>Optional </td>
+      
+<td>One or more expressions executed at the start of a window period.</td>
+    </tr>
+    
+<tr class="a">
+      
+<td><a href="#update">update</a> </td>
+      
+<td>Required </td>
+      
+<td>One or more expressions executed when a message is applied to the profile.</td>
+    </tr>
+    
+<tr class="b">
+      
+<td><a href="#result">result</a> </td>
+      
+<td>Required </td>
+      
+<td>Stellar expressions that are executed when the window period expires.</td>
+    </tr>
+    
+<tr class="a">
+      
+<td><a href="#expires">expires</a> </td>
+      
+<td>Optional </td>
+      
+<td>Profile data is purged after this period of time, specified in days.</td>
+    </tr>
+  </tbody>
+</table>
+<div class="section">
+<h3><a name="profile"></a><tt>profile</tt></h3>
+<p><i>Required</i></p>
+<p>A unique name identifying the profile. The field is treated as a string. </p></div>
+<div class="section">
+<h3><a name="foreach"></a><tt>foreach</tt></h3>
+<p><i>Required</i></p>
+<p>A separate profile is maintained &#x2018;for each&#x2019; of these. This is effectively the entity that the profile is describing. The field is expected to contain a Stellar expression whose result is the entity name. </p>
+<p>For example, if the expression is <tt>ip_src_addr</tt>, then a separate profile is maintained for each unique IP source address in the data: 10.0.0.1, 10.0.0.2, etc.</p></div>
+<div class="section">
+<h3><a name="onlyif"></a><tt>onlyif</tt></h3>
+<p><i>Optional</i></p>
+<p>An expression that determines if a message should be applied to the profile. A Stellar expression that returns a Boolean is expected. A message is only applied to a profile if this expression is true. This allows a profile to filter the messages that get applied to it. </p></div>
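+<p>For illustration, a filter along the following lines would apply only Snort messages to the profile. It assumes the telemetry carries a <tt>source.type</tt> field whose value is <tt>snort</tt> for Snort alerts; substitute whatever field and value identify the messages of interest.</p>
+
+<div class="source">
+<div class="source">
+<pre>&quot;onlyif&quot;: &quot;source.type == 'snort'&quot;
+</pre></div></div>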
+<div class="section">
+<h3><a name="groupBy"></a><tt>groupBy</tt></h3>
+<p><i>Optional</i></p>
+<p>One or more Stellar expressions used to group the profile measurements when persisted. This is intended to sort the Profile data to allow for a contiguous scan when accessing subsets of the data. </p>
+<p>The &#x2018;groupBy&#x2019; expressions can refer to any field within an <tt>org.apache.metron.profiler.ProfileMeasurement</tt>. A common use case would be grouping by day of week. This allows a contiguous scan to access all profile data for Mondays only. Using the following definition would achieve this. </p>
+
+<div class="source">
+<div class="source">
+<pre>&quot;groupBy&quot;: [ &quot;DAY_OF_WEEK()&quot; ] 
+</pre></div></div></div>
+<div class="section">
+<h3><a name="init"></a><tt>init</tt></h3>
+<p><i>Optional</i></p>
+<p>One or more expressions executed at the start of a window period. A map is expected where the key is the variable name and the value is a Stellar expression. The map can contain zero or more variable:expression pairs. At the start of each window period, each expression is executed once and its result is stored in the given variable. Note that constant init values such as &#x201c;0&#x201d; must be in quotes regardless of their type, as the init value must be a string to be executed by Stellar.</p>
+
+<div class="source">
+<div class="source">
+<pre>&quot;init&quot;: {
+  &quot;var1&quot;: &quot;0&quot;,
+  &quot;var2&quot;: &quot;1&quot;
+}
+</pre></div></div></div>
+<div class="section">
+<h3><a name="update"></a><tt>update</tt></h3>
+<p><i>Required</i></p>
+<p>One or more expressions executed when a message is applied to the profile. A map is expected where the key is the variable name and the value is a Stellar expression. The map can contain zero or more variable:expression pairs. When a message is applied to the profile, each expression is executed and its result is stored in the variable with the given name.</p>
+
+<div class="source">
+<div class="source">
+<pre>&quot;update&quot;: {
+  &quot;var1&quot;: &quot;var1 + 1&quot;,
+  &quot;var2&quot;: &quot;var2 + 1&quot;
+}
+</pre></div></div></div>
+<div class="section">
+<h3><a name="result"></a><tt>result</tt></h3>
+<p><i>Required</i></p>
+<p>Stellar expressions that are executed when the window period expires. The expressions are expected to summarize the messages that were applied to the profile over the window period. In the most basic form a single result is persisted for later retrieval.</p>
+
+<div class="source">
+<div class="source">
+<pre>&quot;result&quot;: &quot;var1 + var2&quot;
+</pre></div></div>
+<p>For more advanced use cases, a profile can generate two types of results. A profile can define one or both of these result types at the same time. </p>
+
+<ul>
+  
+<li><tt>profile</tt>: A required expression that defines a value that is persisted for later retrieval.</li>
+  
+<li><tt>triage</tt>: An optional expression that defines values that are accessible within the Threat Triage process.</li>
+</ul>
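+<p>As a minimal sketch of defining both result types together, the following fragment persists a statistical summary while also exposing one value to Threat Triage. It assumes that the profile&#x2019;s <tt>update</tt> expressions maintain a variable named <tt>stats</tt> (for example with <tt>STATS_ADD</tt>); that variable name is illustrative only.</p>
+
+<div class="source">
+<div class="source">
+<pre>&quot;result&quot;: {
+    &quot;profile&quot;: &quot;stats&quot;,
+    &quot;triage&quot;: {
+        &quot;mean&quot;: &quot;STATS_MEAN(stats)&quot;
+    }
+}
+</pre></div></div>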
+<p><b>profile</b></p>
+<p>A required Stellar expression that results in a value that is persisted in the profile store for later retrieval. The expression can result in any object that is Kryo serializable. These values can be retrieved for later use with the <a href="../metron-profiler-client/index.html">Profiler Client</a>. </p>
+
+<div class="source">
+<div class="source">
+<pre>&quot;result&quot;: {
+    &quot;profile&quot;: &quot;2 + 2&quot;
+}
+</pre></div></div>
+<p>An alternative, simplified form is also acceptable.</p>
+
+<div class="source">
+<div class="source">
+<pre>&quot;result&quot;: &quot;2 + 2&quot;
+</pre></div></div>
+<p><b>triage</b></p>
+<p>An optional map of one or more Stellar expressions. The value of each expression is made available to the Threat Triage process under the given name. Each expression must result in either a primitive type, such as an integer, long, or short, or a String. All other types will result in an error.</p>
+<p>In the following example, three values (the minimum, the maximum, and the mean) are appended to a message. This message is consumed by Metron, like other sources of telemetry, and each of these values is accessible from within the Threat Triage process using the given field names: <tt>min</tt>, <tt>max</tt>, and <tt>mean</tt>.</p>
+
+<div class="source">
+<div class="source">
+<pre>&quot;result&quot;: {
+    &quot;triage&quot;: {
+        &quot;min&quot;: &quot;STATS_MIN(stats)&quot;,
+        &quot;max&quot;: &quot;STATS_MAX(stats)&quot;,
+        &quot;mean&quot;: &quot;STATS_MEAN(stats)&quot;
+    }
+}
+</pre></div></div></div>
+<div class="section">
+<h3><a name="expires"></a><tt>expires</tt></h3>
+<p><i>Optional</i></p>
+<p>A numeric value that defines how many days the profile data is retained. After this time, the data expires and is no longer accessible. If no value is defined, the data does not expire.</p></div></div>
+<div class="section">
+<h2><a name="Configuring_the_Profiler"></a>Configuring the Profiler</h2>
+<p>The Profiler runs as an independent Storm topology. The configuration for the Profiler topology is stored on the local filesystem at <tt>$METRON_HOME/config/profiler.properties</tt>. After the values are changed on disk, the Profiler topology must be restarted for the changes to take effect.</p>
+
+<table border="0" class="table table-striped">
+  <thead>
+    
+<tr class="a">
+      
+<th>Setting </th>
+      
+<th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    
+<tr class="b">
+      
+<td>profiler.workers </td>
+      
+<td>The number of worker processes to create for the topology.</td>
+    </tr>
+    
+<tr class="a">
+      
+<td>profiler.executors </td>
+      
+<td>The number of executors to spawn per component.</td>
+    </tr>
+    
+<tr class="b">
+      
+<td>profiler.input.topic </td>
+      
+<td>The name of the Kafka topic from which to consume data.</td>
+    </tr>
+    
+<tr class="a">
+      
+<td>profiler.output.topic </td>
+      
+<td>The name of the Kafka topic to which profile data is written. Only used with profiles that use the <a href="#result"><tt>triage</tt> result field</a>.</td>
+    </tr>
+    
+<tr class="b">
+      
+<td>profiler.period.duration </td>
+      
+<td>The duration of each profile period. This value should be defined along with <tt>profiler.period.duration.units</tt>.</td>
+    </tr>
+    
+<tr class="a">
+      
+<td>profiler.period.duration.units </td>
+      
+<td>The units used to specify the <tt>profiler.period.duration</tt>.</td>
+    </tr>
+    
+<tr class="b">
+      
+<td>profiler.ttl </td>
+      
+<td>If a message has not been applied to a Profile in this period of time, the Profile will be forgotten and its resources will be cleaned up. This value should be defined along with <tt>profiler.ttl.units</tt>.</td>
+    </tr>
+    
+<tr class="a">
+      
+<td>profiler.ttl.units </td>
+      
+<td>The units used to specify the <tt>profiler.ttl</tt>.</td>
+    </tr>
+    
+<tr class="b">
+      
+<td>profiler.hbase.salt.divisor </td>
+      
+<td>A salt is prepended to the row key to help prevent hotspotting. This constant is used to generate the salt. Ideally, this constant should be roughly equal to the number of nodes in the HBase cluster.</td>
+    </tr>
+    
+<tr class="a">
+      
+<td>profiler.hbase.table </td>
+      
+<td>The name of the HBase table that profiles are written to.</td>
+    </tr>
+    
+<tr class="b">
+      
+<td>profiler.hbase.column.family </td>
+      
+<td>The column family used to store profiles.</td>
+    </tr>
+    
+<tr class="a">
+      
+<td>profiler.hbase.batch </td>
+      
+<td>The number of puts that are written in a single batch.</td>
+    </tr>
+    
+<tr class="b">
+      
+<td>profiler.hbase.flush.interval.seconds </td>
+      
+<td>The maximum number of seconds between batch writes to HBase.</td>
+    </tr>
+  </tbody>
+</table>
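+<p>For illustration, a <tt>profiler.properties</tt> file using these settings might look like the following sketch. Every value shown is an assumed placeholder, not a recommended or shipped default; in particular, the topic names, table name, column family, and salt divisor are hypothetical and should be adjusted to suit the cluster.</p>
+
+<div class="source">
+<div class="source">
+<pre># Illustrative values only; each one is an assumption.
+profiler.workers=1
+profiler.executors=1
+profiler.input.topic=indexing
+profiler.output.topic=enrichments
+profiler.period.duration=15
+profiler.period.duration.units=MINUTES
+profiler.ttl=30
+profiler.ttl.units=MINUTES
+# Assumes an HBase cluster of roughly ten nodes.
+profiler.hbase.salt.divisor=10
+profiler.hbase.table=profiler
+profiler.hbase.column.family=P
+profiler.hbase.batch=10
+profiler.hbase.flush.interval.seconds=30
+</pre></div></div>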
+<p>After altering the configuration, start the Profiler.</p>
+
+<div class="source">
+<div class="source">
+<pre>$ $METRON_HOME/start_profiler_topology.sh
+</pre></div></div></div>
+<div class="section">
+<h2><a name="Examples"></a>Examples</h2>
+<p>The following examples are intended to highlight the functionality provided by the Profiler. Each shows the configuration that would be required to generate the profile. </p>
+<p>These examples assume a fictitious input message stream that looks something like the following.</p>
+
+<div class="source">
+<div class="source">
+<pre>{
+  &quot;ip_src_addr&quot;: &quot;10.0.0.1&quot;,
+  &quot;protocol&quot;: &quot;HTTPS&quot;,
+  &quot;length&quot;: &quot;10&quot;,
+  &quot;bytes_in&quot;: &quot;234&quot;
+},
+{
+  &quot;ip_src_addr&quot;: &quot;10.0.0.2&quot;,
+  &quot;protocol&quot;: &quot;HTTP&quot;,
+  &quot;length&quot;: &quot;20&quot;,
+  &quot;bytes_in&quot;: &quot;390&quot;
+},
+{
+  &quot;ip_src_addr&quot;: &quot;10.0.0.3&quot;,
+  &quot;protocol&quot;: &quot;DNS&quot;,
+  &quot;length&quot;: &quot;30&quot;,
+  &quot;bytes_in&quot;: &quot;560&quot;
+}
+</pre></div></div>
+<div class="section">
+<h3><a name="Example_1"></a>Example 1</h3>
+<p>The total number of bytes of HTTP data for each host. The following configuration would be used to generate this profile.</p>
+
+<div class="source">
+<div class="source">
+<pre>{
+  &quot;profiles&quot;: [
+    {
+      &quot;profile&quot;: &quot;example1&quot;,
+      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
+      &quot;onlyif&quot;: &quot;protocol == 'HTTP'&quot;,
+      &quot;init&quot;: {
+        &quot;total_bytes&quot;: &quot;0.0&quot;
+      },
+      &quot;update&quot;: {
+        &quot;total_bytes&quot;: &quot;total_bytes + bytes_in&quot;
+      },
+      &quot;result&quot;: &quot;total_bytes&quot;,
+      &quot;expires&quot;: 30
+    }
+  ]
+}
+</pre></div></div>
+<p>This creates a profile&#x2026;</p>
+
+<ul>
+  
+<li>Named &#x2018;example1&#x2019;</li>
+  
+<li>That for each IP source address</li>
+  
+<li>Only if the &#x2018;protocol&#x2019; field equals &#x2018;HTTP&#x2019;</li>
+  
+<li>Initializes a counter &#x2018;total_bytes&#x2019; to zero</li>
+  
+<li>Adds to &#x2018;total_bytes&#x2019; the value of the message&#x2019;s &#x2018;bytes_in&#x2019; field</li>
+  
+<li>Returns &#x2018;total_bytes&#x2019; as the result</li>
+  
+<li>The profile data will expire in 30 days</li>
+</ul></div>
+<div class="section">
+<h3><a name="Example_2"></a>Example 2</h3>
+<p>The ratio of DNS traffic to HTTP traffic for each host. The following configuration would be used to generate this profile.</p>
+
+<div class="source">
+<div class="source">
+<pre>{
+  &quot;profiles&quot;: [
+    {
+      &quot;profile&quot;: &quot;example2&quot;,
+      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
+      &quot;onlyif&quot;: &quot;protocol == 'DNS' or protocol == 'HTTP'&quot;,
+      &quot;init&quot;: {
+        &quot;num_dns&quot;: &quot;1.0&quot;,
+        &quot;num_http&quot;: &quot;1.0&quot;
+      },
+      &quot;update&quot;: {
+        &quot;num_dns&quot;: &quot;num_dns + (if protocol == 'DNS' then 1 else 0)&quot;,
+        &quot;num_http&quot;: &quot;num_http + (if protocol == 'HTTP' then 1 else 0)&quot;
+      },
+      &quot;result&quot;: &quot;num_dns / num_http&quot;
+    }
+  ]
+}
+</pre></div></div>
+<p>This creates a profile&#x2026;</p>
+
+<ul>
+  
+<li>Named &#x2018;example2&#x2019;</li>
+  
+<li>That for each IP source address</li>
+  
+<li>Only if the &#x2018;protocol&#x2019; field equals &#x2018;HTTP&#x2019; or &#x2018;DNS&#x2019;</li>
+  
+<li>Accumulates the number of DNS requests</li>
+  
+<li>Accumulates the number of HTTP requests</li>
+  
+<li>Returns the ratio of these as the result</li>
+</ul></div>
+<div class="section">
+<h3><a name="Example_3"></a>Example 3</h3>
+<p>The average of the <tt>length</tt> field of HTTP traffic. The following configuration would be used to generate this profile.</p>
+
+<div class="source">
+<div class="source">
+<pre>{
+  &quot;profiles&quot;: [
+    {
+      &quot;profile&quot;: &quot;example3&quot;,
+      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
+      &quot;onlyif&quot;: &quot;protocol == 'HTTP'&quot;,
+      &quot;update&quot;: { &quot;s&quot;: &quot;STATS_ADD(s, length)&quot; },
+      &quot;result&quot;: &quot;STATS_MEAN(s)&quot;
+    }
+  ]
+}
+</pre></div></div>
+<p>This creates a profile&#x2026;</p>
+
+<ul>
+  
+<li>Named &#x2018;example3&#x2019;</li>
+  
+<li>That for each IP source address</li>
+  
+<li>Only if the &#x2018;protocol&#x2019; field is &#x2018;HTTP&#x2019;</li>
+  
+<li>Adds the <tt>length</tt> field from each message</li>
+  
+<li>Calculates the average as the result</li>
+</ul></div>
+<div class="section">
+<h3><a name="Example_4"></a>Example 4</h3>
+<p>It is important to note that the Profiler can persist any serializable Object, not just numeric values. An alternative to the previous example could take advantage of this. </p>
+<p>Instead of storing the mean of the lengths, the profile could store a statistical summarization of the lengths. This summary can then be used at a later time to calculate the mean, min, max, percentiles, or any other sensible metric. This provides a much greater degree of flexibility.</p>
+
+<div class="source">
+<div class="source">
+<pre>{
+  &quot;profiles&quot;: [
+    {
+      &quot;profile&quot;: &quot;example4&quot;,
+      &quot;foreach&quot;: &quot;ip_src_addr&quot;,
+      &quot;onlyif&quot;: &quot;protocol == 'HTTP'&quot;,
+      &quot;update&quot;: { &quot;s&quot;: &quot;STATS_ADD(s, length)&quot; },
+      &quot;result&quot;: &quot;s&quot;
+    }
+  ]
+}
+</pre></div></div>
+<p>The following Stellar REPL session shows how you might use this summary to calculate different metrics with the same underlying profile data. It is assumed that the PROFILE_GET client is configured as described <a href="../metron-profiler-client/index.html">here</a>.</p>
+<p>Retrieve the last 30 minutes of profile measurements for a specific host.</p>
+
+<div class="source">
+<div class="source">
+<pre>$ bin/stellar -z node1:2181
+
+[Stellar]&gt;&gt;&gt; stats := PROFILE_GET( &quot;example4&quot;, &quot;10.0.0.1&quot;, PROFILE_FIXED(30, &quot;MINUTES&quot;))
+[Stellar]&gt;&gt;&gt; stats
+[org.apache.metron.common.math.stats.OnlineStatisticsProvider@79fe4ab9, ...]
+</pre></div></div>
+<p>Calculate different metrics with the same profile data.</p>
+
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; STATS_MEAN( GET_FIRST( stats))
+15979.0625
+
+[Stellar]&gt;&gt;&gt; STATS_PERCENTILE( GET_FIRST(stats), 90)
+30310.958
+</pre></div></div>
+<p>Merge all of the profile measurements over the past 30 minutes into a single summary and calculate the 90th percentile.</p>
+
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; merged := STATS_MERGE( stats)
+[Stellar]&gt;&gt;&gt; STATS_PERCENTILE(merged, 90)
+29810.992
+</pre></div></div>
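+<p>As a further sketch, the same merged summary can answer other questions without reprocessing any telemetry. The session below assumes the <tt>merged</tt> variable from the previous step; output is omitted because it depends entirely on the profiled data.</p>
+
+<div class="source">
+<div class="source">
+<pre>[Stellar]&gt;&gt;&gt; STATS_MIN(merged)
+[Stellar]&gt;&gt;&gt; STATS_MAX(merged)
+[Stellar]&gt;&gt;&gt; STATS_PERCENTILE(merged, 50)
+</pre></div></div>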
+<p>More information on accessing profile data can be found in the <a href="../metron-profiler-client/index.html">Profiler Client</a>.</p>
+<p>More information on using the <tt>STATS_*</tt> functions in Stellar can be found <a href="../../metron-platform/metron-common/index.html">here</a>.</p></div></div>
+<div class="section">
+<h2><a name="Implementation"></a>Implementation</h2></div>
+<div class="section">
+<h2><a name="Key_Classes"></a>Key Classes</h2>
+
+<ul>
+  
+<li>
+<p><tt>ProfileMeasurement</tt> - Represents a single data point within a Profile. A Profile is effectively a time series: it is composed of many <tt>ProfileMeasurement</tt> values which, in aggregate, form that series.</p></li>
+  
+<li>
+<p><tt>ProfilePeriod</tt> - The Profiler captures one <tt>ProfileMeasurement</tt> each <tt>ProfilePeriod</tt>. A <tt>ProfilePeriod</tt> will occur at fixed, deterministic points in time. This allows for efficient retrieval of profile data.</p></li>
+  
+<li>
+<p><tt>RowKeyBuilder</tt> - Builds row keys that can be used to read or write profile data to HBase.</p></li>
+  
+<li>
+<p><tt>ColumnBuilder</tt> - Defines the columns of data stored with a profile measurement.</p></li>
+  
+<li>
+<p><tt>ProfileHBaseMapper</tt> - Defines for the <tt>HBaseBolt</tt> how profile measurements are stored in HBase. This class leverages a <tt>RowKeyBuilder</tt> and <tt>ColumnBuilder</tt>.</p></li>
+</ul></div>
+<div class="section">
+<h2><a name="Storm_Topology"></a>Storm Topology</h2>
+<p>The Profiler is implemented as a Storm topology using the following bolts and spouts.</p>
+
+<ul>
+  
+<li>
+<p><tt>KafkaSpout</tt> - A spout that consumes messages from a single Kafka topic. In most cases, the Profiler topology will consume messages from the <tt>indexing</tt> topic. This topic contains fully enriched messages that are ready to be indexed. This ensures that profiles can take advantage of all the available data elements.</p></li>
+  
+<li>
+<p><tt>ProfileSplitterBolt</tt> - The bolt responsible for filtering incoming messages and directing each to the downstream bolt or bolts responsible for building a profile. Each message may be needed by zero, one, or many profiles. Each emitted tuple contains the &#x2018;resolved&#x2019; entity name, the profile definition, and the input message.</p></li>
+  
+<li>
+<p><tt>ProfileBuilderBolt</tt> - This bolt maintains all of the state required to build a profile. When the window period expires, the data is summarized as a <tt>ProfileMeasurement</tt>, all state is flushed, and the <tt>ProfileMeasurement</tt> is emitted. Each instance of this bolt is responsible for maintaining the state for a single Profile-Entity pair.</p></li>
+  
+<li>
+<p><tt>HBaseBolt</tt> - A bolt that is responsible for writing to HBase. Most profiles will be flushed every 15 minutes or so. If each <tt>ProfileBuilderBolt</tt> were responsible for writing to HBase itself, there would be little to no opportunity to optimize these writes. Aggregating the writes from multiple Profile-Entity pairs in one bolt allows them to be batched, for example.</p></li>
+</ul></div>
+                  </div>
+            </div>
+          </div>
+
+    <hr/>
+
+    <footer>
+            <div class="container-fluid">
+              <div class="row span12">Copyright &copy;                    2017
+                        <a href="https://www.apache.org">The Apache Software Foundation</a>.
+            All Rights Reserved.      
+                    
+      </div>
+
+                          
+        
+                </div>
+    </footer>
+  </body>
+</html>


