metron-commits mailing list archives

From ceste...@apache.org
Subject [18/50] [abbrv] metron git commit: METRON-1607 update public web site to point at 0.5.0 new release (justinleet) closes apache/metron#1053
Date Mon, 11 Jun 2018 21:53:42 GMT
http://git-wip-us.apache.org/repos/asf/metron/blob/ae1d3eb9/site/current-book/metron-platform/metron-elasticsearch/index.html
----------------------------------------------------------------------
diff --git a/site/current-book/metron-platform/metron-elasticsearch/index.html b/site/current-book/metron-platform/metron-elasticsearch/index.html
index ff4bfb0..cc360b1 100644
--- a/site/current-book/metron-platform/metron-elasticsearch/index.html
+++ b/site/current-book/metron-platform/metron-elasticsearch/index.html
@@ -1,380 +1,484 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2018-01-03
- | Rendered using Apache Maven Fluido Skin 1.3.0
+ | Generated by Apache Maven Doxia Site Renderer 1.8 from src/site/markdown/metron-platform/metron-elasticsearch/index.md at 2018-06-07
+ | Rendered using Apache Maven Fluido Skin 1.7
 -->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20180103" />
+    <meta name="Date-Revision-yyyymmdd" content="20180607" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Metron &#x2013; Elasticsearch in Metron</title>
-    <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" />
+    <link rel="stylesheet" href="../../css/apache-maven-fluido-1.7.min.css" />
     <link rel="stylesheet" href="../../css/site.css" />
     <link rel="stylesheet" href="../../css/print.css" media="print" />
-
-      
-    <script type="text/javascript" src="../../js/apache-maven-fluido-1.3.0.min.js"></script>
-
-                          
-        
-<script type="text/javascript">$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );</script>
-          
-            </head>
-        <body class="topBarDisabled">
-          
-                
-                    
-    
-        <div class="container-fluid">
-          <div id="banner">
-        <div class="pull-left">
-                                    <a href="http://metron.apache.org/" id="bannerLeft">
-                                                                                                <img src="../../images/metron-logo.png"  alt="Apache Metron" width="148px" height="48px"/>
-                </a>
-                      </div>
-        <div class="pull-right">  </div>
+    <script type="text/javascript" src="../../js/apache-maven-fluido-1.7.min.js"></script>
+<script type="text/javascript">
+              $( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );
+            </script>
+  </head>
+  <body class="topBarDisabled">
+    <div class="container-fluid">
+      <div id="banner">
+        <div class="pull-left"><a href="http://metron.apache.org/" id="bannerLeft"><img src="../../images/metron-logo.png"  alt="Apache Metron" width="148px" height="48px"/></a></div>
+        <div class="pull-right"></div>
         <div class="clear"><hr/></div>
       </div>
 
       <div id="breadcrumbs">
         <ul class="breadcrumb">
-                
-                    
-                              <li class="">
-                    <a href="http://www.apache.org" class="externalLink" title="Apache">
-        Apache</a>
-        </li>
-      <li class="divider ">/</li>
-            <li class="">
-                    <a href="http://metron.apache.org/" class="externalLink" title="Metron">
-        Metron</a>
-        </li>
-      <li class="divider ">/</li>
-            <li class="">
-                    <a href="../../index.html" title="Documentation">
-        Documentation</a>
-        </li>
-      <li class="divider ">/</li>
-        <li class="">Elasticsearch in Metron</li>
-        
-                
-                    
-                  <li id="publishDate" class="pull-right">Last Published: 2018-01-03</li> <li class="divider pull-right">|</li>
-              <li id="projectVersion" class="pull-right">Version: 0.4.2</li>
-            
-                            </ul>
+      <li class=""><a href="http://www.apache.org" class="externalLink" title="Apache">Apache</a><span class="divider">/</span></li>
+      <li class=""><a href="http://metron.apache.org/" class="externalLink" title="Metron">Metron</a><span class="divider">/</span></li>
+      <li class=""><a href="../../index.html" title="Documentation">Documentation</a><span class="divider">/</span></li>
+    <li class="active ">Elasticsearch in Metron</li>
+        <li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-06-07</li>
+          <li id="projectVersion" class="pull-right">Version: 0.5.0</li>
+        </ul>
       </div>
-
-            
       <div class="row-fluid">
-        <div id="leftColumn" class="span3">
+        <div id="leftColumn" class="span2">
           <div class="well sidebar-nav">
-                
-                    
-                <ul class="nav nav-list">
-                    <li class="nav-header">User Documentation</li>

                                                                          
-      <li>
-    
-                          <a href="../../index.html" title="Metron">
-          <i class="icon-chevron-down"></i>
-        Metron</a>
-                    <ul class="nav nav-list">
-                      
-      <li>
-    
-                          <a href="../../Upgrading.html" title="Upgrading">
-          <i class="none"></i>
-        Upgrading</a>
-            </li>
-                                                                                                                                                      
-      <li>
-    
-                          <a href="../../metron-analytics/index.html" title="Analytics">
-          <i class="icon-chevron-right"></i>
-        Analytics</a>
-                  </li>
-                      
-      <li>
-    
-                          <a href="../../metron-contrib/metron-docker/index.html" title="Docker">
-          <i class="none"></i>
-        Docker</a>
-            </li>
-                                                                                                                                                                                                                                                                                                                                                                                                            
-      <li>
-    
-                          <a href="../../metron-deployment/index.html" title="Deployment">
-          <i class="icon-chevron-right"></i>
-        Deployment</a>
-                  </li>
-                      
-      <li>
-    
-                          <a href="../../metron-interface/metron-alerts/index.html" title="Alerts">
-          <i class="none"></i>
-        Alerts</a>
-            </li>
-                      
-      <li>
-    
-                          <a href="../../metron-interface/metron-config/index.html" title="Config">
-          <i class="none"></i>
-        Config</a>
-            </li>
-                      
-      <li>
-    
-                          <a href="../../metron-interface/metron-rest/index.html" title="Rest">
-          <i class="none"></i>
-        Rest</a>
-            </li>
-                                                                                                                                                                                                                                                                                              
-      <li>
-    
-                          <a href="../../metron-platform/index.html" title="Platform">
-          <i class="icon-chevron-down"></i>
-        Platform</a>
-                    <ul class="nav nav-list">
-                      
-      <li>
-    
-                          <a href="../../metron-platform/Performance-tuning-guide.html" title="Performance-tuning-guide">
-          <i class="none"></i>
-        Performance-tuning-guide</a>
-            </li>
-                      
-      <li>
-    
-                          <a href="../../metron-platform/metron-api/index.html" title="Api">
-          <i class="none"></i>
-        Api</a>
-            </li>
-                      
-      <li>
-    
-                          <a href="../../metron-platform/metron-common/index.html" title="Common">
-          <i class="none"></i>
-        Common</a>
-            </li>
-                      
-      <li>
-    
-                          <a href="../../metron-platform/metron-data-management/index.html" title="Data-management">
-          <i class="none"></i>
-        Data-management</a>
-            </li>
-                      
-      <li class="active">
-    
-            <a href="#"><i class="none"></i>Elasticsearch</a>
-          </li>
-                      
-      <li>
-    
-                          <a href="../../metron-platform/metron-enrichment/index.html" title="Enrichment">
-          <i class="none"></i>
-        Enrichment</a>
-            </li>
-                      
-      <li>
-    
-                          <a href="../../metron-platform/metron-indexing/index.html" title="Indexing">
-          <i class="none"></i>
-        Indexing</a>
-            </li>
-                      
-      <li>
-    
-                          <a href="../../metron-platform/metron-management/index.html" title="Management">
-          <i class="none"></i>
-        Management</a>
-            </li>
-                                                                        
-      <li>
-    
-                          <a href="../../metron-platform/metron-parsers/index.html" title="Parsers">
-          <i class="icon-chevron-right"></i>
-        Parsers</a>
-                  </li>
-                      
-      <li>
-    
-                          <a href="../../metron-platform/metron-pcap-backend/index.html" title="Pcap-backend">
-          <i class="none"></i>
-        Pcap-backend</a>
-            </li>
-                      
-      <li>
-    
-                          <a href="../../metron-platform/metron-writer/index.html" title="Writer">
-          <i class="none"></i>
-        Writer</a>
-            </li>
-              </ul>
-        </li>
-                                                                                          
-      <li>
-    
-                          <a href="../../metron-sensors/index.html" title="Sensors">
-          <i class="icon-chevron-right"></i>
-        Sensors</a>
-                  </li>
-                      
-      <li>
-    
-                          <a href="../../metron-stellar/stellar-3rd-party-example/index.html" title="Stellar-3rd-party-example">
-          <i class="none"></i>
-        Stellar-3rd-party-example</a>
-            </li>
-                                                                        
-      <li>
-    
-                          <a href="../../metron-stellar/stellar-common/index.html" title="Stellar-common">
-          <i class="icon-chevron-right"></i>
-        Stellar-common</a>
-                  </li>
-                                                                                          
-      <li>
-    
-                          <a href="../../use-cases/index.html" title="Use-cases">
-          <i class="icon-chevron-right"></i>
-        Use-cases</a>
-                  </li>
-              </ul>
-        </li>
-            </ul>
-                
-                    
-                
-          <hr class="divider" />
-
-           <div id="poweredBy">
-                            <div class="clear"></div>
-                            <div class="clear"></div>
-                            <div class="clear"></div>
-                             <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
-        <img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" />
-      </a>
-                  </div>
+    <ul class="nav nav-list">
+      <li class="nav-header">User Documentation</li>
+    <li><a href="../../index.html" title="Metron"><span class="icon-chevron-down"></span>Metron</a>
+    <ul class="nav nav-list">
+    <li><a href="../../CONTRIBUTING.html" title="CONTRIBUTING"><span class="none"></span>CONTRIBUTING</a></li>
+    <li><a href="../../Upgrading.html" title="Upgrading"><span class="none"></span>Upgrading</a></li>
+    <li><a href="../../metron-analytics/index.html" title="Analytics"><span class="icon-chevron-right"></span>Analytics</a></li>
+    <li><a href="../../metron-contrib/metron-docker/index.html" title="Docker"><span class="none"></span>Docker</a></li>
+    <li><a href="../../metron-contrib/metron-performance/index.html" title="Performance"><span class="none"></span>Performance</a></li>
+    <li><a href="../../metron-deployment/index.html" title="Deployment"><span class="icon-chevron-right"></span>Deployment</a></li>
+    <li><a href="../../metron-interface/metron-alerts/index.html" title="Alerts"><span class="none"></span>Alerts</a></li>
+    <li><a href="../../metron-interface/metron-config/index.html" title="Config"><span class="none"></span>Config</a></li>
+    <li><a href="../../metron-interface/metron-rest/index.html" title="Rest"><span class="none"></span>Rest</a></li>
+    <li><a href="../../metron-platform/index.html" title="Platform"><span class="icon-chevron-down"></span>Platform</a>
+    <ul class="nav nav-list">
+    <li><a href="../../metron-platform/Performance-tuning-guide.html" title="Performance-tuning-guide"><span class="none"></span>Performance-tuning-guide</a></li>
+    <li><a href="../../metron-platform/metron-api/index.html" title="Api"><span class="none"></span>Api</a></li>
+    <li><a href="../../metron-platform/metron-common/index.html" title="Common"><span class="none"></span>Common</a></li>
+    <li><a href="../../metron-platform/metron-data-management/index.html" title="Data-management"><span class="none"></span>Data-management</a></li>
+    <li class="active"><a href="#"><span class="none"></span>Elasticsearch</a></li>
+    <li><a href="../../metron-platform/metron-enrichment/index.html" title="Enrichment"><span class="icon-chevron-right"></span>Enrichment</a></li>
+    <li><a href="../../metron-platform/metron-indexing/index.html" title="Indexing"><span class="none"></span>Indexing</a></li>
+    <li><a href="../../metron-platform/metron-management/index.html" title="Management"><span class="none"></span>Management</a></li>
+    <li><a href="../../metron-platform/metron-parsers/index.html" title="Parsers"><span class="icon-chevron-right"></span>Parsers</a></li>
+    <li><a href="../../metron-platform/metron-pcap-backend/index.html" title="Pcap-backend"><span class="none"></span>Pcap-backend</a></li>
+    <li><a href="../../metron-platform/metron-writer/index.html" title="Writer"><span class="none"></span>Writer</a></li>
+    </ul>
+</li>
+    <li><a href="../../metron-sensors/index.html" title="Sensors"><span class="icon-chevron-right"></span>Sensors</a></li>
+    <li><a href="../../metron-stellar/stellar-3rd-party-example/index.html" title="Stellar-3rd-party-example"><span class="none"></span>Stellar-3rd-party-example</a></li>
+    <li><a href="../../metron-stellar/stellar-common/index.html" title="Stellar-common"><span class="icon-chevron-right"></span>Stellar-common</a></li>
+    <li><a href="../../metron-stellar/stellar-zeppelin/index.html" title="Stellar-zeppelin"><span class="none"></span>Stellar-zeppelin</a></li>
+    <li><a href="../../use-cases/index.html" title="Use-cases"><span class="icon-chevron-right"></span>Use-cases</a></li>
+    </ul>
+</li>
+</ul>
+          <hr />
+          <div id="poweredBy">
+            <div class="clear"></div>
+            <div class="clear"></div>
+            <div class="clear"></div>
+            <div class="clear"></div>
+<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"><img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" /></a>
+            </div>
           </div>
         </div>
-        
-                
-        <div id="bodyColumn"  class="span9" >
-                                  
-            <h1>Elasticsearch in Metron</h1>
+        <div id="bodyColumn"  class="span10" >
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<h1>Elasticsearch in Metron</h1>
 <p><a name="Elasticsearch_in_Metron"></a></p>
 <div class="section">
+<h2><a name="Table_of_Contents"></a>Table of Contents</h2>
+<ul>
+
+<li><a href="#Introduction">Introduction</a></li>
+<li><a href="#Properties">Properties</a></li>
+<li><a href="#Upgrading_to_5.6.2">Upgrading to 5.6.2</a></li>
+<li><a href="#Type_Mappings">Type Mappings</a></li>
+<li><a href="#Using_Metron_with_Elasticsearch_5.6.2">Using Metron with Elasticsearch 5.6.2</a></li>
+<li><a href="#Installing_Elasticsearch_Templates">Installing Elasticsearch Templates</a></li>
+</ul></div>
+<div class="section">
 <h2><a name="Introduction"></a>Introduction</h2>
 <p>Elasticsearch can be used as the real-time portion of the datastore resulting from <a href="../metron-indexing/index.html">metron-indexing</a>.</p></div>
 <div class="section">
 <h2><a name="Properties"></a>Properties</h2>
 <div class="section">
 <h3><a name="es.clustername"></a><tt>es.clustername</tt></h3>
-<p>The name of the elasticsearch Cluster. See <a class="externalLink" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#cluster.name">here</a></p></div>
+<p>The name of the elasticsearch Cluster.  See <a class="externalLink" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#cluster.name">here</a></p></div>
 <div class="section">
 <h3><a name="es.ip"></a><tt>es.ip</tt></h3>
 <p>Specifies the nodes in the elasticsearch cluster to use for writing. The format is one of the following:</p>
-
 <ul>
-  
+
 <li>A hostname or IP address with a port (e.g. <tt>hostname1:1234</tt>), in which case <tt>es.port</tt> is ignored.</li>
-  
 <li>A hostname or IP address without a port (e.g. <tt>hostname1</tt>), in which case <tt>es.port</tt> is used.</li>
-  
-<li>A string containing a CSV of hostnames without ports (e.g. <tt>hostname1,hostname2,hostname3</tt>) without spaces between. <tt>es.port</tt> is assumed to be the port for each host.</li>
-  
-<li>A string containing a CSV of hostnames with ports (e.g. <tt>hostname1:1234,hostname2:1234,hostname3:1234</tt>) without spaces between. <tt>es.port</tt> is ignored.</li>
-  
-<li>A list of hostnames with ports (e.g. <tt>[ &quot;hostname1:1234&quot;, &quot;hostname2:1234&quot;]</tt>). Note, <tt>es.port</tt> is NOT used in this construction.</li>
+<li>A string containing a CSV of hostnames without ports (e.g. <tt>hostname1,hostname2,hostname3</tt>) without spaces between.  <tt>es.port</tt> is assumed to be the port for each host.</li>
+<li>A string containing a CSV of hostnames with ports (e.g. <tt>hostname1:1234,hostname2:1234,hostname3:1234</tt>) without spaces between.  <tt>es.port</tt> is ignored.</li>
+<li>A list of hostnames with ports (e.g. <tt>[ &quot;hostname1:1234&quot;, &quot;hostname2:1234&quot;]</tt>).  Note, <tt>es.port</tt> is NOT used in this construction.</li>
 </ul></div>
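<p>The accepted <tt>es.ip</tt> forms listed above can be illustrated with a minimal Python sketch. The helper name <tt>resolve_hosts</tt> is hypothetical and this is not Metron's own parsing code; it only mirrors the rules as described: an entry with a port keeps it, and an entry without one falls back to <tt>es.port</tt>. It is assumed that hostnames contain no colons, so IPv6 literals are not handled.</p>

```python
from typing import List, Tuple, Union


def resolve_hosts(es_ip: Union[str, List[str]], es_port: int) -> List[Tuple[str, int]]:
    """Hypothetical helper: expand an es.ip value into (host, port) pairs."""
    # A CSV string and a list are handled alike once split into entries.
    entries = es_ip.split(",") if isinstance(es_ip, str) else list(es_ip)
    hosts = []
    for entry in entries:
        entry = entry.strip()
        if ":" in entry:
            # An explicit port wins; es.port is ignored for this entry.
            host, port = entry.rsplit(":", 1)
            hosts.append((host, int(port)))
        else:
            # No port given, so es.port applies.
            hosts.append((entry, es_port))
    return hosts
```

<p>For example, <tt>resolve_hosts("hostname1,hostname2", 9300)</tt> pairs both hosts with port 9300, while <tt>resolve_hosts("hostname1:1234", 9300)</tt> keeps port 1234.</p>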
 <div class="section">
 <h3><a name="es.port"></a><tt>es.port</tt></h3>
-<p>The port for the elasticsearch hosts. This will be used in accordance with the discussion of <tt>es.ip</tt>.</p></div>
+<p>The port for the elasticsearch hosts.  This will be used in accordance with the discussion of <tt>es.ip</tt>.</p></div>
 <div class="section">
 <h3><a name="es.date.format"></a><tt>es.date.format</tt></h3>
-<p>The date format to use when constructing the indices. For every message, the date format will be applied to the current time and that will become the last part of the index name where the message is written to.</p>
+<p>The date format to use when constructing the indices.  For every message, the date format will be applied to the current time and that will become the last part of the index name where the message is written to.</p>
 <p>For instance, an <tt>es.date.format</tt> of <tt>yyyy.MM.dd.HH</tt> would have the consequence that the indices would roll hourly, whereas an <tt>es.date.format</tt> of <tt>yyyy.MM.dd</tt> would have the consequence that the indices would roll daily.</p></div></div>
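<p>The rollover behaviour can be seen in a small Python sketch. Two things here are assumptions rather than facts from this page: the translation of the two example patterns into <tt>strftime</tt> directives, and the <tt>&lt;sensor&gt;_index_&lt;date&gt;</tt> naming, which is inferred from the <tt>${SENSOR}_index*</tt> wildcard used later in this document. The function <tt>index_name</tt> is hypothetical.</p>

```python
from datetime import datetime, timezone

# Assumed strftime equivalents of the two example es.date.format patterns.
PATTERNS = {
    "yyyy.MM.dd.HH": "%Y.%m.%d.%H",  # a new suffix every hour
    "yyyy.MM.dd": "%Y.%m.%d",        # a new suffix every day
}


def index_name(sensor: str, es_date_format: str, now: datetime) -> str:
    """Hypothetical helper: build an index name whose last part is the formatted time."""
    suffix = now.strftime(PATTERNS[es_date_format])
    # The "_index_" separator is an assumption for illustration.
    return f"{sensor}_index_{suffix}"


now = datetime(2018, 6, 7, 21, 53, tzinfo=timezone.utc)
hourly = index_name("bro", "yyyy.MM.dd.HH", now)
daily = index_name("bro", "yyyy.MM.dd", now)
```

<p>Two messages written an hour apart get different names under the hourly pattern but the same name under the daily one, which is what makes the indices roll hourly or daily.</p>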
 <div class="section">
-<h2><a name="Using_Metron_with_Elasticsearch_2.x"></a>Using Metron with Elasticsearch 2.x</h2>
-<p>With Elasticsearch 2.x, there is a requirement that all sensors templates have a nested alert field defined. This field is a dummy field, and will be obsolete in Elasticsearch 5.x. See <a class="externalLink" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html#_ignoring_unmapped_fields">Ignoring Unmapped Fields</a> for more information</p>
+<h2><a name="Upgrading_to_5.6.2"></a>Upgrading to 5.6.2</h2>
+<p>Users should be prepared to re-index when migrating from Elasticsearch 2.3.3 to 5.6.2. There are a number of template changes, most notably around string type handling, that may cause issues when upgrading.</p>
+<p><a class="externalLink" href="https://www.elastic.co/guide/en/elasticsearch/reference/5.6/setup-upgrade.html">https://www.elastic.co/guide/en/elasticsearch/reference/5.6/setup-upgrade.html</a></p>
+<p>Be aware that if you add a new string value and want to be able to filter and search on this value from the Alerts UI, you <b>must</b> add a mapping for that type to the appropriate Elasticsearch template. Below is more detail on how to choose the appropriate mapping type for your string value.</p></div>
+<div class="section">
+<h2><a name="Type_Mappings"></a>Type Mappings</h2>
+<p>Type mappings have changed quite a bit from ES 2.x -&gt; 5.x. Here is a brief rundown of the biggest changes. More detailed references from Elasticsearch are provided in the <a href="#Type_Mapping_References">Type Mapping References</a> section below.</p>
+<ul>
+
+<li>string fields replaced by text/keyword type</li>
+<li>strings have new default mappings as follows
+
+<div>
+<div>
+<pre class="source">{
+  &quot;type&quot;: &quot;text&quot;,
+  &quot;fields&quot;: {
+    &quot;keyword&quot;: {
+      &quot;type&quot;: &quot;keyword&quot;,
+      &quot;ignore_above&quot;: 256
+    }
+  }
+}
+</pre></div></div>
+</li>
+<li>
+
+<p>There is no longer a <tt>_timestamp</tt> field that you can set &#x201c;enabled&#x201d; on. This field now causes an exception on templates. Replace with an application-created timestamp of &#x201c;date&#x201d; type.</p>
+</li>
+</ul>
+<p>The semantics for string types have changed. In 2.x, you have the concept of index settings as either &#x201c;analyzed&#x201d; or &#x201c;not_analyzed&#x201d; which basically means &#x201c;full text&#x201d; and &#x201c;keyword&#x201d;, respectively. Analyzed text basically means the indexer will split the text using a text analyzer thus allowing you to search on substrings within the original text. &#x201c;New York&#x201d; is split and indexed as two buckets, &#x201c;New&#x201d; and &#x201c;York&#x201d;, so you can search or query for aggregate counts for those terms independently and will match against the individual terms &#x201c;New&#x201d; or &#x201c;York.&#x201d; &#x201c;Keyword&#x201d; means that the original text will not be split/analyzed during indexing and instead treated as a whole unit, i.e. &#x201c;New&#x201d; or &#x201c;York&#x201d; will not match in searches against the document containing &#x201c;New York&#x201d;, but searching on &#x201c;New York&#x201d; as the full city name will. In 5.x language instead of using the &#x201c;index&#x201d; setting, you now set the &#x201c;type&#x201d; to either &#x201c;text&#x201d; for full text, or &#x201c;keyword&#x201d; for keywords.</p>
+<p>Below is a table depicting the changes to how String types are now handled.</p>
+
+<table border="0" class="table table-striped">
+
+<tr class="a">
+	
+<th>sort, aggregate, or access values</th>
+	
+<th>ES 2.x</th>
+	
+<th>ES 5.x</th>
+	
+<th>Example</th>
+</tr>
+
+<tr class="b">
+	
+<td>no</td>
+	
+<td>
+
+<div>
+<pre><tt>&quot;my_property&quot; : {
+  &quot;type&quot;: &quot;string&quot;,
+  &quot;index&quot;: &quot;analyzed&quot;
+}
+</tt></pre></div>
+	</td>
+	
+<td>
+
+<div>
+<pre><tt>&quot;my_property&quot; : {
+  &quot;type&quot;: &quot;text&quot;
+}
+</tt></pre></div>
+    Additional defaults: &quot;index&quot;: &quot;true&quot;, &quot;fielddata&quot;: &quot;false&quot;
+	</td>
+	
+<td>
+		&quot;New York&quot; handled via in-mem search as &quot;New&quot; and &quot;York&quot; buckets. <b>No</b> aggregation or sort.
+	</td>
+</tr>
+
+<tr class="a">
+	
+<td>
+	yes
+	</td>
+	
+<td>
+
+<div>
+<pre><tt>&quot;my_property&quot;: {
+  &quot;type&quot;: &quot;string&quot;,
+  &quot;index&quot;: &quot;analyzed&quot;
+}
+</tt></pre></div>
+	</td>
+	
+<td>
+
+<div>
+<pre><tt>&quot;my_property&quot;: {
+  &quot;type&quot;: &quot;text&quot;,
+  &quot;fielddata&quot;: &quot;true&quot;
+}
+</tt></pre></div>
+	</td>
+	
+<td>
+	&quot;New York&quot; handled via in-mem search as &quot;New&quot; and &quot;York&quot; buckets. <b>Can</b> aggregate and sort.
+	</td>
+</tr>
+
+<tr class="b">
+	
+<td>
+	yes
+	</td>
+	
+<td>
+
+<div>
+<pre><tt>&quot;my_property&quot;: {
+  &quot;type&quot;: &quot;string&quot;,
+  &quot;index&quot;: &quot;not_analyzed&quot;
+}
+</tt></pre></div>
+	</td>
+	
+<td>
+
+<div>
+<pre><tt>&quot;my_property&quot; : {
+  &quot;type&quot;: &quot;keyword&quot;
+}
+</tt></pre></div>
+	</td>
+	
+<td>
+	&quot;New York&quot; searchable as single value. <b>Can</b> aggregate and sort. A search for &quot;New&quot; or &quot;York&quot; will not match against the whole value.
+	</td>
+</tr>
+
+<tr class="a">
+	
+<td>
+	yes
+	</td>
+	
+<td>
+
+<div>
+<pre><tt>&quot;my_property&quot;: {
+  &quot;type&quot;: &quot;string&quot;,
+  &quot;index&quot;: &quot;analyzed&quot;
+}
+</tt></pre></div>
+	</td>
+	
+<td>
+
+<div>
+<pre><tt>&quot;my_property&quot;: {
+  &quot;type&quot;: &quot;text&quot;,
+  &quot;fields&quot;: {
+    &quot;keyword&quot;: {
+      &quot;type&quot;: &quot;keyword&quot;,
+      &quot;ignore_above&quot;: 256
+    }
+  }
+}
+</tt></pre></div>
+	</td>
+	
+<td>
+	&quot;New York&quot; searchable as single value or as text document, can aggregate and sort on the sub term &quot;keyword.&quot;
+	</td>
+</tr>
+</table>
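<p>The first three rows of the table can be read as a translation rule from a 2.x string mapping to a 5.x one. The following Python sketch encodes that rule under stated assumptions: the function name <tt>migrate_string_mapping</tt> and the <tt>needs_fielddata</tt> flag are hypothetical, and the multi-field variant in the last row is left out for brevity. It is an illustration, not a migration tool shipped with Metron or Elasticsearch.</p>

```python
def migrate_string_mapping(mapping: dict, needs_fielddata: bool = False) -> dict:
    """Hypothetical helper: translate one ES 2.x field mapping to its 5.x form."""
    if mapping.get("type") != "string":
        # Non-string fields are returned as an unchanged copy.
        return dict(mapping)
    if mapping.get("index") == "not_analyzed":
        # Whole-value matching; sorting and aggregation work out of the box.
        return {"type": "keyword"}
    # Analyzed strings become full-text fields.
    migrated = {"type": "text"}
    if needs_fielddata:
        # Enables sort/aggregate on text at the cost of heap space.
        migrated["fielddata"] = True
    return migrated
```

<p>Passing <tt>{"type": "string", "index": "not_analyzed"}</tt> yields <tt>{"type": "keyword"}</tt>, matching the third row of the table.</p>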
+
+<p>If you want to set default string behavior for all strings for a given index and type, you can do so with a mapping similar to the following (replace ${your_type_here} accordingly):</p>
+
+<div>
+<div>
+<pre class="source"># curl -XPUT 'http://${ES_HOST}:${ES_PORT}/_template/default_string_template' -d '
+{
+  &quot;template&quot;: &quot;*&quot;,
+  &quot;mappings&quot; : {
+    &quot;${your_type_here}&quot;: {
+      &quot;dynamic_templates&quot;: [
+        {
+          &quot;strings&quot;: {
+            &quot;match_mapping_type&quot;: &quot;string&quot;,
+            &quot;mapping&quot;: {
+              &quot;type&quot;: &quot;text&quot;
+            }
+          }
+        }
+      ]
+    }
+  }
+}
+'
+</pre></div></div>
+
+<p>By specifying the &#x201c;template&#x201d; property with value &#x201c;*&#x201d; the template will apply to all indexes that have documents indexed of the specified type (${your_type_here}). This results in the following template.</p>
+
+<div>
+<div>
+<pre class="source"># curl -XGET 'http://${ES_HOST}:${ES_PORT}/_template/default_string_template?pretty'
+{
+  &quot;default_string_template&quot; : {
+    &quot;order&quot; : 0,
+    &quot;template&quot; : &quot;*&quot;,
+    &quot;settings&quot; : { },
+    &quot;mappings&quot; : {
+      &quot;${your_type_here}&quot; : {
+        &quot;dynamic_templates&quot; : [
+          {
+            &quot;strings&quot; : {
+              &quot;match_mapping_type&quot; : &quot;string&quot;,
+              &quot;mapping&quot; : {
+                &quot;type&quot; : &quot;text&quot;
+              }
+            }
+          }
+        ]
+      }
+    },
+    &quot;aliases&quot; : { }
+  }
+}
+</pre></div></div>
+
+<p>Notes on other settings for types in ES</p>
+<ul>
+
+<li>doc_values
+<ul>
+
+<li>on-disk data structure</li>
+<li>provides access for sorting, aggregation, and field values</li>
+<li>stores same values as _source, but in column-oriented fashion better for sorting and aggregating</li>
+<li>not supported on text fields</li>
+<li>enabled by default</li>
+</ul>
+</li>
+<li>fielddata
+<ul>
+
+<li>in-memory data structure</li>
+<li>provides access for sorting, aggregation, and field values</li>
+<li>primarily for text fields</li>
+<li>disabled by default because the heap space required can be large</li>
+</ul>
+</li>
+</ul>
+<div class="section">
+<div class="section">
+<div class="section">
+<h5><a name="Type_Mapping_References"></a>Type Mapping References</h5>
+<ul>
+
+<li><a class="externalLink" href="https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping.html">https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping.html</a></li>
+<li><a class="externalLink" href="https://www.elastic.co/guide/en/elasticsearch/reference/5.6/breaking_50_mapping_changes.html">https://www.elastic.co/guide/en/elasticsearch/reference/5.6/breaking_50_mapping_changes.html</a></li>
+<li><a class="externalLink" href="https://www.elastic.co/blog/strings-are-dead-long-live-strings">https://www.elastic.co/blog/strings-are-dead-long-live-strings</a></li>
+</ul></div></div></div></div>
+<div class="section">
+<h2><a name="Using_Metron_with_Elasticsearch_5.6.2"></a>Using Metron with Elasticsearch 5.6.2</h2>
+<p>There is a requirement that all sensors templates have a nested alert field defined.  This field is a dummy field.  See <a class="externalLink" href="https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html#_ignoring_unmapped_fields">Ignoring Unmapped Fields</a> for more information</p>
 <p>Without this field, an error will be thrown during ALL searches (including from UIs, resulting in no alerts being found for any sensor). This error will be found in the REST service&#x2019;s logs.</p>
 <p>Exception seen:</p>
 
-<div class="source">
-<div class="source">
-<pre>QueryParsingException[[nested] failed to find nested object under path [alert]];
+<div>
+<div>
+<pre class="source">QueryParsingException[[nested] failed to find nested object under path [alert]];
 </pre></div></div>
-<p>There are two steps to resolve this issue. First is to update the Elasticsearch template for each sensor, so any new indices have the field. This requires retrieving the template, removing an extraneous JSON field so we can put it back later, and adding our new field.</p>
-<p>Make sure to set the ELASTICSEARCH variable appropriately. $SENSOR can contain wildcards, so if rollover has occurred, it&#x2019;s not necessary to do each index individually. The example here appends <tt>index*</tt> to get all indexes for a the provided sensor.</p>
 
-<div class="source">
-<div class="source">
-<pre>export ELASTICSEARCH=&quot;node1&quot;
+<p>There are two steps to resolve this issue.  First is to update the Elasticsearch template for each sensor, so any new indices have the field. This requires retrieving the template, removing an extraneous JSON field so we can put it back later, and adding our new field.</p>
+<p>Make sure to set the ELASTICSEARCH variable appropriately. $SENSOR can contain wildcards, so if rollover has occurred, it&#x2019;s not necessary to do each index individually. The example here appends <tt>index*</tt> to get all indexes for the provided sensor.</p>
+
+<div>
+<div>
+<pre class="source">export ELASTICSEARCH=&quot;node1&quot;
 export SENSOR=&quot;bro&quot;
 curl -XGET &quot;http://${ELASTICSEARCH}:9200/_template/${SENSOR}_index*?pretty=true&quot; -o &quot;${SENSOR}.template&quot;
 sed -i '' '2d;$d' ./${SENSOR}.template
 sed -i '' '/&quot;properties&quot; : {/ a\
 &quot;alert&quot;: { &quot;type&quot;: &quot;nested&quot;},' ${SENSOR}.template
 </pre></div></div>
+
 <p>To manually verify this, you can optionally pretty print it again with:</p>
 
-<div class="source">
-<div class="source">
-<pre>python -m json.tool bro.template
+<div>
+<div>
+<pre class="source">python -m json.tool bro.template
 </pre></div></div>
+
 <p>We&#x2019;ll want to put the template back into Elasticsearch:</p>
 
-<div class="source">
-<div class="source">
-<pre>curl -XPUT &quot;http://${ELASTICSEARCH}:9200/_template/${SENSOR}_index&quot; -d @${SENSOR}.template
+<div>
+<div>
+<pre class="source">curl -XPUT &quot;http://${ELASTICSEARCH}:9200/_template/${SENSOR}_index&quot; -d @${SENSOR}.template
 </pre></div></div>
-<p>To update existing indexes, update Elasticsearch mappings with the new field for each sensor. </p>
 
-<div class="source">
-<div class="source">
-<pre>curl -XPUT &quot;http://${ELASTICSEARCH}:9200/${SENSOR}_index*/_mapping/${SENSOR}_doc&quot; -d '
+<p>To update existing indexes, update Elasticsearch mappings with the new field for each sensor.</p>
+
+<div>
+<div>
+<pre class="source">curl -XPUT &quot;http://${ELASTICSEARCH}:9200/${SENSOR}_index*/_mapping/${SENSOR}_doc&quot; -d '
 {
-        &quot;properties&quot; : {
-          &quot;alert&quot; : {
-            &quot;type&quot; : &quot;nested&quot;
-          }
-        }
+  &quot;properties&quot; : {
+    &quot;alert&quot; : {
+      &quot;type&quot; : &quot;nested&quot;
+    }
+  }
 }
 '
 rm ${SENSOR}.template
-</pre></div></div></div>
+</pre></div></div>
+</div>
 <div class="section">
 <h2><a name="Installing_Elasticsearch_Templates"></a>Installing Elasticsearch Templates</h2>
 <p>The stock set of Elasticsearch templates for bro, snort, yaf, error index and meta index are installed automatically during the first time install and startup of Metron Indexing service.</p>
-<p>It is possible that Elasticsearch service is not available when the Metron Indexing Service startup, in that case the Elasticsearch template will not be installed. </p>
+<p>It is possible that the Elasticsearch service is not available when the Metron Indexing service starts up; in that case, the Elasticsearch templates will not be installed.</p>
 <p>For such a scenario, an Admin can have the template installed in two ways:</p>
 <p><i>Method 1</i> - Manually from the Ambari UI by following the flow: Ambari UI -&gt; Services -&gt; Metron -&gt; Service Actions -&gt; Elasticsearch Template Install</p>
 <p><i>Method 2</i> - Stop the Metron Indexing service, and start it again from Ambari UI. Note that the Metron Indexing service tracks if it has successfully installed the Elasticsearch templates, and will attempt to do so each time it is Started until successful.</p>
-
 <blockquote>
+
 <p>Note: If you have made any customization to your index templates, then installing Elasticsearch templates afresh will lead to overwriting your existing changes. Please exercise caution.</p>
 </blockquote></div>
-                  </div>
-            </div>
-          </div>
-
+        </div>
+      </div>
+    </div>
     <hr/>
-
     <footer>
-            <div class="container-fluid">
-              <div class="row span12">Copyright &copy;                    2018
-                        <a href="https://www.apache.org">The Apache Software Foundation</a>.
-            All Rights Reserved.      
-                    
+      <div class="container-fluid">
+        <div class="row-fluid">
+© 2015-2016 The Apache Software Foundation. Apache Metron, Metron, Apache, the Apache feather logo,
+            and the Apache Metron project logo are trademarks of The Apache Software Foundation.
+        </div>
       </div>
-
-                          
-        
-                </div>
     </footer>
   </body>
 </html>

http://git-wip-us.apache.org/repos/asf/metron/blob/ae1d3eb9/site/current-book/metron-platform/metron-enrichment/Performance.html
----------------------------------------------------------------------
diff --git a/site/current-book/metron-platform/metron-enrichment/Performance.html b/site/current-book/metron-platform/metron-enrichment/Performance.html
new file mode 100644
index 0000000..136d939
--- /dev/null
+++ b/site/current-book/metron-platform/metron-enrichment/Performance.html
@@ -0,0 +1,802 @@
+<!DOCTYPE html>
+<!--
+ | Generated by Apache Maven Doxia Site Renderer 1.8 from src/site/markdown/metron-platform/metron-enrichment/Performance.md at 2018-06-07
+ | Rendered using Apache Maven Fluido Skin 1.7
+-->
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <meta name="Date-Revision-yyyymmdd" content="20180607" />
+    <meta http-equiv="Content-Language" content="en" />
+    <title>Metron &#x2013; Enrichment Performance</title>
+    <link rel="stylesheet" href="../../css/apache-maven-fluido-1.7.min.css" />
+    <link rel="stylesheet" href="../../css/site.css" />
+    <link rel="stylesheet" href="../../css/print.css" media="print" />
+    <script type="text/javascript" src="../../js/apache-maven-fluido-1.7.min.js"></script>
+<script type="text/javascript">
+              $( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );
+            </script>
+  </head>
+  <body class="topBarDisabled">
+    <div class="container-fluid">
+      <div id="banner">
+        <div class="pull-left"><a href="http://metron.apache.org/" id="bannerLeft"><img src="../../images/metron-logo.png"  alt="Apache Metron" width="148px" height="48px"/></a></div>
+        <div class="pull-right"></div>
+        <div class="clear"><hr/></div>
+      </div>
+
+      <div id="breadcrumbs">
+        <ul class="breadcrumb">
+      <li class=""><a href="http://www.apache.org" class="externalLink" title="Apache">Apache</a><span class="divider">/</span></li>
+      <li class=""><a href="http://metron.apache.org/" class="externalLink" title="Metron">Metron</a><span class="divider">/</span></li>
+      <li class=""><a href="../../index.html" title="Documentation">Documentation</a><span class="divider">/</span></li>
+    <li class="active ">Enrichment Performance</li>
+        <li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-06-07</li>
+          <li id="projectVersion" class="pull-right">Version: 0.5.0</li>
+        </ul>
+      </div>
+      <div class="row-fluid">
+        <div id="leftColumn" class="span2">
+          <div class="well sidebar-nav">
+    <ul class="nav nav-list">
+      <li class="nav-header">User Documentation</li>
+    <li><a href="../../index.html" title="Metron"><span class="icon-chevron-down"></span>Metron</a>
+    <ul class="nav nav-list">
+    <li><a href="../../CONTRIBUTING.html" title="CONTRIBUTING"><span class="none"></span>CONTRIBUTING</a></li>
+    <li><a href="../../Upgrading.html" title="Upgrading"><span class="none"></span>Upgrading</a></li>
+    <li><a href="../../metron-analytics/index.html" title="Analytics"><span class="icon-chevron-right"></span>Analytics</a></li>
+    <li><a href="../../metron-contrib/metron-docker/index.html" title="Docker"><span class="none"></span>Docker</a></li>
+    <li><a href="../../metron-contrib/metron-performance/index.html" title="Performance"><span class="none"></span>Performance</a></li>
+    <li><a href="../../metron-deployment/index.html" title="Deployment"><span class="icon-chevron-right"></span>Deployment</a></li>
+    <li><a href="../../metron-interface/metron-alerts/index.html" title="Alerts"><span class="none"></span>Alerts</a></li>
+    <li><a href="../../metron-interface/metron-config/index.html" title="Config"><span class="none"></span>Config</a></li>
+    <li><a href="../../metron-interface/metron-rest/index.html" title="Rest"><span class="none"></span>Rest</a></li>
+    <li><a href="../../metron-platform/index.html" title="Platform"><span class="icon-chevron-down"></span>Platform</a>
+    <ul class="nav nav-list">
+    <li><a href="../../metron-platform/Performance-tuning-guide.html" title="Performance-tuning-guide"><span class="none"></span>Performance-tuning-guide</a></li>
+    <li><a href="../../metron-platform/metron-api/index.html" title="Api"><span class="none"></span>Api</a></li>
+    <li><a href="../../metron-platform/metron-common/index.html" title="Common"><span class="none"></span>Common</a></li>
+    <li><a href="../../metron-platform/metron-data-management/index.html" title="Data-management"><span class="none"></span>Data-management</a></li>
+    <li><a href="../../metron-platform/metron-elasticsearch/index.html" title="Elasticsearch"><span class="none"></span>Elasticsearch</a></li>
+    <li><a href="../../metron-platform/metron-enrichment/index.html" title="Enrichment"><span class="icon-chevron-down"></span>Enrichment</a>
+    <ul class="nav nav-list">
+    <li class="active"><a href="#"><span class="none"></span>Performance</a></li>
+    </ul>
+</li>
+    <li><a href="../../metron-platform/metron-indexing/index.html" title="Indexing"><span class="none"></span>Indexing</a></li>
+    <li><a href="../../metron-platform/metron-management/index.html" title="Management"><span class="none"></span>Management</a></li>
+    <li><a href="../../metron-platform/metron-parsers/index.html" title="Parsers"><span class="icon-chevron-right"></span>Parsers</a></li>
+    <li><a href="../../metron-platform/metron-pcap-backend/index.html" title="Pcap-backend"><span class="none"></span>Pcap-backend</a></li>
+    <li><a href="../../metron-platform/metron-writer/index.html" title="Writer"><span class="none"></span>Writer</a></li>
+    </ul>
+</li>
+    <li><a href="../../metron-sensors/index.html" title="Sensors"><span class="icon-chevron-right"></span>Sensors</a></li>
+    <li><a href="../../metron-stellar/stellar-3rd-party-example/index.html" title="Stellar-3rd-party-example"><span class="none"></span>Stellar-3rd-party-example</a></li>
+    <li><a href="../../metron-stellar/stellar-common/index.html" title="Stellar-common"><span class="icon-chevron-right"></span>Stellar-common</a></li>
+    <li><a href="../../metron-stellar/stellar-zeppelin/index.html" title="Stellar-zeppelin"><span class="none"></span>Stellar-zeppelin</a></li>
+    <li><a href="../../use-cases/index.html" title="Use-cases"><span class="icon-chevron-right"></span>Use-cases</a></li>
+    </ul>
+</li>
+</ul>
+          <hr />
+          <div id="poweredBy">
+            <div class="clear"></div>
+            <div class="clear"></div>
+            <div class="clear"></div>
+            <div class="clear"></div>
+<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"><img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" /></a>
+            </div>
+          </div>
+        </div>
+        <div id="bodyColumn"  class="span10" >
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+<h1>Enrichment Performance</h1>
+<p><a name="Enrichment_Performance"></a></p>
+<p>This guide defines a set of benchmarks used to measure the performance of the Enrichment topology.  The guide also provides detailed steps on how to execute those benchmarks along with advice for tuning the Unified Enrichment topology.</p>
+<ul>
+
+<li><a href="#Benchmarks">Benchmarks</a></li>
+<li><a href="#Benchmark_Execution">Benchmark Execution</a></li>
+<li><a href="#Performance_Tuning">Performance Tuning</a></li>
+<li><a href="#Benchmark_Results">Benchmark Results</a></li>
+</ul>
+<div class="section">
+<h2><a name="Benchmarks"></a>Benchmarks</h2>
+<p>The following section describes a set of enrichments that will be used to benchmark the performance of the Enrichment topology.</p>
+<ul>
+
+<li><a href="#Geo_IP_Enrichment">Geo IP Enrichment</a></li>
+<li><a href="#HBase_Enrichment">HBase Enrichment</a></li>
+<li><a href="#Stellar_Enrichment">Stellar Enrichment</a></li>
+</ul>
+<div class="section">
+<h3><a name="Geo_IP_Enrichment"></a>Geo IP Enrichment</h3>
+<p>This benchmark measures the performance of executing a Geo IP enrichment.  Given a valid IP address, the enrichment will append detailed location information for that IP.  The location information is sourced from an external Geo IP data source like <a class="externalLink" href="https://github.com/maxmind/GeoIP2-java">MaxMind</a>.</p>
+<div class="section">
+<h4><a name="Configuration"></a>Configuration</h4>
+<p>Adding the following Stellar expression to the Enrichment topology configuration will define a Geo IP enrichment.</p>
+
+<div>
+<div>
+<pre class="source">geo := GEO_GET(ip_dst_addr)
+</pre></div></div>
+
+<p>After the enrichment process completes, the telemetry message will contain a set of fields with location information for the given IP address.</p>
+
+<div>
+<div>
+<pre class="source">{
+   &quot;ip_dst_addr&quot;:&quot;151.101.129.140&quot;,
+   ...
+   &quot;geo.city&quot;:&quot;San Francisco&quot;,
+   &quot;geo.country&quot;:&quot;US&quot;,
+   &quot;geo.dmaCode&quot;:&quot;807&quot;,
+   &quot;geo.latitude&quot;:&quot;37.7697&quot;,
+   &quot;geo.location_point&quot;:&quot;37.7697,-122.3933&quot;,
+   &quot;geo.locID&quot;:&quot;5391959&quot;,
+   &quot;geo.longitude&quot;:&quot;-122.3933&quot;,
+   &quot;geo.postalCode&quot;:&quot;94107&quot;
+}
+</pre></div></div>
+</div></div>
+<div class="section">
+<h3><a name="HBase_Enrichment"></a>HBase Enrichment</h3>
+<p>This benchmark measures the performance of executing an enrichment that retrieves data from an external HBase table. This type of enrichment is useful for enriching telemetry from an Asset Database or other source of relatively static data.</p>
+<div class="section">
+<h4><a name="Configuration"></a>Configuration</h4>
+<p>Adding the following Stellar expression to the Enrichment topology configuration will define an HBase enrichment.  This looks up the &#x2018;ip_dst_addr&#x2019; within an HBase table &#x2018;top-1m&#x2019; and returns a hostname.</p>
+
+<div>
+<div>
+<pre class="source">top1m := ENRICHMENT_GET('top-1m', ip_dst_addr, 'top-1m', 't')
+</pre></div></div>
+
+<p>After the telemetry has been enriched, it will contain the host and IP elements that were retrieved from the HBase table.</p>
+
+<div>
+<div>
+<pre class="source">{
+	&quot;ip_dst_addr&quot;:&quot;151.101.2.166&quot;,
+	...
+	&quot;top1m.host&quot;:&quot;earther.com&quot;,
+	&quot;top1m.ip&quot;:&quot;151.101.2.166&quot;
+}
+</pre></div></div>
+</div></div>
+<div class="section">
+<h3><a name="Stellar_Enrichment"></a>Stellar Enrichment</h3>
+<p>This benchmark measures the performance of executing a basic Stellar expression.  In this benchmark, the enrichment is purely a computational task that has no dependence on an external system like a database.</p>
+<div class="section">
+<h4><a name="Configuration"></a>Configuration</h4>
+<p>Adding the following Stellar expression to the Enrichment topology configuration will define a basic Stellar enrichment.  The following returns true if the IP is in the given subnet and false otherwise.</p>
+
+<div>
+<div>
+<pre class="source">local := IN_SUBNET(ip_dst_addr, '192.168.0.0/24')
+</pre></div></div>
+
+<p>After the telemetry has been enriched, it will contain a field with a boolean value indicating whether the IP was within the given subnet.</p>
+
+<div>
+<div>
+<pre class="source">{
+	&quot;ip_dst_addr&quot;:&quot;151.101.2.166&quot;,
+	...
+	&quot;local&quot;:false
+}
+</pre></div></div>
+</div></div></div>
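+<p>The subnet test behind the Stellar benchmark can be approximated outside of Metron.  The following is a minimal Python sketch, not Stellar's implementation; the function name <tt>in_subnet</tt> is hypothetical and only mirrors the true/false behaviour described above, using the address from the sample message.</p>

```python
import ipaddress


def in_subnet(ip, cidr):
    """Return True when ip falls inside the CIDR block (a stand-in for the IN_SUBNET check)."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


# The destination address from the sample message lies outside 192.168.0.0/24.
print(in_subnet("151.101.2.166", "192.168.0.0/24"))
# An address inside the block, for contrast.
print(in_subnet("192.168.0.17", "192.168.0.0/24"))
```

+<p>Because the check is pure computation with no external lookup, it is a reasonable baseline to compare against the Geo IP and HBase benchmarks.</p>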
+<div class="section">
+<h2><a name="Benchmark_Execution"></a>Benchmark Execution</h2>
+<p>This section describes the steps necessary to execute the performance benchmarks for the Enrichment topology.</p>
+<ul>
+
+<li><a href="#Prepare_Enrichment_Data">Prepare Enrichment Data</a></li>
+<li><a href="#Load_HBase_with_Enrichment_Data">Load HBase with Enrichment Data</a></li>
+<li><a href="#Configure_the_Enrichments">Configure the Enrichments</a></li>
+<li><a href="#Create_Input_Telemetry">Create Input Telemetry</a></li>
+<li><a href="#Cluster_Setup">Cluster Setup</a></li>
+<li><a href="#Monitoring">Monitoring</a></li>
+</ul>
+<div class="section">
+<h3><a name="Prepare_Enrichment_Data"></a>Prepare Enrichment Data</h3>
+<p>The Alexa Top 1 Million was used as a data source for these benchmarks.</p>
+<ol style="list-style-type: decimal">
+
+<li>
+
+<p>Download the <a class="externalLink" href="http://s3.amazonaws.com/alexa-static/top-1m.csv.zip">Alexa Top 1 Million</a> or another similar data set with a variety of valid hostnames.</p>
+</li>
+<li>
+
+<p>For each hostname, query DNS to retrieve an associated IP address.</p>
+<p>A script like the following can be used for this.  There is no need to do this for all 1 million entries in the data set. Doing this for around 10,000 records is sufficient.</p>
+
+<div>
+<div>
+<pre class="source">import csv
+import dns.resolver
+#
+# resolve against public DNS servers rather than the local resolver
+resolver = dns.resolver.Resolver()
+resolver.nameservers = ['8.8.8.8', '8.8.4.4']
+#
+with open('top-1m.csv', 'r') as infile:
+  with open('top-1m-with-ip.csv', 'w') as outfile:
+    #
+    reader = csv.reader(infile, delimiter=',')
+    writer = csv.writer(outfile, delimiter=',')
+    for row in reader:
+      # each input row is rank,hostname
+      host = row[1]
+      try:
+        # newer dnspython releases provide resolver.resolve() for this call
+        response = resolver.query(host, &quot;A&quot;)
+        # write one output row per A record returned
+        for record in response:
+          ip = record
+          writer.writerow([host, ip])
+          print(&quot;host={}, ip={}&quot;.format(host, ip))
+        #
+      except Exception:
+        # skip hostnames that fail to resolve
+        pass
+</pre></div></div>
+</li>
+<li>
+
+<p>The resulting data set contains an IP to hostname mapping.</p>
+
+<div>
+<div>
+<pre class="source">$ head top-1m-with-ip.csv
+google.com,172.217.9.46
+youtube.com,172.217.4.78
+facebook.com,157.240.18.35
+baidu.com,220.181.57.216
+baidu.com,111.13.101.208
+baidu.com,123.125.114.144
+wikipedia.org,208.80.154.224
+yahoo.com,98.139.180.180
+yahoo.com,206.190.39.42
+reddit.com,151.101.1.140
+</pre></div></div>
+</li>
+</ol></div>
+<div class="section">
+<h3><a name="Load_HBase_with_Enrichment_Data"></a>Load HBase with Enrichment Data</h3>
+<ol style="list-style-type: decimal">
+
+<li>
+
+<p>Create an HBase table for this data.</p>
+<p>Ensure that the table is evenly distributed across the HBase nodes.  This can be done by pre-splitting the table or splitting the data after loading it.</p>
+
+<div>
+<div>
+<pre class="source">create 'top-1m', 't', {SPLITS =&gt; ['2','4','6','8','a','c','e']}
+</pre></div></div>
+</li>
+<li>
+
+<p>Create a configuration file called <tt>extractor.json</tt>.  This defines how the data will be loaded into the HBase table.</p>
+
+<div>
+<div>
+<pre class="source">&gt; cat extractor.json
+{
+    &quot;config&quot;: {
+        &quot;columns&quot;: {
+            &quot;host&quot; : 0,
+            &quot;ip&quot;: 1
+        },
+        &quot;indicator_column&quot;: &quot;ip&quot;,
+        &quot;type&quot;: &quot;top-1m&quot;,
+        &quot;separator&quot;: &quot;,&quot;
+    },
+    &quot;extractor&quot;: &quot;CSV&quot;
+}
+</pre></div></div>
+</li>
+<li>
+
+<p>Use the <tt>flatfile_loader.sh</tt> to load the data into the HBase table.</p>
+
+<div>
+<div>
+<pre class="source">$METRON_HOME/bin/flatfile_loader.sh \
+	-e extractor.json \
+	-t top-1m \
+	-c t \
+	-i top-1m-with-ip.csv
+</pre></div></div>
+</li>
+</ol></div>
+<div class="section">
+<h3><a name="Configure_the_Enrichments"></a>Configure the Enrichments</h3>
+<ol style="list-style-type: decimal">
+
+<li>Define the Enrichments using the REPL.
+
+<div>
+<div>
+<pre class="source">&gt; $METRON_HOME/bin/stellar -z $ZOOKEEPER
+Stellar, Go!
+[Stellar]&gt;&gt;&gt; conf
+{
+  &quot;enrichment&quot;: {
+    &quot;fieldMap&quot;: {
+     &quot;stellar&quot; : {
+       &quot;config&quot; : {
+         &quot;geo&quot; : &quot;GEO_GET(ip_dst_addr)&quot;,
+         &quot;top1m&quot; : &quot;ENRICHMENT_GET('top-1m', ip_dst_addr, 'top-1m', 't')&quot;,
+         &quot;local&quot; : &quot;IN_SUBNET(ip_dst_addr, '192.168.0.0/24')&quot;
+       }
+     }
+    },
+    &quot;fieldToTypeMap&quot;: {
+    }
+  },
+  &quot;threatIntel&quot;: {
+  }
+}
+[Stellar]&gt;&gt;&gt; CONFIG_PUT(&quot;ENRICHMENT&quot;, conf, &quot;asa&quot;)
+</pre></div></div>
+</li>
+</ol></div>
+<div class="section">
+<h3><a name="Create_Input_Telemetry"></a>Create Input Telemetry</h3>
+<ol style="list-style-type: decimal">
+
+<li>
+
+<p>Create a template file that defines what your input telemetry will look like.</p>
+
+<div>
+<div>
+<pre class="source">&gt; cat asa.template
+{&quot;ciscotag&quot;: &quot;ASA-1-123123&quot;, &quot;source.type&quot;: &quot;asa&quot;, &quot;ip_dst_addr&quot;: &quot;$DST_ADDR&quot;, &quot;original_string&quot;: &quot;&lt;134&gt;Feb 22 17:04:43 AHOSTNAME %ASA-1-123123: Built inbound ICMP connection for faddr 192.168.11.8/50244 gaddr 192.168.1.236/0 laddr 192.168.1.1/161&quot;, &quot;ip_src_addr&quot;: &quot;192.168.1.35&quot;, &quot;syslog_facility&quot;: &quot;local1&quot;, &quot;action&quot;: &quot;built&quot;, &quot;syslog_host&quot;: &quot;AHOSTNAME&quot;, &quot;timestamp&quot;: &quot;$METRON_TS&quot;, &quot;protocol&quot;: &quot;icmp&quot;, &quot;guid&quot;: &quot;$METRON_GUID&quot;, &quot;syslog_severity&quot;: &quot;info&quot;}
+</pre></div></div>
+</li>
+<li>
+
+<p>Use the template file along with the enrichment data to create input telemetry with varying IP addresses.</p>
+
+<div>
+<div>
+<pre class="source">for i in $(head top-1m-with-ip.csv | awk -F, '{print $2}');do
+	cat asa.template | sed &quot;s/\$DST_ADDR/$i/&quot;;
+done &gt; asa.input.template
+</pre></div></div>
+</li>
+<li>
+
+<p>Use the <tt>load_tool.sh</tt> script to push messages onto the input topic <tt>enrichments</tt> and monitor the output topic <tt>indexing</tt>.  See more information in the Performance <a href="metron-contrib/metron-performance/index.html">README.md</a>.</p>
+<p>If the topology is keeping up, the events per second produced on the input topic should roughly match the events per second on the output topic.</p>
+</li>
+</ol></div>
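+<p>The substitution step above can also be sketched in Python.  This is an illustrative equivalent of the shell loop, under the assumption that the CSV has the <tt>host,ip</tt> layout shown earlier; the function name <tt>render_telemetry</tt> and the shortened template are hypothetical.  Like the shell loop, it replaces only <tt>$DST_ADDR</tt> and leaves the other placeholders untouched.</p>

```python
import csv
import io


def render_telemetry(template, csv_text):
    """Emit one copy of the template per CSV row, replacing $DST_ADDR with the row's IP."""
    rows = csv.reader(io.StringIO(csv_text))
    # Column 1 holds the IP, matching the host,ip layout of top-1m-with-ip.csv.
    return [template.replace("$DST_ADDR", row[1]) for row in rows if len(row) == 2]


# Shortened, hypothetical template; the real asa.template carries more fields.
template = '{"source.type": "asa", "ip_dst_addr": "$DST_ADDR", "timestamp": "$METRON_TS"}'
sample = "google.com,172.217.9.46\nyoutube.com,172.217.4.78\n"

for line in render_telemetry(template, sample):
    print(line)
```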
+<div class="section">
+<h3><a name="Cluster_Setup"></a>Cluster Setup</h3>
+<div class="section">
+<h4><a name="Isolation"></a>Isolation</h4>
+<p>The Enrichment topology depends on an environment with at least two and often three components that work together: Storm, Kafka, and HBase.  When any two of these run on the same node, it can be difficult to identify which of them is becoming a bottleneck.  This can cause poor and highly volatile performance as each steals resources from the other.</p>
+<p>It is highly recommended that each of these systems be fully isolated from the others.  Storm should be run on nodes that are completely isolated from Kafka and HBase.</p></div></div>
+<div class="section">
+<h3><a name="Monitoring"></a>Monitoring</h3>
+<ol style="list-style-type: decimal">
+
+<li>
+
+<p>The <tt>load_tool.sh</tt> script will report the throughput for the input and output topics.</p>
+<ul>
+
+<li>
+
+<p>The input throughput should roughly match the output throughput if the topology is able to handle a given load.</p>
+</li>
+<li>
+
+<p>Not only are the raw throughput numbers important, but also the consistency of what is reported over time.  If the reported throughput is sporadic, then further tuning may be required.</p>
+</li>
+</ul>
+</li>
+<li>
+
+<p>The Storm UI is an important source of information.  The bolt capacity, complete latency, and any reported errors are all important to monitor.</p>
+</li>
+<li>
+
+<p>The load reported by the OS is also an important metric to monitor.</p>
+<ul>
+
+<li>
+
+<p>The load metric should be monitored to ensure that each node is being pushed sufficiently, but not too much.</p>
+</li>
+<li>
+
+<p>The load should be evenly distributed across each node.  If the load is uneven, this may indicate a problem.</p>
+</li>
+</ul>
+<p>A simple script like the following is sufficient for the task.</p>
+
+<div>
+<div>
+<pre class="source">for host in $(cat cluster.txt); do
+  echo $host;
+  ssh root@$host 'uptime';
+done
+</pre></div></div>
+</li>
+<li>
+
+<p>Monitoring the Kafka offset lags indicates how far behind a consumer may be.  This can be very useful to determine if the topology is keeping up.</p>
+
+<div>
+<div>
+<pre class="source">${KAFKA_HOME}/bin/kafka-consumer-groups.sh \
+    --command-config=/tmp/consumergroup.config \
+    --describe \
+    --group enrichments \
+    --bootstrap-server $BROKERLIST \
+    --new-consumer
+</pre></div></div>
+</li>
+<li>
+
+<p>A tool like <a class="externalLink" href="https://github.com/yahoo/kafka-manager">Kafka Manager</a> is also very useful for monitoring the input and output topics during test execution.</p>
+</li>
+</ol></div></div>
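+<p>The advice that load should be evenly distributed can be checked mechanically once the <tt>uptime</tt> output has been collected.  The following is a rough sketch only: the function names, the host names, the sample lines, and the 50% tolerance are all assumptions made for illustration, not values taken from the benchmarks.</p>

```python
import re
from statistics import mean


def one_minute_load(uptime_line):
    """Extract the 1-minute load average from a line of `uptime` output."""
    match = re.search(r"load averages?:\s*([0-9.]+)", uptime_line)
    if match is None:
        raise ValueError("no load average found in: " + uptime_line)
    return float(match.group(1))


def uneven_hosts(loads, tolerance=0.5):
    """Return hosts whose load differs from the cluster mean by more than tolerance * mean."""
    avg = mean(loads.values())
    return sorted(h for h, load in loads.items() if abs(load - avg) > tolerance * avg)


# Made-up uptime lines for three hypothetical hosts.
samples = {
    "node1": " 10:02:11 up 12 days,  3:04,  1 user,  load average: 4.10, 3.90, 3.80",
    "node2": " 10:02:12 up 12 days,  3:04,  1 user,  load average: 4.30, 4.00, 3.70",
    "node3": " 10:02:13 up 12 days,  3:04,  1 user,  load average: 12.60, 11.20, 9.80",
}
loads = {host: one_minute_load(line) for host, line in samples.items()}
print(uneven_hosts(loads))
```

+<p>A host flagged this way is only a starting point for investigation; the appropriate tolerance depends on the cluster.</p>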
+<div class="section">
+<h2><a name="Performance_Tuning"></a>Performance Tuning</h2>
+<p>The approach to tuning the topology will look something like the following.  More detailed tuning information is available next to each named parameter.</p>
+<ul>
+
+<li>
+
+<p>Start the tuning process with a single worker.  After tuning the bolts within a single worker, scale out with additional worker processes.</p>
+</li>
+<li>
+
+<p>Initially set the thread pool size to 1.  Increase this value slowly only after tuning the other parameters first.  Consider that each worker has its own thread pool and the total size of this thread pool should be far less than the total number of cores available in the cluster.</p>
+</li>
+<li>
+
+<p>Initially set each bolt parallelism hint to the number of partitions on the input Kafka topic.  Monitor bolt capacity and increase the parallelism hint for any bolt whose capacity is close to or exceeds 1.</p>
+</li>
+<li>
+
+<p>If the topology is not able to keep up with a given input, then increasing the parallelism is the primary means to scale up.</p>
+</li>
+<li>
+
+<p>Parallelism units can be used for determining how to distribute processing tasks across the topology.  The sum of parallelism can be close to, but should not far exceed this value.</p>
+<p>(number of worker nodes in cluster * number of cores per worker node) - (number of acker tasks)</p>
+</li>
+<li>
+
+<p>The throughput that the topology is able to sustain should be relatively consistent.  If the throughput fluctuates greatly, increase back pressure using <a href="#topology.max.spout.pending"><tt>topology.max.spout.pending</tt></a>.</p>
+</li>
+</ul>
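+<p>The parallelism units formula above is simple arithmetic, and a small sketch can make the comparison against the configured parallelism explicit.  The function name, the cluster size, and the per-component values below are hypothetical and chosen only to show the calculation; the property names are the ones listed under Parameters.</p>

```python
def parallelism_budget(worker_nodes, cores_per_node, acker_tasks):
    """Parallelism units: (worker nodes * cores per worker node) - acker tasks."""
    return worker_nodes * cores_per_node - acker_tasks


# Hypothetical parallelism settings for a hypothetical 4-node, 16-core cluster.
hints = {
    "kafka.spout.parallelism": 8,
    "enrichment.join.parallelism": 24,
    "threat.intel.join.parallelism": 24,
    "kafka.writer.parallelism": 8,
}

budget = parallelism_budget(worker_nodes=4, cores_per_node=16, acker_tasks=4)
total = sum(hints.values())
print("budget={}, total={}, headroom={}".format(budget, total, budget - total))
```

+<p>A small negative headroom, as in this example, is consistent with the guidance that the sum can be close to the budget; a large negative value suggests the parallelism is set too high for the cluster.</p>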
+<div class="section">
+<h3><a name="Parameters"></a>Parameters</h3>
+<p>The following parameters are useful for tuning the &#x201c;Unified&#x201d; Enrichment topology.</p>
+<p>WARNING: Some of the parameter names have been reused from the &#x201c;Split/Join&#x201d; topology so the name may not be appropriate. This will be corrected in the future.</p>
+<ul>
+
+<li><a href="#enrichment.workers"><tt>enrichment.workers</tt></a></li>
+<li><a href="#enrichment.acker.executors"><tt>enrichment.acker.executors</tt></a></li>
+<li><a href="#topology.worker.childopts"><tt>topology.worker.childopts</tt></a></li>
+<li><a href="#topology.max.spout.pending"><tt>topology.max.spout.pending</tt></a></li>
+<li><a href="#kafka.spout.parallelism"><tt>kafka.spout.parallelism</tt></a></li>
+<li><a href="#enrichment.join.parallelism"><tt>enrichment.join.parallelism</tt></a></li>
+<li><a href="#threat.intel.join.parallelism"><tt>threat.intel.join.parallelism</tt></a></li>
+<li><a href="#kafka.writer.parallelism"><tt>kafka.writer.parallelism</tt></a></li>
+<li><a href="#enrichment.join.cache.size"><tt>enrichment.join.cache.size</tt></a></li>
+<li><a href="#threat.intel.join.cache.size"><tt>threat.intel.join.cache.size</tt></a></li>
+<li><a href="#metron.threadpool.size"><tt>metron.threadpool.size</tt></a></li>
+<li><a href="#metron.threadpool.type"><tt>metron.threadpool.type</tt></a></li>
+</ul>
+<div class="section">
+<h4><a name="enrichment.workers"></a><tt>enrichment.workers</tt></h4>
+<p>The number of worker processes for the enrichment topology.</p>
+<ul>
+
+<li>
+
+<p>Start by tuning only a single worker.  Maximize throughput for that worker, then increase the number of workers.</p>
+</li>
+<li>
+
+<p>The throughput should scale relatively linearly as workers are added.  This reaches a limit when the workers running on a single node saturate the resources available.  When this happens, adding workers on additional nodes should allow further scaling.</p>
+</li>
+<li>
+
+<p>Increase parallelism before attempting to increase the number of workers.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="enrichment.acker.executors"></a><tt>enrichment.acker.executors</tt></h4>
+<p>The number of ackers within the topology.</p>
+<ul>
+
+<li>
+
+<p>This should most often be equal to the number of workers defined in <tt>enrichment.workers</tt>.</p>
+</li>
+<li>
+
+<p>Within the Storm UI, click the &#x201c;Show System Stats&#x201d; button.  This will display a bolt named <tt>__acker</tt>.  If the capacity of this bolt is too high, then increase the number of ackers.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="topology.worker.childopts"></a><tt>topology.worker.childopts</tt></h4>
+<p>This parameter accepts arguments that will be passed to the JVM created for each Storm worker.  This allows for control over the heap size, garbage collection, and any other JVM-specific parameter.</p>
+<ul>
+
+<li>
+
+<p>Start with a 2G heap and increase as needed.  Running with 8G was found to be beneficial, but will vary depending on caching needs.</p>
+<p><tt>-Xms8g -Xmx8g</tt></p>
+</li>
+<li>
+
+<p>The Garbage First Garbage Collector (G1GC) is recommended along with a cap on the amount of time spent in garbage collection.  This is intended to help address small object allocation issues due to our extensive use of caches.</p>
+<p><tt>-XX:+UseG1GC -XX:MaxGCPauseMillis=100</tt></p>
+</li>
+<li>
+
+<p>If the caches in use are very large (as defined by either <a href="#enrichment.join.cache.size"><tt>enrichment.join.cache.size</tt></a> or <a href="#threat.intel.join.cache.size"><tt>threat.intel.join.cache.size</tt></a>) and performance is poor, turning on garbage collection logging might be helpful.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="topology.max.spout.pending"></a><tt>topology.max.spout.pending</tt></h4>
+<p>This limits the number of unacked tuples that the spout can introduce into the topology.</p>
+<ul>
+
+<li>
+
+<p>Decreasing this value will increase back pressure and allow the topology to consume messages at a pace that is maintainable.</p>
+</li>
+<li>
+
+<p>If the spout throws &#x2018;Commit Failed Exceptions&#x2019; then the topology is not keeping up.  Decreasing this value is one way to ensure that messages can be processed before they time out.</p>
+</li>
+<li>
+
+<p>If the topology&#x2019;s throughput is unsteady and inconsistent, decrease this value.  This should help the topology consume messages at a maintainable pace.</p>
+</li>
+<li>
+
+<p>If the bolt capacity is low, the topology can handle additional load.  Increase this value so that more tuples are introduced into the topology, which should increase the bolt capacity.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="kafka.spout.parallelism"></a><tt>kafka.spout.parallelism</tt></h4>
+<p>The parallelism of the Kafka spout within the topology.  Defines the maximum number of executors for each worker dedicated to running the spout.</p>
+<ul>
+
+<li>
+
+<p>The spout parallelism should most often be set to the number of partitions of the input Kafka topic.</p>
+</li>
+<li>
+
+<p>If the enrichment bolt capacity is low, increasing the parallelism of the spout can introduce additional load on the topology.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="enrichment.join.parallelism"></a><tt>enrichment.join.parallelism</tt></h4>
+<p>The parallelism hint for the enrichment bolt.  Defines the maximum number of executors within each worker dedicated to running the enrichment bolt.</p>
+<p>WARNING: The property name does not match its current usage in the Unified topology.  This property name may change in the near future as it has been reused from the Split-Join topology.</p>
+<ul>
+
+<li>
+
+<p>If the capacity of the enrichment bolt is high, increasing the parallelism will introduce additional executors to bring the bolt capacity down.</p>
+</li>
+<li>
+
+<p>If the throughput of the topology is too low, increase this value.  This allows additional tuples to be enriched in parallel.</p>
+</li>
+<li>
+
+<p>Increasing parallelism on the enrichment bolt will at some point put pressure on the downstream threat intel and output bolts.  As this value is increased, monitor the capacity of the downstream bolts to ensure that they do not become a bottleneck.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="threat.intel.join.parallelism"></a><tt>threat.intel.join.parallelism</tt></h4>
+<p>The parallelism hint for the threat intel bolt.  Defines the maximum number of executors within each worker dedicated to running the threat intel bolt.</p>
+<p>WARNING: The property name does not match its current usage in the Unified topology.  This property name may change in the near future as it has been reused from the Split-Join topology.</p>
+<ul>
+
+<li>
+
+<p>If the capacity of the threat intel bolt is high, increasing the parallelism will introduce additional executors to bring the bolt capacity down.</p>
+</li>
+<li>
+
+<p>If the throughput of the topology is too low, increase this value.  This allows additional tuples to be enriched in parallel.</p>
+</li>
+<li>
+
+<p>Increasing parallelism on this bolt will at some point put pressure on the downstream output bolt.  As this value is increased, monitor the capacity of the output bolt to ensure that it does not become a bottleneck.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="kafka.writer.parallelism"></a><tt>kafka.writer.parallelism</tt></h4>
+<p>The parallelism hint for the output bolt which writes to the output Kafka topic.  Defines the maximum number of executors within each worker dedicated to running the output bolt.</p>
+<ul>
+
+<li>If the capacity of the output bolt is high, increasing the parallelism will introduce additional executors to bring the bolt capacity down.</li>
+</ul></div>
+<div class="section">
+<h4><a name="enrichment.join.cache.size"></a><tt>enrichment.join.cache.size</tt></h4>
+<p>The Enrichment bolt maintains a cache so that if the same enrichment occurs repetitively, the value can be retrieved from the cache instead of it being recomputed.</p>
+<p>There is a great deal of repetition in network telemetry, which leads to a great deal of repetition for the enrichments that operate on that telemetry.  Having a highly performant cache is one of the most critical factors driving performance.</p>
+<p>WARNING: The property name does not match its current usage in the Unified topology.  This property name may change in the near future as it has been reused from the Split-Join topology.</p>
+<ul>
+
+<li>
+
+<p>Increase the size of the cache to improve the rate of cache hits.</p>
+</li>
+<li>
+
+<p>Increasing the size of the cache may require that you increase the worker heap size using <tt>topology.worker.childopts</tt>.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="threat.intel.join.cache.size"></a><tt>threat.intel.join.cache.size</tt></h4>
+<p>The Threat Intel bolt maintains a cache so that if the same enrichment occurs repetitively, the value can be retrieved from the cache instead of it being recomputed.</p>
+<p>There is a great deal of repetition in network telemetry, which leads to a great deal of repetition for the enrichments that operate on that telemetry.  Having a highly performant cache is one of the most critical factors driving performance.</p>
+<p>WARNING: The property name does not match its current usage in the Unified topology.  This property name may change in the near future as it has been reused from the Split-Join topology.</p>
+<ul>
+
+<li>
+
+<p>Increase the size of the cache to improve the rate of cache hits.</p>
+</li>
+<li>
+
+<p>Increasing the size of the cache may require that you increase the worker heap size using <tt>topology.worker.childopts</tt>.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="metron.threadpool.size"></a><tt>metron.threadpool.size</tt></h4>
+<p>This value defines the number of threads maintained within a pool to execute each enrichment.  This value can either be a fixed number or it can be a multiple of the number of cores (5C = 5 times the number of cores).</p>
+<p>The enrichment bolt maintains a static thread pool that is used to execute each enrichment.  This thread pool is shared by all of the executors running within the same worker.</p>
+<p>WARNING: This value must be manually defined within the flux file at <tt>$METRON_HOME/flux/enrichment/remote-unified.yaml</tt>.  This value cannot be altered within Ambari at this time.</p>
+<ul>
+
+<li>
+
+<p>Start with a thread pool size of 1.  Adjust this value after tuning all other parameters first.  Only increase this value if testing shows performance improvements in your environment given your workload.</p>
+</li>
+<li>
+
+<p>If the thread pool size is too large, the work is shuffled among multiple CPU cores, which significantly decreases performance.  Using a smaller thread pool helps pin work to a single core.</p>
+</li>
+<li>
+
+<p>If the thread pool size is too small, IO-intensive workloads can suffer.  Increasing the thread pool size helps IO-intensive workloads that have a significant cache miss rate.  A thread pool size of 3-5 can help in these cases.</p>
+</li>
+<li>
+
+<p>Most workloads will make significant use of the cache and so 1-2 threads will most likely be optimal.</p>
+</li>
+<li>
+
+<p>The bolt uses a static thread pool.  To scale out, but keep the work mostly pinned to a CPU core, add more Storm workers while keeping the thread pool size low.</p>
+</li>
+<li>
+
+<p>If a larger thread pool increases load on the system, but decreases the throughput, then it is likely that the system is thrashing.  In this case the thread pool size should be decreased.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="metron.threadpool.type"></a><tt>metron.threadpool.type</tt></h4>
+<p>The enrichment bolt maintains a static thread pool that is used to execute each enrichment.  This thread pool is shared by all of the executors running within the same worker.</p>
+<p>Defines the type of thread pool used.  This value can be either &#x201c;FIXED&#x201d; or &#x201c;WORK_STEALING&#x201d;.</p>
+<p>Currently, this value must be manually defined within the flux file at <tt>$METRON_HOME/flux/enrichment/remote-unified.yaml</tt>.  This value cannot be altered within Ambari.</p></div></div>
+<div class="section">
+<h3><a name="Benchmark_Results"></a>Benchmark Results</h3>
+<p>This section describes one execution of these benchmarks to help provide an understanding of what reasonably tuned parameters might look like.</p>
+<p>These parameters and the throughput reported are highly dependent on the workload and resources available. The throughput is what was achievable given a reasonable amount of tuning on a small, dedicated cluster.  The throughput is largely dependent on the enrichments performed and the distribution of data within the incoming telemetry.</p>
+<p>The Enrichment topology has been shown to scale relatively linearly.  Adding more resources allows for more complex enrichments, across more diverse data sets, at higher volumes.  The throughput that one might see in production largely depends on how much hardware can be committed to the task.</p>
+<div class="section">
+<h4><a name="Environment"></a>Environment</h4>
+<ul>
+
+<li>
+
+<p>Apache Metron 0.5.0 (pre-release), March 2018</p>
+<ul>
+
+<li>This included <a class="externalLink" href="https://github.com/apache/metron/pull/947">a patch to the underlying caching mechanism</a> that greatly improves performance.</li>
+</ul>
+</li>
+<li>
+
+<p>Cisco UCS nodes</p>
+<ul>
+
+<li>32 core, 64-bit CPU (Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz)</li>
+<li>256 GB RAM</li>
+<li>x2 10G NIC bonded</li>
+<li>x4 6TB 7200 RPM disks</li>
+</ul>
+</li>
+<li>
+
+<p>Storm Supervisors are isolated and running on a dedicated set of 3 nodes.</p>
+</li>
+<li>
+
+<p>Kafka Brokers are isolated and running on a separate, dedicated set of 3 nodes.</p>
+</li>
+</ul></div>
+<div class="section">
+<h4><a name="Results"></a>Results</h4>
+<ul>
+
+<li>
+
+<p>These benchmarks executed all 3 enrichments simultaneously: the <a href="#Geo_IP_Enrichment">Geo IP Enrichment</a>, the <a href="#Stellar_Enrichment">Stellar Enrichment</a>, and the <a href="#HBase_Enrichment">HBase Enrichment</a>.</p>
+</li>
+<li>
+
+<p>The data used to drive the benchmark includes 10,000 unique IP addresses.  The telemetry was populated with IP addresses such that 10% of these IPs were chosen 80% of the time.  This bias was designed to mimic the typical distribution seen in real-world telemetry.</p>
+</li>
+<li>
+
+<p>The Unified Enrichment topology was able to sustain 308,000 events per second on a small, dedicated 3 node cluster.</p>
+</li>
+<li>
+
+<p>The values used to achieve these results with the Unified Enrichment topology follow.  You should not attempt to use these parameters in your topology directly.  They are specific to the environment and workload and should only be used as a guideline.</p>
+
+<div>
+<div>
+<pre class="source">enrichment.workers=9
+enrichment.acker.executors=9
+enrichment.join.cache.size=100000
+threat.intel.join.cache.size=100000
+kafka.spout.parallelism=27
+enrichment.join.parallelism=54
+threat.intel.join.parallelism=9
+kafka.writer.parallelism=27
+topology.worker.childopts=-XX:+UseG1GC -Xms8g -Xmx8g -XX:MaxGCPauseMillis=100
+topology.max.spout.pending=3000
+metron.threadpool.size=1
+metron.threadpool.type=FIXED
+</pre></div></div>
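+<p>As a back-of-the-envelope reading of the values above, and assuming Storm spreads workers and executors evenly (the default scheduler usually does, but this is not guaranteed), the per-worker figures work out roughly as follows.  These numbers are derived here from the reported values and were not separately measured.</p>
+
+<div>
+<div>
+<pre class="source">workers per Storm node          =  9 / 3       = 3
+spout executors per worker      = 27 / 9       = 3
+enrichment executors per worker = 54 / 9       = 6
+threat intel executors/worker   =  9 / 9       = 1
+writer executors per worker     = 27 / 9       = 3
+events/sec per worker           = 308,000 / 9  = ~34,200
+events/sec per enrichment exec. = 308,000 / 54 = ~5,700
+</pre></div></div>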
+</li>
+</ul></div></div></div>
+        </div>
+      </div>
+    </div>
+    <hr/>
+    <footer>
+      <div class="container-fluid">
+        <div class="row-fluid">
+© 2015-2016 The Apache Software Foundation. Apache Metron, Metron, Apache, the Apache feather logo,
+            and the Apache Metron project logo are trademarks of The Apache Software Foundation.
+        </div>
+      </div>
+    </footer>
+  </body>
+</html>

