nifi-commits mailing list archives

From jsto...@apache.org
Subject svn commit: r1811008 [24/43] - in /nifi/site/trunk/docs: ./ nifi-docs/ nifi-docs/components/ nifi-docs/components/org.apache.nifi/ nifi-docs/components/org.apache.nifi/nifi-ambari-nar/ nifi-docs/components/org.apache.nifi/nifi-ambari-nar/1.4.0/ nifi-do...
Date Tue, 03 Oct 2017 13:30:27 GMT
Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MergeRecord/additionalDetails.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MergeRecord/additionalDetails.html?rev=1811008&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MergeRecord/additionalDetails.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MergeRecord/additionalDetails.html Tue Oct  3 13:30:16 2017
@@ -0,0 +1,229 @@
+<!DOCTYPE html>
+<html lang="en">
+    <!--
+      Licensed to the Apache Software Foundation (ASF) under one or more
+      contributor license agreements.  See the NOTICE file distributed with
+      this work for additional information regarding copyright ownership.
+      The ASF licenses this file to You under the Apache License, Version 2.0
+      (the "License"); you may not use this file except in compliance with
+      the License.  You may obtain a copy of the License at
+          http://www.apache.org/licenses/LICENSE-2.0
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+      See the License for the specific language governing permissions and
+      limitations under the License.
+    -->
+    <head>
+        <meta charset="utf-8" />
+        <title>MergeRecord</title>
+
+        <link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css" />
+    </head>
+
+    <body>
+
+        <h3>Introduction</h3>
+    	<p>
+    	    The MergeRecord Processor allows the user to take many FlowFiles that consist of record-oriented data (any data format for which there is
+    	    a Record Reader available) and combine the FlowFiles into one larger FlowFile. This may be preferable before pushing the data to a downstream
+    	    system that prefers larger batches of data, such as HDFS, or in order to improve performance of a NiFi flow by reducing the number of FlowFiles
+    	    that flow through the system (thereby reducing the contention placed on the FlowFile Repository, Provenance Repository, Content Repository, and
+    	    FlowFile Queues).
+    	</p>
+
+    	<p>
+    		The Processor creates several 'bins' to put the FlowFiles in. The maximum number of bins to use is set to 10 by default, but this can be changed
+    		by updating the value of the &lt;Maximum Number of Bins&gt; property. The number of bins is bounded in order to avoid running out of Java heap space.
+    		Note: while the contents of a FlowFile are stored in the Content Repository and not in the Java heap space, the Processor must hold the FlowFile
+    		objects themselves in memory. As a result, these FlowFiles with their attributes can potentially take up a great deal of heap space and cause
+    		OutOfMemoryErrors to be thrown. In order to avoid this, if you expect to merge many small FlowFiles together, it is advisable to instead use a
+    		MergeRecord that merges no more than, say, 1,000 FlowFiles into a bundle and then use a second MergeRecord to merge these small bundles into
+    		larger bundles. For example, to merge 1,000,000 FlowFiles together, use a MergeRecord that uses a &lt;Maximum Number of Records&gt; of 1,000 and route the
+    		"merged" Relationship to a second MergeRecord that also sets the &lt;Maximum Number of Records&gt; to 1,000. The second MergeRecord will then merge 1,000 bundles
+    		of 1,000, which in effect produces bundles of 1,000,000.
+    	</p>
+
+
+
+    	<h3>How FlowFiles are Binned</h3>
+    	<p>
+    	    How the Processor determines which bin to place a FlowFile in depends on a few different configuration options. Firstly, the Merge Strategy
+    	    is considered. The Merge Strategy can be set to one of two options: Bin Packing Algorithm, or Defragment. When the goal is to simply combine
+    	    smaller FlowFiles into one larger FlowFile, the Bin Packing Algorithm should be used. This algorithm picks a bin based on whether or not the FlowFile
+    	    can fit in the bin according to its size and the &lt;Maximum Bin Size&gt; property and whether or not the FlowFile is 'like' the other FlowFiles in
+    	    the bin. What it means for two FlowFiles to be 'like FlowFiles' is discussed at the end of this section.
+    	</p>
+    	
+    	<p>
+    	    The "Defragment" Merge Strategy can be used when records need to be explicitly assigned to the same bin. For example, if data is split apart using
+    	    the SplitRecord Processor, each 'split' can be processed independently and later merged back together using this Processor with the
+    	    Merge Strategy set to Defragment. In order for FlowFiles to be added to the same bin when using this configuration, the FlowFiles must have the same
+    	    value for the "fragment.identifier" attribute. Each FlowFile with the same identifier must also have the same value for the "fragment.count" attribute
+    	    (which indicates how many FlowFiles belong in the bin) and a unique value for the "fragment.index" attribute so that the FlowFiles can be ordered
+    	    correctly.
+    	</p>
+    	
+    	<p>
+    	    In order to be added to the same bin, two FlowFiles must be 'like FlowFiles.' In order for two FlowFiles to be like FlowFiles, they must have the same
+    	    schema, and if the &lt;Correlation Attribute Name&gt; property is set, they must have the same value for the specified attribute. For example, if the
+    	    &lt;Correlation Attribute Name&gt; is set to "filename" then two FlowFiles must have the same value for the "filename" attribute in order to be binned
+    	    together. If more than one attribute is needed in order to correlate two FlowFiles, it is recommended to use an UpdateAttribute processor before the
+    	    MergeRecord processor and combine the attributes. For example, if the goal is to bin together two FlowFiles only if they have the same value for the
+    	    "abc" attribute and the "xyz" attribute, then we could accomplish this by using UpdateAttribute and adding a property with name "correlation.attribute"
+    	    and a value of "abc=${abc},xyz=${xyz}" and then setting MergeRecord's &lt;Correlation Attribute Name&gt; property to "correlation.attribute".
+    	</p>
+    	
+    	<p>
+    		It is often useful to bin together only Records that have the same value for some field. For example, if we have point-of-sale data, perhaps the desire
+    		is to bin together records that belong to the same store, as identified by the 'storeId' field. This can be accomplished by making use of the PartitionRecord
+    		Processor ahead of MergeRecord. This Processor will allow one or more fields to be configured as the partitioning criteria and will create attributes for those
+    		corresponding values. An UpdateAttribute processor could then be used, if necessary, to combine multiple attributes into a single correlation attribute,
+    		as described above. See documentation for those processors for more details.
+    	</p>
+
+
+
+		<h3>When a Bin is Merged</h3>    	
+    	<p>
+    	    Above, we discussed how a bin is chosen for a given FlowFile. Once a bin has been created and FlowFiles added to it, we must have some way to determine
+    	    when a bin is "full" so that we can merge those FlowFiles together into a "merged" FlowFile. There are a few criteria that are used to make a determination as
+    	    to whether or not a bin should be merged.
+    	</p>
+
+		<p>
+		    If the &lt;Merge Strategy&gt; property is set to "Bin Packing Algorithm" then the following rules will be evaluated.
+		    Firstly, in order for a bin to be full, both of the thresholds specified by the &lt;Minimum Bin Size&gt; and the &lt;Minimum Number of Records&gt; properties
+		    must be satisfied. If one of these properties is not set, then it is ignored. Secondly, if either the &lt;Maximum Bin Size&gt; or the &lt;Maximum Number of
+		    Records&gt; property is reached, then the bin is merged. That is, both of the minimum values must be reached but only one of the maximum values need be reached.
+		    Note that the &lt;Maximum Number of Records&gt; property is a "soft limit," meaning that all records in a given input FlowFile will be added to the same bin, and
+		    as a result the number of records may exceed the maximum configured number of records. Once this happens, though, no more Records will be added to that same bin
+		    from another FlowFile.
+		    If the &lt;Max Bin Age&gt; is reached for a bin, then the FlowFiles in that bin will be merged, <b>even if</b> the minimum bin size and minimum number of records
+		    have not yet been met. Finally, if the maximum number of bins has been created (as specified by the &lt;Maximum Number of Bins&gt; property), and some input FlowFiles
+		    cannot fit into any of the existing bins, then the oldest bin will be merged to make room. This is done because otherwise we would not be able to add any
+		    additional FlowFiles to the existing bins and would have to wait until the Max Bin Age is reached (if ever) in order to merge any FlowFiles.
+		</p>
+
+        <p>
+            If the &lt;Merge Strategy&gt; property is set to "Defragment" then a bin is full only when the number of FlowFiles in the bin is equal to the number specified
+            by the "fragment.count" attribute of one of the FlowFiles in the bin. All FlowFiles that have this attribute must have the same value for this attribute,
+            or else they will be routed to the "failure" relationship. It is not necessary that all FlowFiles have this value, but at least one FlowFile in the bin must have
+            this value or the bin will never be complete. If all of the necessary FlowFiles are not binned together by the point at which the bin times out
+            (as specified by the &lt;Max Bin Age&gt; property), then the FlowFiles will all be routed to the 'failure' relationship instead of being merged together.
+        </p>
+
+        <p>
+            Once a bin is merged into a single FlowFile, it can sometimes be useful to understand why exactly the bin was merged when it was. For example, if the maximum number
+            of allowable bins is reached, a merged FlowFile may consist of far fewer records than expected. In order to help understand the behavior, the Processor will emit
+            a JOIN Provenance Event when creating the merged FlowFile, and the JOIN event will include in it a "Details" field that explains why the bin was merged when it was.
+            For example, the event will indicate "Records Merged due to: Bin is full" if the bin reached its minimum thresholds and no more subsequent FlowFiles were able to be
+            added to it. Or it may indicate "Records Merged due to: Maximum number of bins has been exceeded" if the bin was merged due to the configured maximum number of bins
+            being filled and needing to free up space for a new bin.
+        </p>
+
+
+    	<h3>When a Failure Occurs</h3>
+    	<p>
+    	    When a bin is filled, the Processor is responsible for merging together all of the records in those FlowFiles into a single FlowFile. If the Processor fails
+    	    to do so for any reason (for example, a Record cannot be read from an input FlowFile), then all of the FlowFiles in that bin are routed to the 'failure'
+    	    Relationship. The Processor does not skip the single problematic FlowFile and merge the others. This behavior was chosen because of two different considerations.
+    	    Firstly, without those problematic records, the bin may not truly be full, as the minimum bin size may not be reached without those records.
+    	    Secondly, and more importantly, if the problematic FlowFile contains 100 "good" records before the problematic ones, those 100 records would already have been
+    	    written to the "merged" FlowFile. We cannot un-write those records. If we were to then send those 100 records on and route the problematic FlowFile to 'failure'
+    	    then in a situation where the "failure" relationship is eventually routed back to MergeRecord, we could end up continually duplicating those 100 successfully
+    	    processed records.
+    	</p>
+    	
+    	
+    	
+    	<h2>Examples</h2>
+    	
+    	<p>
+    		To better understand how this Processor works, we will lay out a few examples. For the sake of simplicity of these examples, we will use CSV-formatted data and
+    		write the merged data as CSV-formatted data, but
+    		the format of the data is not really relevant, as long as there is a Record Reader that is capable of reading the data and a Record Writer capable of writing
+    		the data in the desired format.
+    	</p>
+
+
+
+    	<h3>Example 1 - Batching Together Many Small FlowFiles</h3>
+    	
+    	<p>
+    		When we want to batch together many small FlowFiles in order to create one larger FlowFile, we will accomplish this by using the "Bin Packing Algorithm"
+    		Merge Strategy. The idea here is to bundle together as many FlowFiles as we can within our minimum and maximum number of records and bin size.
+    		Consider that we have the following properties set:
+    	</p>
+
+<table>
+  <tr>
+    <th>Property Name</th>
+    <th>Property Value</th>
+  </tr>
+  <tr>
+    <td>Merge Strategy</td>
+    <td>Bin Packing Algorithm</td>
+  </tr>
+  <tr>
+    <td>Minimum Number of Records</td>
+    <td>3</td>
+  </tr>
+  <tr>
+    <td>Maximum Number of Records</td>
+    <td>5</td>
+  </tr>
+</table>
+
+        <p>
+            Also consider that we have the following data on the queue, with the schema indicating a Name and an Age field:
+        </p>
+
+<table>
+  <tr>
+    <th>FlowFile ID</th>
+    <th>FlowFile Contents</th>
+  </tr>
+  <tr>
+    <td>1</td>
+    <td>Mark, 33</td>
+  </tr>
+  <tr>
+    <td>2</td>
+    <td>John, 45<br />Jane, 43</td>
+  </tr>
+  <tr>
+    <td>3</td>
+    <td>Jake, 3</td>
+  </tr>
+  <tr>
+    <td>4</td>
+    <td>Jan, 2</td>
+  </tr>
+</table>
+
+		<p>
+			In this case, because we have not configured a Correlation Attribute, and because all FlowFiles have the same schema, the Processor
+			will attempt to add all of these FlowFiles to the same bin. Because the Minimum Number of Records is 3 and the Maximum Number of Records is 5,
+			all of the FlowFiles will be added to the same bin. The output, then, is a single FlowFile with the following content:
+		</p>
+
+<code>
+<pre>
+Mark, 33
+John, 45
+Jane, 43
+Jake, 3
+Jan, 2
+</pre>
+</code>
+
+		<p>
+		   When the Processor runs, it will bin all of the FlowFiles that it can get from the queue. After that, it will merge any bin that is "full enough."
+		   So if we had only 3 FlowFiles on the queue, those 3 would have been added, and a new bin would have been created in the next iteration, once the
+		   4th FlowFile showed up. However, if we had 8 FlowFiles queued up, only 5 would have been added to the first bin. The other 3 would have been added
+		   to a second bin, and that bin would then be merged since it reached the minimum threshold of 3 also.
+		</p>
+
+	</body>
+</html>
\ No newline at end of file
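
The 'like FlowFiles' rule described in additionalDetails.html above can be sketched in a few lines of Python. This is only an illustration, not NiFi code: the dictionary layout and the names correlation_value and group_like_flowfiles are hypothetical, and rendering a missing attribute as an empty string is an assumption about how the UpdateAttribute expression would evaluate.

```python
from collections import defaultdict


def correlation_value(attributes):
    # Mimics the UpdateAttribute value "abc=${abc},xyz=${xyz}" from the text.
    # Assumption: a missing attribute is rendered as an empty string.
    return "abc={},xyz={}".format(attributes.get("abc", ""), attributes.get("xyz", ""))


def group_like_flowfiles(flowfiles, correlation_attribute=None):
    """Group FlowFile ids by (schema, correlation attribute value)."""
    groups = defaultdict(list)
    for ff in flowfiles:
        value = None
        if correlation_attribute is not None:
            value = ff["attributes"].get(correlation_attribute)
        # Two FlowFiles are 'like' only if both parts of the key match.
        groups[(ff["schema"], value)].append(ff["id"])
    return dict(groups)


# Hypothetical FlowFiles that share a schema but differ in the "xyz" attribute.
flowfiles = [
    {"id": 1, "schema": "person", "attributes": {"abc": "1", "xyz": "a"}},
    {"id": 2, "schema": "person", "attributes": {"abc": "1", "xyz": "a"}},
    {"id": 3, "schema": "person", "attributes": {"abc": "1", "xyz": "b"}},
]
for ff in flowfiles:
    ff["attributes"]["correlation.attribute"] = correlation_value(ff["attributes"])

groups = group_like_flowfiles(flowfiles, "correlation.attribute")
```

With the correlation attribute set, FlowFiles 1 and 2 land in one group and FlowFile 3 in another; without it, all three share a group because only the schema is compared.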

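The Bin Packing rules and Example 1 from additionalDetails.html above can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the Processor's implementation: it tracks record counts only (no bin size, bin age, or limit on the number of bins), it treats all FlowFiles as 'like FlowFiles', and the names Bin and bin_flowfiles are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bin:
    """Hypothetical model of a bin that tracks record counts only."""
    min_records: int
    max_records: int
    record_count: int = 0
    flowfile_ids: List[int] = field(default_factory=list)

    def accepts_more(self):
        # Once the maximum is reached, no further FlowFiles join this bin.
        return self.record_count < self.max_records

    def add(self, flowfile_id, records):
        # Soft limit: every record of the FlowFile goes into the bin,
        # so record_count may end up above max_records.
        self.flowfile_ids.append(flowfile_id)
        self.record_count += records

    def is_full_enough(self):
        # The minimum threshold must be met before the bin is merged.
        return self.record_count >= self.min_records


def bin_flowfiles(record_counts, min_records, max_records):
    """Assign FlowFiles (given as per-FlowFile record counts) to bins in queue order."""
    bins = []
    for flowfile_id, records in enumerate(record_counts, start=1):
        if not bins or not bins[-1].accepts_more():
            bins.append(Bin(min_records, max_records))
        bins[-1].add(flowfile_id, records)
    return bins


# Example 1: FlowFiles holding 1, 2, 1 and 1 records; minimum 3, maximum 5.
example_1 = bin_flowfiles([1, 2, 1, 1], min_records=3, max_records=5)
# Eight single-record FlowFiles with the same thresholds.
eight_single = bin_flowfiles([1] * 8, min_records=3, max_records=5)
```

Under these assumptions, Example 1 yields a single bin of 5 records, and eight single-record FlowFiles yield one bin of 5 and a second bin of 3, both of which meet the minimum of 3.
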
Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MergeRecord/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MergeRecord/index.html?rev=1811008&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MergeRecord/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MergeRecord/index.html Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>MergeRecord</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">MergeRecord</h1><h2>Description: </h2><p>This Processor merges together multiple record-oriented FlowFiles into a single FlowFile that contains all of the Records of the input FlowFiles. This Processor works by creating 'bins' and then adding FlowFiles to these bins until they are full. Once a bin is full, all of the FlowFiles will be combined into a single output FlowFile, and that FlowFile will be routed to the 'merged' Relationship. A bin will consist of potentially many 'like FlowFiles'. In order for two FlowFiles to be considered 'like FlowFiles', they must have the same Schema (as identified b
 y the Record Reader) and, if the &lt;Correlation Attribute Name&gt; property is set, the same value for the specified attribute. See Processor Usage and Additional Details for more information.</p><p><a href="additionalDetails.html">Additional Details...</a></p><h3>Tags: </h3><p>merge, record, content, correlation, stream, event</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Record Reader</strong></td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>RecordReaderFactory<br/><strong>Implementations: </strong><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVReader/index.html">CSVReader</a><br/>
 <a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.grok.GrokReader/index.html">GrokReader</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.avro.AvroReader/index.html">AvroReader</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.json.JsonTreeReader/index.html">JsonTreeReader</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.json.JsonPathReader/index.html">JsonPathReader</a><br/><a href="../../../nifi-scripting-nar/1.4.0/org.apache.nifi.record.script.ScriptedReader/index.html">ScriptedReader</a></td><td id="description">Specifies the Controller Service to use for reading incoming data</td></tr><tr><td id="name"><strong>Record Writer</strong></td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>RecordSetWriterFactory<br/><strong>Implementations: </strong><a href="../../../nifi-record-serialization-
 services-nar/1.4.0/org.apache.nifi.json.JsonRecordSetWriter/index.html">JsonRecordSetWriter</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.text.FreeFormTextRecordSetWriter/index.html">FreeFormTextRecordSetWriter</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.avro.AvroRecordSetWriter/index.html">AvroRecordSetWriter</a><br/><a href="../../../nifi-scripting-nar/1.4.0/org.apache.nifi.record.script.ScriptedRecordSetWriter/index.html">ScriptedRecordSetWriter</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVRecordSetWriter/index.html">CSVRecordSetWriter</a></td><td id="description">Specifies the Controller Service to use for writing out the records</td></tr><tr><td id="name"><strong>Merge Strategy</strong></td><td id="default-value">Bin-Packing Algorithm</td><td id="allowable-values"><ul><li>Bin-Packing Algorithm <img src="../../../../../html/images/iconInfo.png" alt
 ="Generates 'bins' of FlowFiles and fills each bin as full as possible. FlowFiles are placed into a bin based on their size and optionally their attributes (if the &lt;Correlation Attribute&gt; property is set)" title="Generates 'bins' of FlowFiles and fills each bin as full as possible. FlowFiles are placed into a bin based on their size and optionally their attributes (if the &lt;Correlation Attribute&gt; property is set)"></img></li><li>Defragment <img src="../../../../../html/images/iconInfo.png" alt="Combines fragments that are associated by attributes back into a single cohesive FlowFile. If using this strategy, all FlowFiles must have the attributes &lt;fragment.identifier&gt; and &lt;fragment.count&gt;. All FlowFiles with the same value for &quot;fragment.identifier&quot; will be grouped together. All FlowFiles in this group must have the same value for the &quot;fragment.count&quot; attribute. The ordering of the Records that are output is not guaranteed." title="Combines f
 ragments that are associated by attributes back into a single cohesive FlowFile. If using this strategy, all FlowFiles must have the attributes &lt;fragment.identifier&gt; and &lt;fragment.count&gt;. All FlowFiles with the same value for &quot;fragment.identifier&quot; will be grouped together. All FlowFiles in this group must have the same value for the &quot;fragment.count&quot; attribute. The ordering of the Records that are output is not guaranteed."></img></li></ul></td><td id="description">Specifies the algorithm used to merge records. The 'Defragment' algorithm combines fragments that are associated by attributes back into a single cohesive FlowFile. The 'Bin-Packing Algorithm' generates a FlowFile populated by arbitrarily chosen FlowFiles</td></tr><tr><td id="name">Correlation Attribute Name</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">If specified, two FlowFiles will be binned together only if they have the same value for this Attribut
 e. If not specified, FlowFiles are bundled by the order in which they are pulled from the queue.</td></tr><tr><td id="name"><strong>Attribute Strategy</strong></td><td id="default-value">Keep Only Common Attributes</td><td id="allowable-values"><ul><li>Keep Only Common Attributes <img src="../../../../../html/images/iconInfo.png" alt="Any attribute that is not the same on all FlowFiles in a bin will be dropped. Those that are the same across all FlowFiles will be retained." title="Any attribute that is not the same on all FlowFiles in a bin will be dropped. Those that are the same across all FlowFiles will be retained."></img></li><li>Keep All Unique Attributes <img src="../../../../../html/images/iconInfo.png" alt="Any attribute that has the same value for all FlowFiles in a bin, or has no value for a FlowFile, will be kept. For example, if a bin consists of 3 FlowFiles and 2 of them have a value of 'hello' for the 'greeting' attribute and the third FlowFile has no 'greeting' attri
 bute then the outbound FlowFile will get a 'greeting' attribute with the value 'hello'." title="Any attribute that has the same value for all FlowFiles in a bin, or has no value for a FlowFile, will be kept. For example, if a bin consists of 3 FlowFiles and 2 of them have a value of 'hello' for the 'greeting' attribute and the third FlowFile has no 'greeting' attribute then the outbound FlowFile will get a 'greeting' attribute with the value 'hello'."></img></li></ul></td><td id="description">Determines which FlowFile attributes should be added to the bundle. If 'Keep All Unique Attributes' is selected, any attribute on any FlowFile that gets bundled will be kept unless its value conflicts with the value from another FlowFile. If 'Keep Only Common Attributes' is selected, only the attributes that exist on all FlowFiles in the bundle, with the same value, will be preserved.</td></tr><tr><td id="name"><strong>Minimum Number of Records</strong></td><td id="default-value">1</td><td id="
 allowable-values"></td><td id="description">The minimum number of records to include in a bin</td></tr><tr><td id="name">Maximum Number of Records</td><td id="default-value">1000</td><td id="allowable-values"></td><td id="description">The maximum number of Records to include in a bin. This is a 'soft limit' in that if a FlowFile is added to a bin, all records in that FlowFile will be added, so this limit may be exceeded by up to the number of records in the last input FlowFile.</td></tr><tr><td id="name"><strong>Minimum Bin Size</strong></td><td id="default-value">0 B</td><td id="allowable-values"></td><td id="description">The minimum size for the bin</td></tr><tr><td id="name">Maximum Bin Size</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The maximum size for the bundle. If not specified, there is no maximum. This is a 'soft limit' in that if a FlowFile is added to a bin, all records in that FlowFile will be added, so this limit may be excee
 ded by up to the number of bytes in the last input FlowFile.</td></tr><tr><td id="name">Max Bin Age</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The maximum age of a Bin that will trigger a Bin to be complete. Expected format is &lt;duration&gt; &lt;time unit&gt; where &lt;duration&gt; is a positive integer and time unit is one of seconds, minutes, hours</td></tr><tr><td id="name"><strong>Maximum Number of Bins</strong></td><td id="default-value">10</td><td id="allowable-values"></td><td id="description">Specifies the maximum number of bins that can be held in memory at any one time. This number should not be smaller than the maximum number of concurrent threads for this Processor, or the bins that are created will often consist only of a single incoming FlowFile.</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>failure</td><td>If the bundle cannot be created, all FlowFiles that wou
 ld have been used to create the bundle will be transferred to failure</td></tr><tr><td>original</td><td>The FlowFiles that were used to create the bundle</td></tr><tr><td>merged</td><td>The FlowFile containing the merged records</td></tr></table><h3>Reads Attributes: </h3><table id="reads-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>fragment.identifier</td><td>Applicable only if the &lt;Merge Strategy&gt; property is set to Defragment. All FlowFiles with the same value for this attribute will be bundled together.</td></tr><tr><td>fragment.count</td><td>Applicable only if the &lt;Merge Strategy&gt; property is set to Defragment. This attribute must be present on all FlowFiles with the same value for the fragment.identifier attribute. All FlowFiles in the same bundle must have the same value for this attribute. The value of this attribute indicates how many FlowFiles should be expected in the given bundle.</td></tr></table><h3>Writes Attributes: </h3><table id="write
 s-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>record.count</td><td>The merged FlowFile will have a 'record.count' attribute indicating the number of records that were written to the FlowFile.</td></tr><tr><td>mime.type</td><td>The MIME Type indicated by the Record Writer</td></tr><tr><td>merge.count</td><td>The number of FlowFiles that were merged into this bundle</td></tr><tr><td>merge.bin.age</td><td>The age of the bin, in milliseconds, when it was merged and output. Effectively this is the greatest amount of time that any FlowFile in this bundle remained waiting in this processor before it was output</td></tr><tr><td>&lt;Attributes from Record Writer&gt;</td><td>Any Attribute that the configured Record Writer returns will be added to the FlowFile.</td></tr></table><h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.<h3>See
  Also:</h3><p><a href="../org.apache.nifi.processors.standard.MergeContent/index.html">MergeContent</a>, <a href="../org.apache.nifi.processors.standard.SplitRecord/index.html">SplitRecord</a>, <a href="../org.apache.nifi.processors.standard.PartitionRecord/index.html">PartitionRecord</a></p></body></html>
\ No newline at end of file

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ModifyBytes/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ModifyBytes/index.html?rev=1811008&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ModifyBytes/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ModifyBytes/index.html Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>ModifyBytes</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">ModifyBytes</h1><h2>Description: </h2><p>Discard byte range at the start and end or all content of a binary file.</p><h3>Tags: </h3><p>binary, discard, keep</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description
 </th></tr><tr><td id="name"><strong>Start Offset</strong></td><td id="default-value">0 B</td><td id="allowable-values"></td><td id="description">Number of bytes removed at the beginning of the file.<br/><strong>Supports Expression Language: true</strong></td></tr><tr><td id="name"><strong>End Offset</strong></td><td id="default-value">0 B</td><td id="allowable-values"></td><td id="description">Number of bytes removed at the end of the file.<br/><strong>Supports Expression Language: true</strong></td></tr><tr><td id="name"><strong>Remove All Content</strong></td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Remove all content from the FlowFile superseding Start Offset and End Offset properties.</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>Processed flowfiles.</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Wr
 ites Attributes: </h3>None specified.<h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.</body></html>
\ No newline at end of file
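
Note on the ModifyBytes properties above: the interaction of Start Offset, End Offset and Remove All Content can be sketched in a few lines of Python. This is only an illustration, not the processor's code; the function name modify_bytes is hypothetical, and the behavior when the two offsets together cover the whole content (returning empty content) is an assumption.

```python
def modify_bytes(content: bytes, start_offset: int = 0, end_offset: int = 0,
                 remove_all: bool = False) -> bytes:
    """Drop start_offset bytes from the front and end_offset bytes from the back."""
    if remove_all:
        # Remove All Content supersedes both offsets.
        return b""
    if start_offset + end_offset >= len(content):
        # Assumption: nothing is left once the offsets cover the whole content.
        return b""
    return content[start_offset:len(content) - end_offset]


trimmed = modify_bytes(b"HEADERpayloadEND", start_offset=6, end_offset=3)
print(trimmed)
```

With the default offsets of 0 B and Remove All Content set to false, the content passes through unchanged.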

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MonitorActivity/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MonitorActivity/index.html?rev=1811008&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MonitorActivity/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.MonitorActivity/index.html Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>MonitorActivity</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">MonitorActivity</h1><h2>Description: </h2><p>Monitors the flow for activity and sends out an indicator when the flow has not had any data for some specified amount of time and again when the flow's activity is restored</p><h3>Tags: </h3><p>monitor, flow, active, inactive, activity, detection</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the <a href="../../../../../html/expression-language-guide.h
 tml">NiFi Expression Language</a>.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Threshold Duration</strong></td><td id="default-value">5 min</td><td id="allowable-values"></td><td id="description">Determines how much time must elapse before considering the flow to be inactive</td></tr><tr><td id="name"><strong>Continually Send Messages</strong></td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true, will send inactivity indicator continually every Threshold Duration amount of time until activity is restored; if false, will send an indicator only when the flow first becomes inactive</td></tr><tr><td id="name"><strong>Inactivity Message</strong></td><td id="default-value">Lacking activity as of time: ${now():format('yyyy/MM/dd HH:mm:ss')}; flow has been inactive for ${inactivityDurationMillis:toNumber():divide(600
 00)} minutes</td><td id="allowable-values"></td><td id="description">The message that will be the content of FlowFiles that are sent to the 'inactive' relationship<br/><strong>Supports Expression Language: true</strong></td></tr><tr><td id="name"><strong>Activity Restored Message</strong></td><td id="default-value">Activity restored at time: ${now():format('yyyy/MM/dd HH:mm:ss')} after being inactive for ${inactivityDurationMillis:toNumber():divide(60000)} minutes</td><td id="allowable-values"></td><td id="description">The message that will be the content of FlowFiles that are sent to 'activity.restored' relationship<br/><strong>Supports Expression Language: true</strong></td></tr><tr><td id="name">Copy Attributes</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true, will copy all flow file attributes from the flow file that resumed activity to the newly created indicator flow file</td></tr><tr><td id="n
 ame"><strong>Monitoring Scope</strong></td><td id="default-value">node</td><td id="allowable-values"><ul><li>node</li><li>cluster</li></ul></td><td id="description">Specifies how to determine whether the flow is active. With 'node', activity is examined on each node separately; this is useful if the DFM expects every node to receive flow files in a distributed manner. With 'cluster', the flow is considered active as long as at least one node is actively receiving flow files. If NiFi is running in standalone mode, this should be set to 'node'; if it is set to 'cluster', NiFi logs a warning message and acts as if the scope were 'node'.</td></tr><tr><td id="name"><strong>Reporting Node</strong></td><td id="default-value">all</td><td id="allowable-values"><ul><li>all</li><li>primary</li></ul></td><td id="description">Specifies which node should send notification flow-files to the inactive and activity.restored relationships. With 'all', every node in the cluster sends notification flow-files. With 'primary', flow-
 files are sent only from the primary node. If NiFi is running in standalone mode, this should be set to 'all'; even if it is set to 'primary', NiFi acts as 'all'.</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>inactive</td><td>This relationship is used to transfer an Inactivity indicator when no FlowFiles are routed to 'success' for Threshold Duration amount of time</td></tr><tr><td>success</td><td>All incoming FlowFiles are routed to success</td></tr><tr><td>activity.restored</td><td>This relationship is used to transfer an Activity Restored indicator when FlowFiles are routed to 'success' following a period of inactivity</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>inactivityStartMillis</td><td>The time at which Inactivity began, in the form of milliseconds since Epoch</td></tr><tr><td>inactivityDu
 rationMillis</td><td>The number of milliseconds that the inactivity has spanned</td></tr></table><h3>State management: </h3><table id="stateful"><tr><th>Scope</th><th>Description</th></tr><tr><td>CLUSTER</td><td>MonitorActivity stores the last timestamp at each node as state, so that it can examine activity cluster-wide. If 'Copy Attributes' is set to true, then flow file attributes are also persisted.</td></tr></table><h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.</body></html>
\ No newline at end of file

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.Notify/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.Notify/index.html?rev=1811008&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.Notify/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.Notify/index.html Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>Notify</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">Notify</h1><h2>Description: </h2><p>Caches a release signal identifier in the distributed cache, optionally along with the FlowFile's attributes.  Any flow files held at a corresponding Wait processor will be released once this signal in the cache is discovered.</p><h3>Tags: </h3><p>map, cache, notify, distributed, signal, release</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the <a href="../../../../../h
 tml/expression-language-guide.html">NiFi Expression Language</a>.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Release Signal Identifier</strong></td><td id="default-value"></td><td id="allowable-values"></td><td id="description">A value, or the results of an Attribute Expression Language statement, which will be evaluated against a FlowFile in order to determine the release signal cache key<br/><strong>Supports Expression Language: true</strong></td></tr><tr><td id="name"><strong>Signal Counter Name</strong></td><td id="default-value">default</td><td id="allowable-values"></td><td id="description">A value, or the results of an Attribute Expression Language statement, which will be evaluated against a FlowFile in order to determine the signal counter name. Signal counter name is useful when a corresponding Wait processor needs to know the number of occurrences of different types of events
 , such as success or failure, or destination data source names, etc.<br/><strong>Supports Expression Language: true</strong></td></tr><tr><td id="name"><strong>Signal Counter Delta</strong></td><td id="default-value">1</td><td id="allowable-values"></td><td id="description">A value, or the results of an Attribute Expression Language statement, which will be evaluated against a FlowFile in order to determine the signal counter delta. Specifies how much the counter should increase. For example, if multiple signal events are processed upstream in a batch-oriented way, the number of events processed can be notified at once with this property. Zero (0) has a special meaning: it resets the target count to 0, which is especially useful together with Wait's Releasable FlowFile Count = Zero (0) mode to provide 'open-close-gate' style flow control. One (1) can open a corresponding Wait processor, and Zero (0) can negate it as if closing a gate.<br/><strong>Supports Expression Language:
  true</strong></td></tr><tr><td id="name"><strong>Signal Buffer Count</strong></td><td id="default-value">1</td><td id="allowable-values"></td><td id="description">Specifies the maximum number of incoming flow files that can be buffered before signals are sent to the cache service. A larger buffer can provide better performance, as it reduces the number of interactions with the cache service by grouping signals by signal identifier when multiple incoming flow files share the same signal identifier.</td></tr><tr><td id="name"><strong>Distributed Cache Service</strong></td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>AtomicDistributedMapCacheClient<br/><strong>Implementations: </strong><a href="../../../nifi-redis-nar/1.4.0/org.apache.nifi.redis.service.RedisDistributedMapCacheClientService/index.html">RedisDistributedMapCacheClientService</a><br/><a href="../../../nifi-distributed-cache-services-nar/1.4.0/org.apache.nifi.distributed
 .cache.client.DistributedMapCacheClientService/index.html">DistributedMapCacheClientService</a></td><td id="description">The Controller Service that is used to cache release signals in order to release files queued at a corresponding Wait processor</td></tr><tr><td id="name">Attribute Cache Regex</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Any attributes whose names match this regex will be stored in the distributed cache to be copied to any FlowFiles released from a corresponding Wait processor.  Note that the uuid attribute will not be cached regardless of this value.  If blank, no attributes will be cached.</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>All FlowFiles where the release signal has been successfully entered in the cache will be routed to this relationship</td></tr><tr><td>failure</td><td>When the cache cannot be reached, or if the Release Sig
 nal Identifier evaluates to null or empty, FlowFiles will be routed to this relationship</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>notified</td><td>All FlowFiles will have an attribute 'notified'. The value of this attribute is true if the FlowFile is notified, otherwise false.</td></tr></table><h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.<h3>See Also:</h3><p><a href="../../../nifi-distributed-cache-services-nar/1.4.0/org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService/index.html">DistributedMapCacheClientService</a>, <a href="../../../nifi-distributed-cache-services-nar/1.4.0/org.apache.nifi.distributed.cache.server.map.DistributedMapCacheServer/index.html">DistributedMapCacheServer</a>, <a
  href="../org.apache.nifi.processors.standard.Wait/index.html">Wait</a></p></body></html>
\ No newline at end of file

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ParseCEF/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ParseCEF/index.html?rev=1811008&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ParseCEF/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ParseCEF/index.html Tue Oct  3 13:30:16 2017
@@ -0,0 +1,2 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>ParseCEF</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">ParseCEF</h1><h2>Description: </h2><p>Parses the contents of a CEF formatted message and adds attributes to the FlowFile for headers and extensions of the parts of the CEF message.
+Note: This Processor expects CEF messages WITHOUT the syslog headers (i.e. starting at "CEF:0")</p><h3>Tags: </h3><p>logs, cef, attributes, system, event, message</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Parsed fields destination</strong></td><td id="default-value">flowfile-content</td><td id="allowable-values"><ul><li>flowfile-content</li><li>flowfile-attribute</li></ul></td><td id="description">Indicates whether the results of the CEF parser are written to the FlowFile content or a FlowFile attribute; if using flowfile-attribute, fields will be populated as attributes. If set to flowfile-content, the CEF extension field will be converted into a flat JSON object.
 </td></tr><tr><td id="name"><strong>Append raw message to JSON</strong></td><td id="default-value">true</td><td id="allowable-values"></td><td id="description">When using flowfile-content (i.e. JSON output), add the original CEF message to the resulting JSON object. The original message is added as a string to _raw.</td></tr><tr><td id="name"><strong>Timezone</strong></td><td id="default-value">Local Timezone (system Default)</td><td id="allowable-values"><ul><li>UTC</li><li>Local Timezone (system Default)</li></ul></td><td id="description">Timezone to be used when representing date fields. UTC will convert all dates to UTC, while Local Timezone will convert them to the timezone used by NiFi.</td></tr><tr><td id="name"><strong>DateTime Locale</strong></td><td id="default-value">en-US</td><td id="allowable-values"></td><td id="description">The IETF BCP 47 representation of the Locale to be used when parsing date fields with long or short month names (e.g. may &lt;en-US&gt; vs. mai. &
 lt;fr-FR&gt;). The default value is generally safe. Only change it if you have issues parsing CEF messages</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>Any FlowFile that is successfully parsed as a CEF message will be transferred to this Relationship.</td></tr><tr><td>failure</td><td>Any FlowFile that could not be parsed as a CEF message will be transferred to this Relationship without any attributes being added</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>cef.header.version</td><td>The version of the CEF message.</td></tr><tr><td>cef.header.deviceVendor</td><td>The Device Vendor of the CEF message.</td></tr><tr><td>cef.header.deviceProduct</td><td>The Device Product of the CEF message.</td></tr><tr><td>cef.header.deviceVersion</td><td>The Device Version of the CEF message.</td></tr>
 <tr><td>cef.header.deviceEventClassId</td><td>The Device Event Class ID of the CEF message.</td></tr><tr><td>cef.header.name</td><td>The name of the CEF message.</td></tr><tr><td>cef.header.severity</td><td>The severity of the CEF message.</td></tr><tr><td>cef.extension.*</td><td>The key and value generated by the parsing of the message.</td></tr></table><h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.<h3>See Also:</h3><p><a href="../org.apache.nifi.processors.standard.ParseSyslog/index.html">ParseSyslog</a></p></body></html>
\ No newline at end of file
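
Note on the ParseCEF attributes above: a CEF message begins with seven pipe-delimited header fields followed by the extension. The Python sketch below shows how those header fields could map onto the cef.header.* attribute names listed in the page. It is a simplified illustration, not the processor's parser: the function name parse_cef_header is hypothetical, the extension is kept as one raw string instead of being split into cef.extension.* keys, and the pattern only treats a pipe preceded by a single backslash as escaped.

```python
import re

HEADER_FIELDS = [
    "cef.header.version", "cef.header.deviceVendor", "cef.header.deviceProduct",
    "cef.header.deviceVersion", "cef.header.deviceEventClassId",
    "cef.header.name", "cef.header.severity",
]


def parse_cef_header(message: str) -> dict:
    """Split a CEF message (without syslog header) into header attributes."""
    if not message.startswith("CEF:"):
        raise ValueError("message must start with 'CEF:'")
    # Split on unescaped pipes; at most 7 splits so the extension stays whole.
    parts = re.split(r"(?<!\\)\|", message[len("CEF:"):], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    attributes = dict(zip(HEADER_FIELDS, parts[:7]))
    # Hypothetical key used here only to carry the unparsed extension text.
    attributes["raw.extension"] = parts[7] if len(parts) > 7 else ""
    return attributes


sample = "CEF:0|Security|threatmanager|1.0|100|worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2"
attrs = parse_cef_header(sample)
print(attrs["cef.header.deviceVendor"], attrs["cef.header.severity"])
```

A message that still carries a syslog header would fail the startswith check, which mirrors the note that the processor expects input starting at "CEF:0".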

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ParseSyslog/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ParseSyslog/index.html?rev=1811008&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ParseSyslog/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.ParseSyslog/index.html Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>ParseSyslog</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">ParseSyslog</h1><h2>Description: </h2><p>Attempts to parse the contents of a Syslog message in accordance with RFC5424 and RFC3164 formats and adds attributes to the FlowFile for each of the parts of the Syslog message. Note: Be mindful that RFC3164 is informational and a wide range of different implementations are present in the wild. If messages fail parsing, consider using RFC5424 or using a generic parsing processor such as ExtractGrok.</p><h3>Tags: </h3><p>logs, syslog, attributes, system, event, message</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in
  <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Character Set</strong></td><td id="default-value">UTF-8</td><td id="allowable-values"></td><td id="description">Specifies the character set of the Syslog messages</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>Any FlowFile that is successfully parsed as a Syslog message will be transferred to this Relationship.</td></tr><tr><td>failure</td><td>Any FlowFile that could not be parsed as a Syslog message will be transferred to this Relationship without any attributes being added</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>sysl
 og.priority</td><td>The priority of the Syslog message.</td></tr><tr><td>syslog.severity</td><td>The severity of the Syslog message derived from the priority.</td></tr><tr><td>syslog.facility</td><td>The facility of the Syslog message derived from the priority.</td></tr><tr><td>syslog.version</td><td>The optional version from the Syslog message.</td></tr><tr><td>syslog.timestamp</td><td>The timestamp of the Syslog message.</td></tr><tr><td>syslog.hostname</td><td>The hostname or IP address of the Syslog message.</td></tr><tr><td>syslog.sender</td><td>The hostname of the Syslog server that sent the message.</td></tr><tr><td>syslog.body</td><td>The body of the Syslog message, everything after the hostname.</td></tr></table><h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.<h3>See Also:</h3><p><a href="../org.apache.nifi.processors.standard.Lis
 tenSyslog/index.html">ListenSyslog</a>, <a href="../org.apache.nifi.processors.standard.PutSyslog/index.html">PutSyslog</a></p></body></html>
\ No newline at end of file
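
Note on syslog.severity and syslog.facility above: both are described as derived from the priority. In RFC 5424 and RFC 3164 the priority value is facility * 8 + severity, so the derivation can be sketched as below. The function name split_priority is hypothetical, and how the processor formats the resulting attribute values is not shown here.

```python
def split_priority(priority: int) -> tuple:
    """Return (facility, severity) for a syslog PRI value (0..191)."""
    if not 0 <= priority <= 191:
        raise ValueError("syslog priority must be between 0 and 191")
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    return divmod(priority, 8)


facility, severity = split_priority(34)
print(facility, severity)
```

For a message beginning with <34>, this yields facility 4 and severity 2.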

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PartitionRecord/additionalDetails.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PartitionRecord/additionalDetails.html?rev=1811008&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PartitionRecord/additionalDetails.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PartitionRecord/additionalDetails.html Tue Oct  3 13:30:16 2017
@@ -0,0 +1,190 @@
+<!DOCTYPE html>
+<html lang="en">
+    <!--
+      Licensed to the Apache Software Foundation (ASF) under one or more
+      contributor license agreements.  See the NOTICE file distributed with
+      this work for additional information regarding copyright ownership.
+      The ASF licenses this file to You under the Apache License, Version 2.0
+      (the "License"); you may not use this file except in compliance with
+      the License.  You may obtain a copy of the License at
+          http://www.apache.org/licenses/LICENSE-2.0
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+      See the License for the specific language governing permissions and
+      limitations under the License.
+    -->
+    <head>
+        <meta charset="utf-8" />
+        <title>PartitionRecord</title>
+
+        <link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css" />
+    </head>
+
+    <body>
+    	<p>
+    		PartitionRecord allows the user to separate out records in a FlowFile such that each outgoing FlowFile
+    		consists only of records that are "alike." To define what it means for two records to be alike, the Processor
+    		makes use of NiFi's <a href="../../../../../html/record-path-guide.html">RecordPath</a> DSL.
+    	</p>
+    	
+    	<p>
+    		In order to make the Processor valid, at least one user-defined property must be added to the Processor.
+    		The value of the property must be a valid RecordPath. Expression Language is supported and will be evaluated before
+    		attempting to compile the RecordPath. However, if Expression Language is used, the Processor is not able to validate
+    		the RecordPath beforehand, so FlowFiles may fail processing if the RecordPath turns out to be invalid when it is
+    		used.
+    	</p>
+    	
+    	<p>
+    		Once one or more RecordPaths have been added, those RecordPaths are evaluated against each Record in an incoming FlowFile.
+    		In order for Record A and Record B to be considered "like records," both of them must have the same value for all RecordPaths
+    		that are configured. Only the values that are returned by the RecordPath are held in Java's heap. The records themselves are written
+    		immediately to the FlowFile content. This means that for most cases, heap usage is not a concern. However, if the RecordPath points
+    		to a large Record field that is different for each record in a FlowFile, then heap usage may be an important consideration. In such
+    		cases, SplitRecord may be useful to split a large FlowFile into smaller FlowFiles before partitioning.
+    	</p>
+    	
+    	<p>
+    		Once a FlowFile has been written, we know that all of the Records within that FlowFile have the same value for the fields that are
+    		described by the configured RecordPaths. As a result, we can promote those values to FlowFile Attributes. We do so
+    		by looking at the name of the property to which each RecordPath belongs. For example, if we have a property named <code>country</code>
+    		with a value of <code>/geo/country/name</code>, then each outbound FlowFile will have an attribute named <code>country</code> with the
+    		value of the <code>/geo/country/name</code> field. The addition of these attributes makes it very easy to perform tasks such as routing,
+    		or referencing the value in another Processor that can be used for configuring where to send the data, etc.
+    		However, for any RecordPath whose value is not a scalar value (i.e., the value is of type Array, Map, or Record), no attribute will be added.
+    	</p>
+    	
+    	
+    	
+    	<h2>Examples</h2>
+    	
+    	<p>
+    		To better understand how this Processor works, we will lay out a few examples. For the sake of these examples, let's assume that our input
+    		data is JSON formatted and looks like this:
+    	</p>
+
+<code>
+<pre>
+[ {
+  "name": "John Doe",
+  "dob": "11/30/1976",
+  "favorites": [ "spaghetti", "basketball", "blue" ],
+  "locations": {
+  	"home": {
+  		"number": 123,
+  		"street": "My Street",
+  		"city": "New York",
+  		"state": "NY",
+  		"country": "US"
+  	},
+  	"work": {
+  		"number": 321,
+  		"street": "Your Street",
+  		"city": "New York",
+  		"state": "NY",
+  		"country": "US"
+  	}
+  }
+}, {
+  "name": "Jane Doe",
+  "dob": "10/04/1979",
+  "favorites": [ "spaghetti", "football", "red" ],
+  "locations": {
+  	"home": {
+  		"number": 123,
+  		"street": "My Street",
+  		"city": "New York",
+  		"state": "NY",
+  		"country": "US"
+  	},
+  	"work": {
+  		"number": 456,
+  		"street": "Our Street",
+  		"city": "New York",
+  		"state": "NY",
+  		"country": "US"
+  	}
+  }
+}, {
+  "name": "Jacob Doe",
+  "dob": "04/02/2012",
+  "favorites": [ "chocolate", "running", "yellow" ],
+  "locations": {
+  	"home": {
+  		"number": 123,
+  		"street": "My Street",
+  		"city": "New York",
+  		"state": "NY",
+  		"country": "US"
+  	},
+  	"work": null
+  }
+}, {
+  "name": "Janet Doe",
+  "dob": "02/14/2007",
+  "favorites": [ "spaghetti", "reading", "white" ],
+  "locations": {
+  	"home": {
+  		"number": 1111,
+  		"street": "Far Away",
+  		"city": "San Francisco",
+  		"state": "CA",
+  		"country": "US"
+  	},
+  	"work": null
+  }
+}]
+</pre>
+</code>
+
+
+    	<h3>Example 1 - Partition By Simple Field</h3>
+    	
+    	<p>
+    		For a simple case, let's partition all of the records based on the state that they live in.
+    		We can add a property named <code>state</code> with a value of <code>/locations/home/state</code>.
+    		The result will be that we will have two outbound FlowFiles. The first will contain an attribute with the name
+    		<code>state</code> and a value of <code>NY</code>. This FlowFile will consist of 3 records: John Doe, Jane Doe, and Jacob Doe.
+    		The second FlowFile will consist of a single record for Janet Doe and will contain an attribute named <code>state</code> that
+    		has a value of <code>CA</code>.
+    	</p>
+    	
+    	
+    	<h3>Example 2 - Partition By Nullable Value</h3>
+    	
+    	<p>
+    		In the above example, there are three different values for the work location. If we use a RecordPath of <code>/locations/work/state</code>
+    		with a property name of <code>state</code>, then we will end up with two different FlowFiles. The first will contain records for John Doe and Jane Doe
+    		because they have the same value for the given RecordPath. This FlowFile will have an attribute named <code>state</code> with a value of <code>NY</code>.
+    	</p>
+    	<p>
+    		The second FlowFile will contain the two records for Jacob Doe and Janet Doe, because the RecordPath will evaluate
+    		to <code>null</code> for both of them.  This FlowFile will have no <code>state</code> attribute (unless such an attribute existed on the incoming FlowFile,
+    		in which case its value will be unaltered).
+    	</p>
+    	
+    	
+    	<h3>Example 3 - Partition By Multiple Values</h3>
+    	
+    	<p>
+    		Now let's say that we want to partition records based on multiple different fields. We now add two properties to the PartitionRecord processor.
+    		The first property is named <code>home</code> and has a value of <code>/locations/home</code>. The second property is named <code>favorite.food</code>
+    		and has a value of <code>/favorites[0]</code> to reference the first element in the "favorites" array. 
+    	</p>
+    	
+    	<p>
+    		This will result in three different FlowFiles being created. The first FlowFile will contain records for John Doe and Jane Doe. It will contain an attribute
+    		named "favorite.food" with a value of "spaghetti." However, because the RecordPath of the <code>home</code> property points to a Record field, no "home" attribute will be added.
+    		In this case, both of these records have the same value for both the first element of the "favorites" array
+    		and the same value for the home address. Janet Doe has the same value for the first element in the "favorites" array but has a different home address. Similarly,
+    		Jacob Doe has the same home address but a different value for the favorite food.
+    	</p>
+    	
+    	<p>
+    		The second FlowFile will consist of a single record: Jacob Doe. This FlowFile will have an attribute named "favorite.food" with a value of "chocolate."
+    		The third FlowFile will consist of a single record: Janet Doe. This FlowFile will have an attribute named "favorite.food" with a value of "spaghetti."
+    	</p>
+    	
+	</body>
+</html>
\ No newline at end of file
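
Note on the PartitionRecord examples above: the grouping rule (same value for every configured RecordPath, with only scalar values promoted to attributes) can be sketched in Python. This is an illustration under stated assumptions, not the processor's implementation: eval_path is a hypothetical helper that understands only child steps and a numeric array index, not the full RecordPath DSL, and the records are trimmed to the fields the examples use.

```python
import json

HOME_NY = {"number": 123, "street": "My Street", "state": "NY"}
RECORDS = [
    {"name": "John Doe", "favorites": ["spaghetti", "basketball", "blue"],
     "locations": {"home": HOME_NY,
                   "work": {"number": 321, "street": "Your Street", "state": "NY"}}},
    {"name": "Jane Doe", "favorites": ["spaghetti", "football", "red"],
     "locations": {"home": HOME_NY,
                   "work": {"number": 456, "street": "Our Street", "state": "NY"}}},
    {"name": "Jacob Doe", "favorites": ["chocolate", "running", "yellow"],
     "locations": {"home": HOME_NY, "work": None}},
    {"name": "Janet Doe", "favorites": ["spaghetti", "reading", "white"],
     "locations": {"home": {"number": 1111, "street": "Far Away", "state": "CA"},
                   "work": None}},
]


def eval_path(record, path):
    """Evaluate a tiny RecordPath subset: /a/b child steps and /a[0] indexes."""
    value = record
    for step in path.strip("/").split("/"):
        if value is None:
            return None
        name, _, index = step.partition("[")
        value = value.get(name)
        if index and value is not None:
            value = value[int(index.rstrip("]"))]
    return value


def partition(records, properties):
    """Group records by the values of all configured paths.

    Returns a list of (attributes, records) pairs, one per outgoing FlowFile.
    """
    groups = {}
    for record in records:
        values = {name: eval_path(record, path) for name, path in properties.items()}
        key = json.dumps(values, sort_keys=True)  # hashable form of the values
        if key not in groups:
            # Only non-null scalar values become attributes.
            attrs = {name: str(value) for name, value in values.items()
                     if value is not None and not isinstance(value, (dict, list))}
            groups[key] = (attrs, [])
        groups[key][1].append(record)
    return list(groups.values())


for attrs, recs in partition(RECORDS, {"home": "/locations/home",
                                       "favorite.food": "/favorites[0]"}):
    print(attrs, [r["name"] for r in recs])
```

Run against Example 3, the sketch yields three groups, and none of them carries a "home" attribute because that path resolves to a Record rather than a scalar.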

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PartitionRecord/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PartitionRecord/index.html?rev=1811008&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PartitionRecord/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PartitionRecord/index.html Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>PartitionRecord</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">PartitionRecord</h1><h2>Description: </h2><p>Receives Record-oriented data (i.e., data that can be read by the configured Record Reader) and evaluates one or more RecordPaths against each record in the incoming FlowFile. Each record is then grouped with other "like records" and a FlowFile is created for each group of "like records." What it means for two records to be "like records" is determined by user-defined properties. The user is required to enter at least one user-defined property whose value is a RecordPath. Two records are considered alike if they have the same value for all configu
 red RecordPaths. Because we know that all records in a given output FlowFile have the same value for the fields that are specified by the RecordPath, an attribute is added for each field. See Additional Details on the Usage page for more information and examples.</p><p><a href="additionalDetails.html">Additional Details...</a></p><h3>Tags: </h3><p>record, partition, recordpath, rpath, segment, split, group, bin, organize</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Record Reader</strong></td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>RecordReaderFactory<br/><strong>Implementations: </strong><a href="../../../nifi-record
 -serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVReader/index.html">CSVReader</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.grok.GrokReader/index.html">GrokReader</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.avro.AvroReader/index.html">AvroReader</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.json.JsonTreeReader/index.html">JsonTreeReader</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.json.JsonPathReader/index.html">JsonPathReader</a><br/><a href="../../../nifi-scripting-nar/1.4.0/org.apache.nifi.record.script.ScriptedReader/index.html">ScriptedReader</a></td><td id="description">Specifies the Controller Service to use for reading incoming data</td></tr><tr><td id="name"><strong>Record Writer</strong></td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>RecordSetWr
 iterFactory<br/><strong>Implementations: </strong><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.json.JsonRecordSetWriter/index.html">JsonRecordSetWriter</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.text.FreeFormTextRecordSetWriter/index.html">FreeFormTextRecordSetWriter</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.avro.AvroRecordSetWriter/index.html">AvroRecordSetWriter</a><br/><a href="../../../nifi-scripting-nar/1.4.0/org.apache.nifi.record.script.ScriptedRecordSetWriter/index.html">ScriptedRecordSetWriter</a><br/><a href="../../../nifi-record-serialization-services-nar/1.4.0/org.apache.nifi.csv.CSVRecordSetWriter/index.html">CSVRecordSetWriter</a></td><td id="description">Specifies the Controller Service to use for writing out the records</td></tr></table><h3>Dynamic Properties: </h3><p>Dynamic Properties allow the user to specify both the name and value of a prope
 rty.<table id="dynamic-properties"><tr><th>Name</th><th>Value</th><th>Description</th></tr><tr><td id="name">The name given to the dynamic property is the name of the attribute that will be used to denote the value of the associated RecordPath.</td><td id="value">A RecordPath that points to a field in the Record.</td><td>Each dynamic property represents a RecordPath that will be evaluated against each record in an incoming FlowFile. When the value of the RecordPath is determined for a Record, an attribute is added to the outgoing FlowFile. The name of the attribute is the same as the name of this property. The value of the attribute is the same as the value of the field in the Record that the RecordPath points to. Note that no attribute will be added if the value returned for the RecordPath is null or is not a scalar value (i.e., the value is an Array, Map, or Record).<br/><strong>Supports Expression Language: true</strong></td></tr></table></p><h3>Relationships: </h3><table id="rela
 tionships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>FlowFiles that are successfully partitioned will be routed to this relationship</td></tr><tr><td>failure</td><td>If a FlowFile cannot be partitioned from the configured input format to the configured output format, the unchanged FlowFile will be routed to this relationship</td></tr><tr><td>original</td><td>Once all records in an incoming FlowFile have been partitioned, the original FlowFile is routed to this relationship.</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>record.count</td><td>The number of records in an outgoing FlowFile</td></tr><tr><td>mime.type</td><td>The MIME Type that the configured Record Writer indicates is appropriate</td></tr><tr><td>&lt;dynamic property name&gt;</td><td>For each dynamic property that is added, an attribute may be added to the FlowFile. See the descr
 iption for Dynamic Properties for more information.</td></tr></table><h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.<h3>See Also:</h3><p><a href="../org.apache.nifi.processors.standard.ConvertRecord/index.html">ConvertRecord</a>, <a href="../org.apache.nifi.processors.standard.SplitRecord/index.html">SplitRecord</a>, <a href="../org.apache.nifi.processors.standard.UpdateRecord/index.html">UpdateRecord</a>, <a href="../org.apache.nifi.processors.standard.QueryRecord/index.html">QueryRecord</a></p></body></html>
\ No newline at end of file

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PostHTTP/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PostHTTP/index.html?rev=1811008&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PostHTTP/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.4.0/org.apache.nifi.processors.standard.PostHTTP/index.html Tue Oct  3 13:30:16 2017
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>PostHTTP</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">PostHTTP</h1><h2>Description: </h2><p>Performs an HTTP Post with the content of the FlowFile</p><h3>Tags: </h3><p>http, https, remote, copy, archive</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values, whether a property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>, and whether a property is considered "sensitive", meaning that its value will be encrypted. Before entering a value in a
  sensitive property, ensure that the <strong>nifi.properties</strong> file has an entry for the property <strong>nifi.sensitive.props.key</strong>.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>URL</strong></td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The URL to POST to. The first part of the URL must be static. However, the path of the URL may be defined using the Attribute Expression Language. For example, https://${hostname} is not valid, but https://1.1.1.1:8080/files/${nf.file.name} is valid.<br/><strong>Supports Expression Language: true</strong></td></tr><tr><td id="name">Max Batch Size</td><td id="default-value">100 MB</td><td id="allowable-values"></td><td id="description">If the Send as FlowFile property is true, specifies the max data size for a batch of FlowFiles to send in a single HTTP POST. If not specified, each FlowFile will be sent s
 eparately. If the Send as FlowFile property is false, this property is ignored</td></tr><tr><td id="name">Max Data to Post per Second</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The maximum amount of data to send per second; this allows the bandwidth to be throttled to a specified data rate; if not specified, the data rate is not throttled</td></tr><tr><td id="name">SSL Context Service</td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>SSLContextService<br/><strong>Implementations: </strong><a href="../../../nifi-ssl-context-service-nar/1.4.0/org.apache.nifi.ssl.StandardSSLContextService/index.html">StandardSSLContextService</a><br/><a href="../../../nifi-ssl-context-service-nar/1.4.0/org.apache.nifi.ssl.StandardRestrictedSSLContextService/index.html">StandardRestrictedSSLContextService</a></td><td id="description">The Controller Service to use in order to obtain an SSL Context</td></tr><tr>
 <td id="name">Username</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Username required to access the URL</td></tr><tr><td id="name">Password</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Password required to access the URL<br/><strong>Sensitive Property: true</strong></td></tr><tr><td id="name"><strong>Send as FlowFile</strong></td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true, will package the FlowFile's contents and attributes together and send the FlowFile Package; otherwise, will send only the FlowFile's content</td></tr><tr><td id="name">Use Chunked Encoding</td><td id="default-value"></td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Specifies whether or not to use Chunked Encoding to send the data. This property is ignored in the event the contents are compressed or sent as Flow
 Files.</td></tr><tr><td id="name"><strong>Compression Level</strong></td><td id="default-value">0</td><td id="allowable-values"></td><td id="description">Determines the GZIP Compression Level to use when sending the file; the value must be in the range of 0-9. A value of 0 indicates that the file will not be GZIP'ed</td></tr><tr><td id="name"><strong>Connection Timeout</strong></td><td id="default-value">30 sec</td><td id="allowable-values"></td><td id="description">How long to wait when attempting to connect to the remote server before giving up</td></tr><tr><td id="name"><strong>Data Timeout</strong></td><td id="default-value">30 sec</td><td id="allowable-values"></td><td id="description">How long to wait between receiving segments of data from the remote server before giving up and discarding the partial file</td></tr><tr><td id="name">Attributes to Send as HTTP Headers (Regex)</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Specifies the Regul
 ar Expression that determines the names of FlowFile attributes that should be sent as HTTP Headers</td></tr><tr><td id="name">User Agent</td><td id="default-value">Apache-HttpClient/4.5.3 (Java/1.8.0_102)</td><td id="allowable-values"></td><td id="description">What to report as the User Agent when we connect to the remote server</td></tr><tr><td id="name">Proxy Host</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The fully qualified hostname or IP address of the proxy server</td></tr><tr><td id="name">Proxy Port</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The port of the proxy server</td></tr><tr><td id="name"><strong>Content-Type</strong></td><td id="default-value">${mime.type}</td><td id="allowable-values"></td><td id="description">The Content-Type to specify for the content of the FlowFile being POSTed if Send as FlowFile is false. In the case of an empty value after evaluating an expression language expr
 ession, Content-Type defaults to application/octet-stream<br/><strong>Supports Expression Language: true</strong></td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>Files that are successfully sent will be transferred to success</td></tr><tr><td>failure</td><td>Files that fail to send will be transferred to failure</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3>None specified.<h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.</body></html>
\ No newline at end of file


