flume-commits mailing list archives

From build...@apache.org
Subject svn commit: r840975 [2/5] - in /websites/staging/flume/trunk/content: ./ .doctrees/ _sources/ releases/
Date Thu, 06 Dec 2012 19:05:40 GMT
Modified: websites/staging/flume/trunk/content/FlumeUserGuide.html
==============================================================================
--- websites/staging/flume/trunk/content/FlumeUserGuide.html (original)
+++ websites/staging/flume/trunk/content/FlumeUserGuide.html Thu Dec  6 19:05:38 2012
@@ -7,7 +7,7 @@
   <head>
     <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
     
-    <title>Flume 1.3.0-SNAPSHOT User Guide &mdash; Apache Flume</title>
+    <title>Flume 1.3.0 User Guide &mdash; Apache Flume</title>
     
     <link rel="stylesheet" href="_static/flume.css" type="text/css" />
     <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
@@ -26,7 +26,7 @@
     <script type="text/javascript" src="_static/doctools.js"></script>
     <link rel="top" title="Apache Flume" href="index.html" />
     <link rel="up" title="Documentation" href="documentation.html" />
-    <link rel="next" title="Flume 1.x Developer Guide" href="FlumeDeveloperGuide.html" />
+    <link rel="next" title="Flume 1.3.0 Developer Guide" href="FlumeDeveloperGuide.html" />
     <link rel="prev" title="Documentation" href="documentation.html" /> 
   </head>
   <body>
@@ -59,8 +59,8 @@
         <div class="bodywrapper">
           <div class="body">
             
-  <div class="section" id="flume-1-3-0-snapshot-user-guide">
-<h1>Flume 1.3.0-SNAPSHOT User Guide<a class="headerlink" href="#flume-1-3-0-snapshot-user-guide" title="Permalink to this headline">¶</a></h1>
+  <div class="section" id="flume-1-3-0-user-guide">
+<h1>Flume 1.3.0 User Guide<a class="headerlink" href="#flume-1-3-0-user-guide" title="Permalink to this headline">¶</a></h1>
 <div class="section" id="introduction">
 <h2>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline">¶</a></h2>
 <div class="section" id="overview">
@@ -95,8 +95,8 @@ recognized by the target Flume source. F
 used to receive Avro events from Avro clients or other Flume agents in the flow
 that send events from an Avro sink. When a Flume source receives an event, it
 stores it into one or more channels. The channel is a passive store that keeps
-the event until it&#8217;s consumed by a Flume sink. The JDBC channel is one example
-&#8211; it uses a filesystem backed embedded database. The sink removes the event
+the event until it&#8217;s consumed by a Flume sink. The file channel is one example
+&#8211; it is backed by the local filesystem. The sink removes the event
 from the channel and puts it into an external repository like HDFS (via Flume
 HDFS sink) or forwards it to the Flume source of the next Flume agent (next
 hop) in the flow. The source and sink within the given agent run asynchronously
@@ -128,7 +128,7 @@ channel of the next hop.</p>
 <div class="section" id="recoverability">
 <h4>Recoverability<a class="headerlink" href="#recoverability" title="Permalink to this headline">¶</a></h4>
 <p>The events are staged in the channel, which manages recovery from failure.
-Flume supports a durable JDBC channel which is backed by a relational database.
+Flume supports a durable file channel which is backed by the local file system.
 There&#8217;s also a memory channel which simply stores the events in an in-memory
 queue, which is faster but any events still left in the memory channel when an
 agent process dies can&#8217;t be recovered.</p>
@@ -160,10 +160,10 @@ be set in the properties file of the hos
 <p>The agent needs to know what individual components to load and how they are
 connected in order to constitute the flow. This is done by listing the names of
 each of the sources, sinks and channels in the agent, and then specifying the
-connecting channel for each sink and source. For example, a agent flows events
-from an Avro source called avroWeb to HDFS sink hdfs-cluster1 via a JDBC
-channel called jdbc-channel. The configuration file will contain names of these
-components and jdbc-channel as a shared channel for both avroWeb source and
+connecting channel for each sink and source. For example, an agent flows events
+from an Avro source called avroWeb to HDFS sink hdfs-cluster1 via a file
+channel called file-channel. The configuration file will contain names of these
+components and file-channel as a shared channel for both avroWeb source and
 hdfs-cluster1 sink.</p>
 </div>
 <div class="section" id="starting-an-agent">
@@ -179,38 +179,44 @@ properties file.</p>
 </div>
 <div class="section" id="a-simple-example">
 <h4>A simple example<a class="headerlink" href="#a-simple-example" title="Permalink to this headline">¶</a></h4>
-<p>Here, we give an example configuration file, describing a single-node Flume deployment. This configuration lets a user generate events and subsequently logs them to the console.</p>
+<p>Here, we give an example configuration file, describing a single-node Flume deployment.
+This configuration lets a user generate events and subsequently logs them to the console.</p>
 <div class="highlight-properties"><div class="highlight"><pre><span class="c"># example.conf: A single-node Flume configuration</span>
 
 <span class="c"># Name the components on this agent</span>
-<span class="na">agent1.sources</span> <span class="o">=</span> <span class="s">source1</span>
-<span class="na">agent1.sinks</span> <span class="o">=</span> <span class="s">sink1</span>
-<span class="na">agent1.channels</span> <span class="o">=</span> <span class="s">channel1</span>
-
-<span class="c"># Describe/configure source1</span>
-<span class="na">agent1.sources.source1.type</span> <span class="o">=</span> <span class="s">netcat</span>
-<span class="na">agent1.sources.source1.bind</span> <span class="o">=</span> <span class="s">localhost</span>
-<span class="na">agent1.sources.source1.port</span> <span class="o">=</span> <span class="s">44444</span>
+<span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+
+<span class="c"># Describe/configure the source</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">netcat</span>
+<span class="na">a1.sources.r1.bind</span> <span class="o">=</span> <span class="s">localhost</span>
+<span class="na">a1.sources.r1.port</span> <span class="o">=</span> <span class="s">44444</span>
 
-<span class="c"># Describe sink1</span>
-<span class="na">agent1.sinks.sink1.type</span> <span class="o">=</span> <span class="s">logger</span>
+<span class="c"># Describe the sink</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">logger</span>
 
 <span class="c"># Use a channel which buffers events in memory</span>
-<span class="na">agent1.channels.channel1.type</span> <span class="o">=</span> <span class="s">memory</span>
-<span class="na">agent1.channels.channel1.capacity</span> <span class="o">=</span> <span class="s">1000</span>
-<span class="na">agent1.channels.channel1.transactionCapactiy</span> <span class="o">=</span> <span class="s">100</span>
+<span class="na">a1.channels.c1.type</span> <span class="o">=</span> <span class="s">memory</span>
+<span class="na">a1.channels.c1.capacity</span> <span class="o">=</span> <span class="s">1000</span>
+<span class="na">a1.channels.c1.transactionCapacity</span> <span class="o">=</span> <span class="s">100</span>
 
 <span class="c"># Bind the source and sink to the channel</span>
-<span class="na">agent1.sources.source1.channels</span> <span class="o">=</span> <span class="s">channel1</span>
-<span class="na">agent1.sinks.sink1.channel</span> <span class="o">=</span> <span class="s">channel1</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
-<p>This configuration defines a single agent, called <em>agent1</em>. <em>agent1</em> has a source that listens for data on port 44444, a channel that buffers event data in memory, and a sink that logs event data to the console. The configuration file names the various components, then describes their types and configuration parameters. A given configuration file might define several named agents; when a given Flume process is launched a flag is passed telling it which named agent to manifest.</p>
+<p>This configuration defines a single agent named a1. a1 has a source that listens for data on port 44444, a channel
+that buffers event data in memory, and a sink that logs event data to the console. The configuration file names the
+various components, then describes their types and configuration parameters. A given configuration file might define
+several named agents; when a given Flume process is launched a flag is passed telling it which named agent to manifest.</p>
 <p>Given this configuration file, we can start Flume as follows:</p>
-<div class="highlight-none"><div class="highlight"><pre>$ bin/flume-ng agent --conf-file example.conf --name agent1 -Dflume.root.logger=INFO,console
+<div class="highlight-none"><div class="highlight"><pre>$ bin/flume-ng agent --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console
 </pre></div>
 </div>
-<p>Note that in a full deployment we would typically include one more option: <tt class="docutils literal"><span class="pre">--conf=&lt;conf-dir&gt;</span></tt>. The <tt class="docutils literal"><span class="pre">&lt;conf-dir&gt;</span></tt> directory would include a shell script <em>flume-env.sh</em> and potentially a log4j properties file. In this example, we pass a Java option to force Flume to log to the console and we go without a custom environment script.</p>
+<p>Note that in a full deployment we would typically include one more option: <tt class="docutils literal"><span class="pre">--conf=&lt;conf-dir&gt;</span></tt>.
+The <tt class="docutils literal"><span class="pre">&lt;conf-dir&gt;</span></tt> directory would include a shell script <em>flume-env.sh</em> and potentially a log4j properties file.
+In this example, we pass a Java option to force Flume to log to the console and we go without a custom environment script.</p>
 <p>From a separate terminal, we can then telnet port 44444 and send Flume an event:</p>
 <div class="highlight-properties"><pre>$ telnet localhost 44444
 Trying 127.0.0.1...
@@ -411,15 +417,15 @@ config to do that:</p>
 <div class="highlight-properties"><div class="highlight"><pre><span class="c"># list the sources, sinks and channels in the agent</span>
 <span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">avro-AppSrv-source1 exec-tail-source2</span>
 <span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">hdfs-Cluster1-sink1 avro-forward-sink2</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">mem-channel-1 jdbc-channel-2</span>
+<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">mem-channel-1 file-channel-2</span>
 
 <span class="c"># flow #1 configuration</span>
 <span class="na">agent_foo.sources.avro-AppSrv-source1.channels</span> <span class="o">=</span> <span class="s">mem-channel-1</span>
 <span class="na">agent_foo.sinks.hdfs-Cluster1-sink1.channel</span> <span class="o">=</span> <span class="s">mem-channel-1</span>
 
 <span class="c"># flow #2 configuration</span>
-<span class="na">agent_foo.sources.exec-tail-source2.channels</span> <span class="o">=</span> <span class="s">jdbc-channel-2</span>
-<span class="na">agent_foo.sinks.avro-forward-sink2.channel</span> <span class="o">=</span> <span class="s">jdbc-channel-2</span>
+<span class="na">agent_foo.sources.exec-tail-source2.channels</span> <span class="o">=</span> <span class="s">file-channel-2</span>
+<span class="na">agent_foo.sinks.avro-forward-sink2.channel</span> <span class="o">=</span> <span class="s">file-channel-2</span>
 </pre></div>
 </div>
 </div>
@@ -435,11 +441,11 @@ mounted for storage.</p>
 <div class="highlight-properties"><div class="highlight"><pre><span class="c"># list sources, sinks and channels in the agent</span>
 <span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">avro-AppSrv-source</span>
 <span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">avro-forward-sink</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">jdbc-channel</span>
+<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">file-channel</span>
 
 <span class="c"># define the flow</span>
-<span class="na">agent_foo.sources.avro-AppSrv-source.channels</span> <span class="o">=</span> <span class="s">jdbc-channel</span>
-<span class="na">agent_foo.sinks.avro-forward-sink.channel</span> <span class="o">=</span> <span class="s">jdbc-channel</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source.channels</span> <span class="o">=</span> <span class="s">file-channel</span>
+<span class="na">agent_foo.sinks.avro-forward-sink.channel</span> <span class="o">=</span> <span class="s">file-channel</span>
 
 <span class="c"># avro sink properties</span>
 <span class="na">agent_foo.sinks.avro-forward-sink.type</span> <span class="o">=</span> <span class="s">avro</span>
@@ -523,28 +529,59 @@ agent named agent_foo has a single avro 
 <div class="highlight-properties"><div class="highlight"><pre><span class="c"># list the sources, sinks and channels in the agent</span>
 <span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">avro-AppSrv-source1</span>
 <span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">hdfs-Cluster1-sink1 avro-forward-sink2</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">mem-channel-1 jdbc-channel-2</span>
+<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">mem-channel-1 file-channel-2</span>
 
 <span class="c"># set channels for source</span>
-<span class="na">agent_foo.sources.avro-AppSrv-source1.channels</span> <span class="o">=</span> <span class="s">mem-channel-1 jdbc-channel-2</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.channels</span> <span class="o">=</span> <span class="s">mem-channel-1 file-channel-2</span>
 
 <span class="c"># set channel for sinks</span>
 <span class="na">agent_foo.sinks.hdfs-Cluster1-sink1.channel</span> <span class="o">=</span> <span class="s">mem-channel-1</span>
-<span class="na">agent_foo.sinks.avro-forward-sink2.channel</span> <span class="o">=</span> <span class="s">jdbc-channel-2</span>
+<span class="na">agent_foo.sinks.avro-forward-sink2.channel</span> <span class="o">=</span> <span class="s">file-channel-2</span>
 
 <span class="c"># channel selector configuration</span>
 <span class="na">agent_foo.sources.avro-AppSrv-source1.selector.type</span> <span class="o">=</span> <span class="s">multiplexing</span>
 <span class="na">agent_foo.sources.avro-AppSrv-source1.selector.header</span> <span class="o">=</span> <span class="s">State</span>
 <span class="na">agent_foo.sources.avro-AppSrv-source1.selector.mapping.CA</span> <span class="o">=</span> <span class="s">mem-channel-1</span>
-<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ</span> <span class="o">=</span> <span class="s">jdbc-channel-2</span>
-<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY</span> <span class="o">=</span> <span class="s">mem-channel-1 jdbc-channel-2</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ</span> <span class="o">=</span> <span class="s">file-channel-2</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY</span> <span class="o">=</span> <span class="s">mem-channel-1 file-channel-2</span>
 <span class="na">agent_foo.sources.avro-AppSrv-source1.selector.default</span> <span class="o">=</span> <span class="s">mem-channel-1</span>
 </pre></div>
 </div>
 <p>The selector checks for a header called &#8220;State&#8221;. If the value is &#8220;CA&#8221; then it&#8217;s
-sent to mem-channel-1, if its &#8220;AZ&#8221; then it goes to jdbc-channel-2 or if its
+sent to mem-channel-1, if it&#8217;s &#8220;AZ&#8221; then it goes to file-channel-2 or if it&#8217;s
 &#8220;NY&#8221; then both. If the &#8220;State&#8221; header is not set or doesn&#8217;t match any of the
 three, then it goes to mem-channel-1 which is designated as &#8216;default&#8217;.</p>
+<p>The selector also supports optional channels. To specify optional channels for
+a header, the config parameter &#8216;optional&#8217; is used in the following way:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="c"># channel selector configuration</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.type</span> <span class="o">=</span> <span class="s">multiplexing</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.header</span> <span class="o">=</span> <span class="s">State</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.mapping.CA</span> <span class="o">=</span> <span class="s">mem-channel-1</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ</span> <span class="o">=</span> <span class="s">file-channel-2</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.mapping.NY</span> <span class="o">=</span> <span class="s">mem-channel-1 file-channel-2</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.optional.CA</span> <span class="o">=</span> <span class="s">mem-channel-1 file-channel-2</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.mapping.AZ</span> <span class="o">=</span> <span class="s">file-channel-2</span>
+<span class="na">agent_foo.sources.avro-AppSrv-source1.selector.default</span> <span class="o">=</span> <span class="s">mem-channel-1</span>
+</pre></div>
+</div>
+<p>The selector will attempt to write to the required channels first and will fail
+the transaction if even one of these channels fails to consume the events. The
+transaction is reattempted on <strong>all</strong> of the channels. Once all required
+channels have consumed the events, then the selector will attempt to write to
+the optional channels. A failure by any of the optional channels to consume the
+event is simply ignored and not retried.</p>
+<p>If there is an overlap between the optional channels and required channels for a
+specific header, the channel is considered to be required, and a failure in the
+channel will cause the entire set of required channels to be retried. For
+instance, in the above example, for the header &#8220;CA&#8221; mem-channel-1 is considered
+to be a required channel even though it is marked both as required and optional,
+and a failure to write to this channel will cause that
+event to be retried on <strong>all</strong> channels configured for the selector.</p>
+<p>Note that if a header does not have any required channels, then the event will
+be written to the default channels, and the selector will also attempt to write
+it to the optional channels for that header. Specifying optional channels will
+still cause the event to be written to the default channels if no required
+channels are specified.</p>
 </div>
 <div class="section" id="flume-sources">
 <h3>Flume Sources<a class="headerlink" href="#flume-sources" title="Permalink to this headline">¶</a></h3>
@@ -587,6 +624,14 @@ Required properties are in <strong>bold<
 <td>&#8211;</td>
 <td>Maximum number of worker threads to spawn</td>
 </tr>
+<tr class="row-odd"><td>selector.type</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+</tr>
+<tr class="row-even"><td>selector.*</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+</tr>
 <tr class="row-odd"><td>interceptors</td>
 <td>&#8211;</td>
 <td>Space separated list of interceptors</td>
@@ -597,13 +642,13 @@ Required properties are in <strong>bold<
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">avrosource-1</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.avrosource-1.type</span> <span class="o">=</span> <span class="s">avro</span>
-<span class="na">agent_foo.sources.avrosource-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.avrosource-1.bind</span> <span class="o">=</span> <span class="s">0.0.0.0</span>
-<span class="na">agent_foo.sources.avrosource-1.port</span> <span class="o">=</span> <span class="s">4141</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">avro</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.bind</span> <span class="o">=</span> <span class="s">0.0.0.0</span>
+<span class="na">a1.sources.r1.port</span> <span class="o">=</span> <span class="s">4141</span>
 </pre></div>
 </div>
 </div>
@@ -692,7 +737,9 @@ this doesn&#8217;t make sense, you need 
 never guarantee data has been received when using a unidirectional
 asynchronous interface such as ExecSource! As an extension of this
 warning - and to be completely clear - there is absolutely zero guarantee
-of event delivery when using this source. You have been warned.</p>
+of event delivery when using this source. For stronger reliability
+guarantees, consider the Spooling Directory Source or direct integration
+with Flume via the SDK.</p>
 </div>
 <div class="admonition note">
 <p class="first admonition-title">Note</p>
@@ -700,12 +747,112 @@ of event delivery when using this source
 Just use unix command <tt class="docutils literal"><span class="pre">tail</span> <span class="pre">-F</span> <span class="pre">/full/path/to/your/file</span></tt>. Parameter
 -F is better in this case than -f as it will also follow file rotation.</p>
 </div>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">tailsource-1</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.tailsource-1.type</span> <span class="o">=</span> <span class="s">exec</span>
-<span class="na">agent_foo.sources.tailsource-1.command</span> <span class="o">=</span> <span class="s">tail -F /var/log/secure</span>
-<span class="na">agent_foo.sources.tailsource-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">exec</span>
+<span class="na">a1.sources.r1.command</span> <span class="o">=</span> <span class="s">tail -F /var/log/secure</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+</pre></div>
+</div>
+</div>
+<div class="section" id="spooling-directory-source">
+<h4>Spooling Directory Source<a class="headerlink" href="#spooling-directory-source" title="Permalink to this headline">¶</a></h4>
+<p>This source lets you ingest data by dropping files in a spooling directory on
+disk. <strong>Unlike other asynchronous sources, this source
+avoids data loss even if Flume is restarted or fails.</strong>
+Flume will watch the directory for new files, then read and ingest them
+as they appear. After a given file has been fully read into the channel,
+it is renamed to indicate completion. This allows a cleaner process to remove
+completed files periodically. Note, however,
+that events may be duplicated if failures occur, consistent with the semantics
+offered by other Flume components. The source optionally inserts the full path of
+the origin file into a header field of each event. This source buffers file data
+in memory during reads; be sure to set the <cite>bufferMaxLineLength</cite> option to a number
+greater than the longest line you expect to see in your input data.</p>
+<div class="admonition warning">
+<p class="first admonition-title">Warning</p>
+<p class="last">This source expects that only immutable, uniquely named files
+are dropped in the spooling directory. If duplicate names are
+used, or files are modified while being read, the source will
+fail with an error message. For some use cases this may require
+adding unique identifiers (such as a timestamp) to log file names
+when they are copied into the spooling directory.</p>
+</div>
+<table border="1" class="docutils">
+<colgroup>
+<col width="22%" />
+<col width="15%" />
+<col width="63%" />
+</colgroup>
+<thead valign="bottom">
+<tr class="row-odd"><th class="head">Property Name</th>
+<th class="head">Default</th>
+<th class="head">Description</th>
+</tr>
+</thead>
+<tbody valign="top">
+<tr class="row-even"><td><strong>channels</strong></td>
+<td>&#8211;</td>
+<td>&nbsp;</td>
+</tr>
+<tr class="row-odd"><td><strong>type</strong></td>
+<td>&#8211;</td>
+<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">spooldir</span></tt></td>
+</tr>
+<tr class="row-even"><td><strong>spoolDir</strong></td>
+<td>&#8211;</td>
+<td>The directory where log files will be spooled</td>
+</tr>
+<tr class="row-odd"><td>fileSuffix</td>
+<td>.COMPLETED</td>
+<td>Suffix to append to completely ingested files</td>
+</tr>
+<tr class="row-even"><td>fileHeader</td>
+<td>false</td>
+<td>Whether to add a header storing the filename</td>
+</tr>
+<tr class="row-odd"><td>fileHeaderKey</td>
+<td>file</td>
+<td>Header key to use when appending filename to header</td>
+</tr>
+<tr class="row-even"><td>batchSize</td>
+<td>10</td>
+<td>Granularity at which to batch transfer to the channel</td>
+</tr>
+<tr class="row-odd"><td>bufferMaxLines</td>
+<td>100</td>
+<td>Maximum number of lines the commit buffer can hold</td>
+</tr>
+<tr class="row-even"><td>bufferMaxLineLength</td>
+<td>5000</td>
+<td>Maximum length of a line in the commit buffer</td>
+</tr>
+<tr class="row-odd"><td>selector.type</td>
+<td>replicating</td>
+<td>replicating or multiplexing</td>
+</tr>
+<tr class="row-even"><td>selector.*</td>
+<td>&nbsp;</td>
+<td>Depends on the selector.type value</td>
+</tr>
+<tr class="row-odd"><td>interceptors</td>
+<td>&#8211;</td>
+<td>Space separated list of interceptors</td>
+</tr>
+<tr class="row-even"><td>interceptors.*</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+</tr>
+</tbody>
+</table>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">spooldir</span>
+<span class="na">a1.sources.r1.spoolDir</span> <span class="o">=</span> <span class="s">/var/log/apache/flumeSpool</span>
+<span class="na">a1.sources.r1.fileHeader</span> <span class="o">=</span> <span class="s">true</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -750,31 +897,35 @@ Flume event and sent via the connected c
 <td>512</td>
 <td>Max line length per event body (in bytes)</td>
 </tr>
-<tr class="row-odd"><td>selector.type</td>
+<tr class="row-odd"><td>ack-every-event</td>
+<td>true</td>
+<td>Respond with an &#8220;OK&#8221; for every event received</td>
+</tr>
+<tr class="row-even"><td>selector.type</td>
 <td>replicating</td>
 <td>replicating or multiplexing</td>
 </tr>
-<tr class="row-even"><td>selector.*</td>
+<tr class="row-odd"><td>selector.*</td>
 <td>&nbsp;</td>
 <td>Depends on the selector.type value</td>
 </tr>
-<tr class="row-odd"><td>interceptors</td>
+<tr class="row-even"><td>interceptors</td>
 <td>&#8211;</td>
 <td>Space separated list of interceptors</td>
 </tr>
-<tr class="row-even"><td>interceptors.*</td>
+<tr class="row-odd"><td>interceptors.*</td>
 <td>&nbsp;</td>
 <td>&nbsp;</td>
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">ncsource-1</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.ncsource-1.type</span> <span class="o">=</span> <span class="s">netcat</span>
-<span class="na">agent_foo.sources.ncsource-1.bind</span> <span class="o">=</span> <span class="s">0.0.0.0</span>
-<span class="na">agent_foo.sources.ncsource-1.bind</span> <span class="o">=</span> <span class="s">6666</span>
-<span class="na">agent_foo.sources.ncsource-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">netcat</span>
+<span class="na">a1.sources.r1.bind</span> <span class="o">=</span> <span class="s">0.0.0.0</span>
+<span class="na">a1.sources.r1.port</span> <span class="o">=</span> <span class="s">6666</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -820,24 +971,29 @@ Required properties are in <strong>bold<
 <td>&nbsp;</td>
 <td>&nbsp;</td>
 </tr>
+<tr class="row-even"><td>batchSize</td>
+<td>1</td>
+<td>&nbsp;</td>
+</tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">ncsource-1</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.ncsource-1.type</span> <span class="o">=</span> <span class="s">seq</span>
-<span class="na">agent_foo.sources.ncsource-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">seq</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
 <div class="section" id="syslog-sources">
 <h4>Syslog Sources<a class="headerlink" href="#syslog-sources" title="Permalink to this headline">¶</a></h4>
 <p>Reads syslog data and generates Flume events. The UDP source treats an entire
-message as a single event. The TCP source on creates a new event for a string
-of characters separated by carriage return (&#8216;n&#8217;).</p>
+message as a single event. The TCP sources create a new event for each string
+of characters separated by a newline (&#8216;\n&#8217;).</p>
 <p>Required properties are in <strong>bold</strong>.</p>
 <div class="section" id="syslog-tcp-source">
 <h5>Syslog TCP Source<a class="headerlink" href="#syslog-tcp-source" title="Permalink to this headline">¶</a></h5>
+<p>The original, tried-and-true syslog TCP source.</p>
 <table border="1" class="docutils">
 <colgroup>
 <col width="19%" />
@@ -869,7 +1025,7 @@ of characters separated by carriage retu
 </tr>
 <tr class="row-even"><td>eventSize</td>
 <td>2500</td>
-<td>&nbsp;</td>
+<td>Maximum size of a single event line, in bytes</td>
 </tr>
 <tr class="row-odd"><td>selector.type</td>
 <td>&nbsp;</td>
@@ -889,13 +1045,108 @@ of characters separated by carriage retu
 </tr>
 </tbody>
 </table>
-<p>For example, a syslog TCP source for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">syslogsource-1</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.syslogsource-1.type</span> <span class="o">=</span> <span class="s">syslogtcp</span>
-<span class="na">agent_foo.sources.syslogsource-1.port</span> <span class="o">=</span> <span class="s">5140</span>
-<span class="na">agent_foo.sources.syslogsource-1.host</span> <span class="o">=</span> <span class="s">localhost</span>
-<span class="na">agent_foo.sources.syslogsource-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>For example, a syslog TCP source for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">syslogtcp</span>
+<span class="na">a1.sources.r1.port</span> <span class="o">=</span> <span class="s">5140</span>
+<span class="na">a1.sources.r1.host</span> <span class="o">=</span> <span class="s">localhost</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+</pre></div>
+</div>
+</div>
+<div class="section" id="multiport-syslog-tcp-source">
+<h5>Multiport Syslog TCP Source<a class="headerlink" href="#multiport-syslog-tcp-source" title="Permalink to this headline">¶</a></h5>
+<p>This is a newer, faster, multi-port capable version of the Syslog TCP source.
+Note that the <tt class="docutils literal"><span class="pre">ports</span></tt> configuration setting has replaced <tt class="docutils literal"><span class="pre">port</span></tt>.
+Multi-port capability means that it can listen on many ports at once in an
+efficient manner. This source uses the Apache Mina library to do that.
+Provides support for RFC-3164 and many common RFC-5424 formatted messages.
+Also provides the capability to configure the character set used on a per-port
+basis.</p>
+<table border="1" class="docutils">
+<colgroup>
+<col width="7%" />
+<col width="6%" />
+<col width="87%" />
+</colgroup>
+<thead valign="bottom">
+<tr class="row-odd"><th class="head">Property Name</th>
+<th class="head">Default</th>
+<th class="head">Description</th>
+</tr>
+</thead>
+<tbody valign="top">
+<tr class="row-even"><td><strong>channels</strong></td>
+<td>&#8211;</td>
+<td>&nbsp;</td>
+</tr>
+<tr class="row-odd"><td><strong>type</strong></td>
+<td>&#8211;</td>
+<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">multiport_syslogtcp</span></tt></td>
+</tr>
+<tr class="row-even"><td><strong>host</strong></td>
+<td>&#8211;</td>
+<td>Host name or IP address to bind to.</td>
+</tr>
+<tr class="row-odd"><td><strong>ports</strong></td>
+<td>&#8211;</td>
+<td>Space-separated list (one or more) of ports to bind to.</td>
+</tr>
+<tr class="row-even"><td>eventSize</td>
+<td>2500</td>
+<td>Maximum size of a single event line, in bytes.</td>
+</tr>
+<tr class="row-odd"><td>portHeader</td>
+<td>&#8211;</td>
+<td>If specified, the port number will be stored in the header of each event using the header name specified here. This allows for interceptors and channel selectors to customize routing logic based on the incoming port.</td>
+</tr>
+<tr class="row-even"><td>charset.default</td>
+<td>UTF-8</td>
+<td>Default character set used while parsing syslog events into strings.</td>
+</tr>
+<tr class="row-odd"><td>charset.port.&lt;port&gt;</td>
+<td>&#8211;</td>
+<td>Character set is configurable on a per-port basis.</td>
+</tr>
+<tr class="row-even"><td>batchSize</td>
+<td>100</td>
+<td>Maximum number of events to attempt to process per request loop. Using the default is usually fine.</td>
+</tr>
+<tr class="row-odd"><td>readBufferSize</td>
+<td>1024</td>
+<td>Size of the internal Mina read buffer. Provided for performance tuning. Using the default is usually fine.</td>
+</tr>
+<tr class="row-even"><td>numProcessors</td>
+<td>(auto-detected)</td>
+<td>Number of processors available on the system for use while processing messages. Default is to auto-detect # of CPUs using the Java Runtime API. Mina will spawn 2 request-processing threads per detected CPU, which is often reasonable.</td>
+</tr>
+<tr class="row-odd"><td>selector.type</td>
+<td>replicating</td>
+<td>replicating, multiplexing, or custom</td>
+</tr>
+<tr class="row-even"><td>selector.*</td>
+<td>&#8211;</td>
+<td>Depends on the <tt class="docutils literal"><span class="pre">selector.type</span></tt> value</td>
+</tr>
+<tr class="row-odd"><td>interceptors</td>
+<td>&#8211;</td>
+<td>Space separated list of interceptors.</td>
+</tr>
+<tr class="row-even"><td>interceptors.*</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+</tr>
+</tbody>
+</table>
+<p>For example, a multiport syslog TCP source for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">multiport_syslogtcp</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.host</span> <span class="o">=</span> <span class="s">0.0.0.0</span>
+<span class="na">a1.sources.r1.ports</span> <span class="o">=</span> <span class="s">10001 10002 10003</span>
+<span class="na">a1.sources.r1.portHeader</span> <span class="o">=</span> <span class="s">port</span>
 </pre></div>
 </div>
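+<p>As a hedged sketch of the port-based routing that <tt class="docutils literal"><span class="pre">portHeader</span></tt> makes possible, the variant below sends events from each port to a different channel. The <tt class="docutils literal"><span class="pre">selector.header</span></tt>, <tt class="docutils literal"><span class="pre">selector.mapping.*</span></tt> and <tt class="docutils literal"><span class="pre">selector.default</span></tt> keys belong to the multiplexing channel selector and are not defined in the table above; the second channel <tt class="docutils literal"><span class="pre">c2</span></tt> and the ISO-8859-1 character set are assumptions chosen for illustration:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1 c2</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">multiport_syslogtcp</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1 c2</span>
+<span class="na">a1.sources.r1.host</span> <span class="o">=</span> <span class="s">0.0.0.0</span>
+<span class="na">a1.sources.r1.ports</span> <span class="o">=</span> <span class="s">10001 10002</span>
+<span class="na">a1.sources.r1.portHeader</span> <span class="o">=</span> <span class="s">port</span>
+<span class="na">a1.sources.r1.charset.port.10002</span> <span class="o">=</span> <span class="s">ISO-8859-1</span>
+<span class="na">a1.sources.r1.selector.type</span> <span class="o">=</span> <span class="s">multiplexing</span>
+<span class="na">a1.sources.r1.selector.header</span> <span class="o">=</span> <span class="s">port</span>
+<span class="na">a1.sources.r1.selector.mapping.10001</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.selector.mapping.10002</span> <span class="o">=</span> <span class="s">c2</span>
+<span class="na">a1.sources.r1.selector.default</span> <span class="o">=</span> <span class="s">c1</span>
+</pre></div>
+</div>
+<p>Here the selector reads the same header name that <tt class="docutils literal"><span class="pre">portHeader</span></tt> writes, so the two values must match.</p>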
 </div>
@@ -948,13 +1199,120 @@ of characters separated by carriage retu
 </tr>
 </tbody>
 </table>
-<p>For example, a syslog UDP source for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">syslogsource-1</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.syslogsource-1.type</span> <span class="o">=</span> <span class="s">syslogudp</span>
-<span class="na">agent_foo.sources.syslogsource-1.port</span> <span class="o">=</span> <span class="s">5140</span>
-<span class="na">agent_foo.sources.syslogsource-1.host</span> <span class="o">=</span> <span class="s">localhost</span>
-<span class="na">agent_foo.sources.syslogsource-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>For example, a syslog UDP source for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">syslogudp</span>
+<span class="na">a1.sources.r1.port</span> <span class="o">=</span> <span class="s">5140</span>
+<span class="na">a1.sources.r1.host</span> <span class="o">=</span> <span class="s">localhost</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+</pre></div>
+</div>
+</div>
+</div>
+<div class="section" id="http-source">
+<h4>HTTP Source<a class="headerlink" href="#http-source" title="Permalink to this headline">¶</a></h4>
+<p>A source which accepts Flume events by HTTP POST and GET. GET should be used
+for experimentation only. HTTP requests are converted into Flume events by
+a pluggable &#8220;handler&#8221; which must implement the HTTPSourceHandler interface.
+This handler takes an HttpServletRequest and returns a list of
+Flume events. All events handled from one HTTP request are committed to the channel
+in one transaction, thus allowing for increased efficiency on channels like
+the file channel. If the handler throws an exception, this source will
+return an HTTP status of 400. If the channel is full, or the source is unable to
+append events to the channel, the source will return an HTTP 503 (Temporarily
+Unavailable) status.</p>
+<p>All events sent in one POST request are considered to be one batch and
+inserted into the channel in one transaction.</p>
+<table border="1" class="docutils">
+<colgroup>
+<col width="11%" />
+<col width="34%" />
+<col width="54%" />
+</colgroup>
+<thead valign="bottom">
+<tr class="row-odd"><th class="head">Property Name</th>
+<th class="head">Default</th>
+<th class="head">Description</th>
+</tr>
+</thead>
+<tbody valign="top">
+<tr class="row-even"><td><strong>type</strong></td>
+<td>&nbsp;</td>
+<td>The FQCN of this class:  <tt class="docutils literal"><span class="pre">org.apache.flume.source.http.HTTPSource</span></tt></td>
+</tr>
+<tr class="row-odd"><td><strong>port</strong></td>
+<td>&#8211;</td>
+<td>The port the source should bind to.</td>
+</tr>
+<tr class="row-even"><td>handler</td>
+<td><tt class="docutils literal"><span class="pre">org.apache.flume.source.http.JSONHandler</span></tt></td>
+<td>The FQCN of the handler class.</td>
+</tr>
+<tr class="row-odd"><td>handler.*</td>
+<td>&#8211;</td>
+<td>Config parameters for the handler</td>
+</tr>
+<tr class="row-even"><td>selector.type</td>
+<td>replicating</td>
+<td>replicating or multiplexing</td>
+</tr>
+<tr class="row-odd"><td>selector.*</td>
+<td>&nbsp;</td>
+<td>Depends on the selector.type value</td>
+</tr>
+<tr class="row-even"><td>interceptors</td>
+<td>&#8211;</td>
+<td>Space separated list of interceptors</td>
+</tr>
+<tr class="row-odd"><td colspan="3">interceptors.*</td>
+</tr>
+</tbody>
+</table>
+<p>For example, an HTTP source for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.source.http.HTTPSource</span>
+<span class="na">a1.sources.r1.port</span> <span class="o">=</span> <span class="s">5140</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.handler</span> <span class="o">=</span> <span class="s">org.example.rest.RestHandler</span>
+<span class="na">a1.sources.r1.handler.nickname</span> <span class="o">=</span> <span class="s">random props</span>
+</pre></div>
+</div>
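+<p>A minimal variant, assuming the bundled JSON handler is sufficient: when <tt class="docutils literal"><span class="pre">handler</span></tt> is omitted, the default listed in the table above applies, so no handler class or <tt class="docutils literal"><span class="pre">handler.*</span></tt> parameters need to be set. The port value is only illustrative:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.source.http.HTTPSource</span>
+<span class="na">a1.sources.r1.port</span> <span class="o">=</span> <span class="s">5140</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+</pre></div>
+</div>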
+<div class="section" id="jsonhandler">
+<h5>JSONHandler<a class="headerlink" href="#jsonhandler" title="Permalink to this headline">¶</a></h5>
+<p>A handler is provided out of the box which can handle events represented in
+JSON format, and supports the UTF-8, UTF-16 and UTF-32 character sets. The handler
+accepts an array of events (even if there is only one event, the event has to be
+sent in an array) and converts them to Flume events based on the
+encoding specified in the request. If no encoding is specified, UTF-8 is assumed.
+Events are represented as follows:</p>
+<div class="highlight-javascript"><div class="highlight"><pre><span class="p">[{</span>
+  <span class="s2">&quot;headers&quot;</span> <span class="o">:</span> <span class="p">{</span>
+             <span class="s2">&quot;timestamp&quot;</span> <span class="o">:</span> <span class="s2">&quot;434324343&quot;</span><span class="p">,</span>
+             <span class="s2">&quot;host&quot;</span> <span class="o">:</span> <span class="s2">&quot;random_host.example.com&quot;</span>
+             <span class="p">},</span>
+  <span class="s2">&quot;body&quot;</span> <span class="o">:</span> <span class="s2">&quot;random_body&quot;</span>
+  <span class="p">},</span>
+  <span class="p">{</span>
+  <span class="s2">&quot;headers&quot;</span> <span class="o">:</span> <span class="p">{</span>
+             <span class="s2">&quot;namenode&quot;</span> <span class="o">:</span> <span class="s2">&quot;namenode.example.com&quot;</span><span class="p">,</span>
+             <span class="s2">&quot;datanode&quot;</span> <span class="o">:</span> <span class="s2">&quot;random_datanode.example.com&quot;</span>
+             <span class="p">},</span>
+  <span class="s2">&quot;body&quot;</span> <span class="o">:</span> <span class="s2">&quot;really_random_body&quot;</span>
+  <span class="p">}]</span>
+</pre></div>
+</div>
+<p>To set the charset, the request must have content type specified as
+<tt class="docutils literal"><span class="pre">application/json;</span> <span class="pre">charset=UTF-8</span></tt> (replace UTF-8 with UTF-16 or UTF-32 as
+required).</p>
+<p>One way to create an event in the format expected by this handler is to
+use JSONEvent provided in the Flume SDK and use Google Gson to create the JSON
+string using the Gson#toJson(Object, Type)
+method. The type token to pass as the 2nd argument of this method
+for a list of events can be created by:</p>
+<div class="highlight-java"><div class="highlight"><pre><span class="n">Type</span> <span class="n">type</span> <span class="o">=</span> <span class="k">new</span> <span class="n">TypeToken</span><span class="o">&lt;</span><span class="n">List</span><span class="o">&lt;</span><span class="n">JSONEvent</span><span class="o">&gt;&gt;()</span> <span class="o">{}.</span><span class="na">getType</span><span class="o">();</span>
 </pre></div>
 </div>
 </div>
@@ -1028,13 +1386,13 @@ channel by the legacy source.</p>
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">legacysource-1</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.legacysource-1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.source.avroLegacy.AvroLegacySource</span>
-<span class="na">agent_foo.sources.legacysource-1.host</span> <span class="o">=</span> <span class="s">0.0.0.0</span>
-<span class="na">agent_foo.sources.legacysource-1.bind</span> <span class="o">=</span> <span class="s">6666</span>
-<span class="na">agent_foo.sources.legacysource-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.source.avroLegacy.AvroLegacySource</span>
+<span class="na">a1.sources.r1.host</span> <span class="o">=</span> <span class="s">0.0.0.0</span>
+<span class="na">a1.sources.r1.port</span> <span class="o">=</span> <span class="s">6666</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -1043,8 +1401,8 @@ channel by the legacy source.</p>
 <table border="1" class="docutils">
 <colgroup>
 <col width="12%" />
-<col width="10%" />
-<col width="78%" />
+<col width="9%" />
+<col width="79%" />
 </colgroup>
 <thead valign="bottom">
 <tr class="row-odd"><th class="head">Property Name</th>
@@ -1059,7 +1417,7 @@ channel by the legacy source.</p>
 </tr>
 <tr class="row-odd"><td><strong>type</strong></td>
 <td>&#8211;</td>
-<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">org.apache.source.thriftLegacy.ThriftLegacySource</span></tt></td>
+<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">org.apache.flume.source.thriftLegacy.ThriftLegacySource</span></tt></td>
 </tr>
 <tr class="row-even"><td><strong>host</strong></td>
 <td>&#8211;</td>
@@ -1087,13 +1445,13 @@ channel by the legacy source.</p>
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">legacysource-1</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.legacysource-1.type</span> <span class="o">=</span> <span class="s">org.apache.source.thriftLegacy.ThriftLegacySource</span>
-<span class="na">agent_foo.sources.legacysource-1.host</span> <span class="o">=</span> <span class="s">0.0.0.0</span>
-<span class="na">agent_foo.sources.legacysource-1.bind</span> <span class="o">=</span> <span class="s">6666</span>
-<span class="na">agent_foo.sources.legacysource-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.source.thriftLegacy.ThriftLegacySource</span>
+<span class="na">a1.sources.r1.host</span> <span class="o">=</span> <span class="s">0.0.0.0</span>
+<span class="na">a1.sources.r1.port</span> <span class="o">=</span> <span class="s">6666</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -1126,7 +1484,7 @@ when starting the Flume agent. The type 
 </tr>
 <tr class="row-even"><td>selector.type</td>
 <td>&nbsp;</td>
-<td>replicating or multiplexing</td>
+<td><tt class="docutils literal"><span class="pre">replicating</span></tt> or <tt class="docutils literal"><span class="pre">multiplexing</span></tt></td>
 </tr>
 <tr class="row-odd"><td>selector.*</td>
 <td>replicating</td>
@@ -1142,11 +1500,11 @@ when starting the Flume agent. The type 
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">legacysource-1</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.legacysource-1.type</span> <span class="o">=</span> <span class="s">your.namespace.YourClass</span>
-<span class="na">agent_foo.sources.legacysource-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">org.example.MySource</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -1181,15 +1539,23 @@ Required properties are in <strong>bold<
 <td>5</td>
 <td>Number of handling threads in Thrift</td>
 </tr>
+<tr class="row-odd"><td>selector.type</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+</tr>
+<tr class="row-even"><td>selector.*</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+</tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.sources</span> <span class="o">=</span> <span class="s">scribesource-1</span>
-<span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sources.scribesource-1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.source.scribe.ScribeSource</span>
-<span class="na">agent_foo.sources.scribesource-1.port</span> <span class="o">=</span> <span class="s">1463</span>
-<span class="na">agent_foo.sources.scribesource-1.workerThreads</span> <span class="o">=</span> <span class="s">5</span>
-<span class="na">agent_foo.sources.scribesource-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.sources</span> <span class="o">=</span> <span class="s">r1</span>
+<span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sources.r1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.source.scribe.ScribeSource</span>
+<span class="na">a1.sources.r1.port</span> <span class="o">=</span> <span class="s">1463</span>
+<span class="na">a1.sources.r1.workerThreads</span> <span class="o">=</span> <span class="s">5</span>
+<span class="na">a1.sources.r1.channels</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -1325,53 +1691,55 @@ this automatically is to use the Timesta
 <td>FlumeData</td>
 <td>Name prefixed to files created by Flume in hdfs directory</td>
 </tr>
-<tr class="row-even"><td>hdfs.rollInterval</td>
+<tr class="row-even"><td>hdfs.fileSuffix</td>
+<td>&#8211;</td>
+<td>Suffix to append to file (e.g. <tt class="docutils literal"><span class="pre">.avro</span></tt> - <em>NOTE: period is not automatically added</em>)</td>
+</tr>
+<tr class="row-odd"><td>hdfs.rollInterval</td>
 <td>30</td>
 <td>Number of seconds to wait before rolling current file
 (0 = never roll based on time interval)</td>
 </tr>
-<tr class="row-odd"><td>hdfs.rollSize</td>
+<tr class="row-even"><td>hdfs.rollSize</td>
 <td>1024</td>
 <td>File size to trigger roll, in bytes (0: never roll based on file size)</td>
 </tr>
-<tr class="row-even"><td>hdfs.rollCount</td>
+<tr class="row-odd"><td>hdfs.rollCount</td>
 <td>10</td>
 <td>Number of events written to file before it is rolled
 (0 = never roll based on number of events)</td>
 </tr>
-<tr class="row-odd"><td>hdfs.batchSize</td>
-<td>1</td>
-<td>number of events written to file before it flushed to HDFS</td>
+<tr class="row-even"><td>hdfs.idleTimeout</td>
+<td>0</td>
+<td>Timeout after which inactive files get closed
+(0 = disable automatic closing of idle files)</td>
 </tr>
-<tr class="row-even"><td>hdfs.txnEventMax</td>
+<tr class="row-odd"><td>hdfs.batchSize</td>
 <td>100</td>
-<td>&nbsp;</td>
+<td>number of events written to file before it is flushed to HDFS</td>
 </tr>
-<tr class="row-odd"><td>hdfs.codeC</td>
+<tr class="row-even"><td>hdfs.codeC</td>
 <td>&#8211;</td>
 <td>Compression codec. One of the following: gzip, bzip2, lzo, snappy</td>
 </tr>
-<tr class="row-even"><td>hdfs.fileType</td>
+<tr class="row-odd"><td>hdfs.fileType</td>
 <td>SequenceFile</td>
 <td>File format: currently <tt class="docutils literal"><span class="pre">SequenceFile</span></tt>, <tt class="docutils literal"><span class="pre">DataStream</span></tt> or <tt class="docutils literal"><span class="pre">CompressedStream</span></tt>.
 (1) DataStream will not compress the output file; do not set hdfs.codeC with it.
 (2) CompressedStream requires hdfs.codeC to be set to an available codec.</td>
 </tr>
-<tr class="row-odd"><td>hdfs.maxOpenFiles</td>
+<tr class="row-even"><td>hdfs.maxOpenFiles</td>
 <td>5000</td>
-<td>&nbsp;</td>
+<td>Allow only this number of open files. If this number is exceeded, the oldest file is closed.</td>
 </tr>
-<tr class="row-even"><td>hdfs.writeFormat</td>
+<tr class="row-odd"><td>hdfs.writeFormat</td>
 <td>&#8211;</td>
 <td>&#8220;Text&#8221; or &#8220;Writable&#8221;</td>
 </tr>
-<tr class="row-odd"><td>hdfs.appendTimeout</td>
-<td>1000</td>
-<td>&nbsp;</td>
-</tr>
 <tr class="row-even"><td>hdfs.callTimeout</td>
 <td>10000</td>
-<td>&nbsp;</td>
+<td>Number of milliseconds allowed for HDFS operations, such as open, write, flush, close.
+This number should be increased if many HDFS operations are timing out.</td>
 </tr>
 <tr class="row-odd"><td>hdfs.threadsPoolSize</td>
 <td>10</td>
@@ -1389,21 +1757,29 @@ this automatically is to use the Timesta
 <td>&#8211;</td>
 <td>Kerberos keytab for accessing secure HDFS</td>
 </tr>
-<tr class="row-odd"><td>hdfs.round</td>
+<tr class="row-odd"><td>hdfs.proxyUser</td>
+<td>&nbsp;</td>
+<td>&nbsp;</td>
+</tr>
+<tr class="row-even"><td>hdfs.round</td>
 <td>false</td>
 <td>Should the timestamp be rounded down (if true, affects all time based escape sequences except %t)</td>
 </tr>
-<tr class="row-even"><td>hdfs.roundValue</td>
+<tr class="row-odd"><td>hdfs.roundValue</td>
 <td>1</td>
 <td>Rounded down to the highest multiple of this (in the unit configured using <tt class="docutils literal"><span class="pre">hdfs.roundUnit</span></tt>), less than current time.</td>
 </tr>
-<tr class="row-odd"><td>hdfs.roundUnit</td>
+<tr class="row-even"><td>hdfs.roundUnit</td>
 <td>second</td>
 <td>The unit of the round down value - <tt class="docutils literal"><span class="pre">second</span></tt>, <tt class="docutils literal"><span class="pre">minute</span></tt> or <tt class="docutils literal"><span class="pre">hour</span></tt>.</td>
 </tr>
+<tr class="row-odd"><td>hdfs.timeZone</td>
+<td>Local Time</td>
+<td>Name of the timezone that should be used for resolving the directory path, e.g. America/Los_Angeles.</td>
+</tr>
 <tr class="row-even"><td>serializer</td>
 <td><tt class="docutils literal"><span class="pre">TEXT</span></tt></td>
-<td>Other possible options include <tt class="docutils literal"><span class="pre">AVRO_EVENT</span></tt> or the
+<td>Other possible options include <tt class="docutils literal"><span class="pre">avro_event</span></tt> or the
 fully-qualified class name of an implementation of the
 <tt class="docutils literal"><span class="pre">EventSerializer.Builder</span></tt> interface.</td>
 </tr>
@@ -1413,19 +1789,20 @@ fully-qualified class name of an impleme
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">hdfsSink-1</span>
-<span class="na">agent_foo.sinks.hdfsSink-1.type</span> <span class="o">=</span> <span class="s">hdfs</span>
-<span class="na">agent_foo.sinks.hdfsSink-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks.hdfsSink-1.hdfs.path</span> <span class="o">=</span> <span class="s">/flume/events/%y-%m-%d/%H%M/%S</span>
-<span class="na">agent_foo.sinks.hdfsSink-1.hdfs.filePrefix</span> <span class="o">=</span> <span class="s">events-</span>
-<span class="na">agent_foo.sinks.hdfsSink-1.hdfs.round</span> <span class="o">=</span> <span class="s">true</span>
-<span class="na">agent_foo.sinks.hdfsSink-1.hdfs.roundValue</span> <span class="o">=</span> <span class="s">10</span>
-<span class="na">agent_foo.sinks.hdfsSink-1.hdfs.roundUnit</span> <span class="o">=</span> <span class="s">minute</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">hdfs</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks.k1.hdfs.path</span> <span class="o">=</span> <span class="s">/flume/events/%y-%m-%d/%H%M/%S</span>
+<span class="na">a1.sinks.k1.hdfs.filePrefix</span> <span class="o">=</span> <span class="s">events-</span>
+<span class="na">a1.sinks.k1.hdfs.round</span> <span class="o">=</span> <span class="s">true</span>
+<span class="na">a1.sinks.k1.hdfs.roundValue</span> <span class="o">=</span> <span class="s">10</span>
+<span class="na">a1.sinks.k1.hdfs.roundUnit</span> <span class="o">=</span> <span class="s">minute</span>
 </pre></div>
 </div>
-<p>The above configuration will round down the timestamp to the last 10th minute. For example, an event with timestamp 11:54:34 AM, June 12, 2012 will cause the hdfs path to become <tt class="docutils literal"><span class="pre">/flume/events/2012-06-12/1150/00</span></tt>.</p>
+<p>The above configuration will round the timestamp down to the previous 10-minute boundary. For example, an event with
+timestamp 11:54:34 AM, June 12, 2012 will cause the hdfs path to become <tt class="docutils literal"><span class="pre">/flume/events/2012-06-12/1150/00</span></tt>.</p>
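+<p>Several of the optional properties in the table above can be combined in the same way. The following is an illustrative sketch only: the <tt class="docutils literal"><span class="pre">.log</span></tt> suffix, the value 60 for <tt class="docutils literal"><span class="pre">hdfs.idleTimeout</span></tt> (the table does not state its unit; seconds is assumed here) and the chosen time zone are assumptions, not recommended settings:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">hdfs</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks.k1.hdfs.path</span> <span class="o">=</span> <span class="s">/flume/events/%y-%m-%d/%H</span>
+<span class="na">a1.sinks.k1.hdfs.filePrefix</span> <span class="o">=</span> <span class="s">events-</span>
+<span class="na">a1.sinks.k1.hdfs.fileSuffix</span> <span class="o">=</span> <span class="s">.log</span>
+<span class="na">a1.sinks.k1.hdfs.fileType</span> <span class="o">=</span> <span class="s">DataStream</span>
+<span class="na">a1.sinks.k1.hdfs.idleTimeout</span> <span class="o">=</span> <span class="s">60</span>
+<span class="na">a1.sinks.k1.hdfs.timeZone</span> <span class="o">=</span> <span class="s">America/Los_Angeles</span>
+</pre></div>
+</div>
+<p>Because the period is not added automatically, the suffix is written as <tt class="docutils literal"><span class="pre">.log</span></tt> rather than <tt class="docutils literal"><span class="pre">log</span></tt>, and <tt class="docutils literal"><span class="pre">hdfs.codeC</span></tt> is left unset since the file type is <tt class="docutils literal"><span class="pre">DataStream</span></tt>.</p>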
 </div>
 <div class="section" id="logger-sink">
 <h4>Logger Sink<a class="headerlink" href="#logger-sink" title="Permalink to this headline">¶</a></h4>
@@ -1454,11 +1831,11 @@ Required properties are in <strong>bold<
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">loggerSink-1</span>
-<span class="na">agent_foo.sinks.loggerSink-1.type</span> <span class="o">=</span> <span class="s">logger</span>
-<span class="na">agent_foo.sinks.loggerSink-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">logger</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -1512,13 +1889,13 @@ Required properties are in <strong>bold<
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">avroSink-1</span>
-<span class="na">agent_foo.sinks.avroSink-1.type</span> <span class="o">=</span> <span class="s">avro</span>
-<span class="na">agent_foo.sinks.avroSink-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks.avroSink-1.hostname</span> <span class="o">=</span> <span class="s">10.10.10.10</span>
-<span class="na">agent_foo.sinks.avroSink-1.port</span> <span class="o">=</span> <span class="s">4545</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">avro</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks.k1.hostname</span> <span class="o">=</span> <span class="s">10.10.10.10</span>
+<span class="na">a1.sinks.k1.port</span> <span class="o">=</span> <span class="s">4545</span>
 </pre></div>
 </div>
 </div>
@@ -1588,14 +1965,14 @@ backslash, like this: &#8220;\n&#8221;)<
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">ircSink-1</span>
-<span class="na">agent_foo.sinks.ircSink-1.type</span> <span class="o">=</span> <span class="s">irc</span>
-<span class="na">agent_foo.sinks.ircSink-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks.ircSink-1.hostname</span> <span class="o">=</span> <span class="s">irc.yourdomain.com</span>
-<span class="na">agent_foo.sinks.ircSink-1.nick</span> <span class="o">=</span> <span class="s">flume</span>
-<span class="na">agent_foo.sinks.ircSink-1.chan</span> <span class="o">=</span> <span class="s">#flume</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">irc</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks.k1.hostname</span> <span class="o">=</span> <span class="s">irc.yourdomain.com</span>
+<span class="na">a1.sinks.k1.nick</span> <span class="o">=</span> <span class="s">flume</span>
+<span class="na">a1.sinks.k1.chan</span> <span class="o">=</span> <span class="s">#flume</span>
 </pre></div>
 </div>
 </div>
@@ -1622,7 +1999,7 @@ Required properties are in <strong>bold<
 </tr>
 <tr class="row-odd"><td><strong>type</strong></td>
 <td>&#8211;</td>
-<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">FILE_ROLL</span></tt>.</td>
+<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">file_roll</span></tt>.</td>
 </tr>
 <tr class="row-even"><td><strong>sink.directory</strong></td>
 <td>&#8211;</td>
@@ -1634,16 +2011,20 @@ Required properties are in <strong>bold<
 </tr>
 <tr class="row-even"><td>sink.serializer</td>
 <td>TEXT</td>
-<td>Other possible options include AVRO_EVENT or the FQCN of an implementation of EventSerializer.Builder interface.</td>
+<td>Other possible options include <tt class="docutils literal"><span class="pre">avro_event</span></tt> or the FQCN of an implementation of the EventSerializer.Builder interface.</td>
+</tr>
+<tr class="row-odd"><td>batchSize</td>
+<td>100</td>
+<td>&nbsp;</td>
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">fileSink-1</span>
-<span class="na">agent_foo.sinks.fileSink-1.type</span> <span class="o">=</span> <span class="s">FILE_ROLL</span>
-<span class="na">agent_foo.sinks.fileSink-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks.fileSink-1.sink.directory</span> <span class="o">=</span> <span class="s">/var/log/flume</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">file_roll</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks.k1.sink.directory</span> <span class="o">=</span> <span class="s">/var/log/flume</span>
 </pre></div>
 </div>
 </div>
@@ -1670,15 +2051,19 @@ Required properties are in <strong>bold<
 </tr>
 <tr class="row-odd"><td><strong>type</strong></td>
 <td>&#8211;</td>
-<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">NULL</span></tt>.</td>
+<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">null</span></tt>.</td>
+</tr>
+<tr class="row-even"><td>batchSize</td>
+<td>100</td>
+<td>&nbsp;</td>
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">nullSink-1</span>
-<span class="na">agent_foo.sinks.nullSink-1.type</span> <span class="o">=</span> <span class="s">NULL</span>
-<span class="na">agent_foo.sinks.nullSink-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">null</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -1705,8 +2090,8 @@ Required properties are in <strong>bold<
 <table border="1" class="docutils">
 <colgroup>
 <col width="11%" />
-<col width="38%" />
-<col width="51%" />
+<col width="36%" />
+<col width="53%" />
 </colgroup>
 <thead valign="bottom">
 <tr class="row-odd"><th class="head">Property Name</th>
@@ -1721,7 +2106,7 @@ Required properties are in <strong>bold<
 </tr>
 <tr class="row-odd"><td><strong>type</strong></td>
 <td>&#8211;</td>
-<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">org.apache.flume.sink.HBaseSink</span></tt></td>
+<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">org.apache.flume.sink.hbase.HBaseSink</span></tt></td>
 </tr>
 <tr class="row-even"><td><strong>table</strong></td>
 <td>&#8211;</td>
@@ -1745,14 +2130,14 @@ Required properties are in <strong>bold<
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">hbaseSink-1</span>
-<span class="na">agent_foo.sinks.hbaseSink-1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.sink.hbase.HBaseSink</span>
-<span class="na">agent_foo.sinks.hbaseSink-1.table</span> <span class="o">=</span> <span class="s">foo_table</span>
-<span class="na">agent_foo.sinks.hbaseSink-1.columnFamily</span> <span class="o">=</span> <span class="s">bar_cf</span>
-<span class="na">agent_foo.sinks.hbaseSink-1.serializer</span> <span class="o">=</span> <span class="s">org.apache.flume.sink.hbase.RegexHbaseEventSerializer</span>
-<span class="na">agent_foo.sinks.hbaseSink-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.sink.hbase.HBaseSink</span>
+<span class="na">a1.sinks.k1.table</span> <span class="o">=</span> <span class="s">foo_table</span>
+<span class="na">a1.sinks.k1.columnFamily</span> <span class="o">=</span> <span class="s">bar_cf</span>
+<span class="na">a1.sinks.k1.serializer</span> <span class="o">=</span> <span class="s">org.apache.flume.sink.hbase.RegexHbaseEventSerializer</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -1787,7 +2172,7 @@ Required properties are in <strong>bold<
 </tr>
 <tr class="row-odd"><td><strong>type</strong></td>
 <td>&#8211;</td>
-<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">org.apache.flume.sink.AsyncHBaseSink</span></tt></td>
+<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">org.apache.flume.sink.hbase.AsyncHBaseSink</span></tt></td>
 </tr>
 <tr class="row-even"><td><strong>table</strong></td>
 <td>&#8211;</td>
@@ -1816,14 +2201,94 @@ all events in a transaction. If no timeo
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">hbaseSink-1</span>
-<span class="na">agent_foo.sinks.hbaseSink-1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.sink.hbase.AsyncHBaseSink</span>
-<span class="na">agent_foo.sinks.hbaseSink-1.table</span> <span class="o">=</span> <span class="s">foo_table</span>
-<span class="na">agent_foo.sinks.hbaseSink-1.columnFamily</span> <span class="o">=</span> <span class="s">bar_cf</span>
-<span class="na">agent_foo.sinks.hbaseSink-1.serializer</span> <span class="o">=</span> <span class="s">org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer</span>
-<span class="na">agent_foo.sinks.hbaseSink-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.sink.hbase.AsyncHBaseSink</span>
+<span class="na">a1.sinks.k1.table</span> <span class="o">=</span> <span class="s">foo_table</span>
+<span class="na">a1.sinks.k1.columnFamily</span> <span class="o">=</span> <span class="s">bar_cf</span>
+<span class="na">a1.sinks.k1.serializer</span> <span class="o">=</span> <span class="s">org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
+</pre></div>
+</div>
+</div>
+<div class="section" id="elasticsearchsink">
+<h5>ElasticSearchSink<a class="headerlink" href="#elasticsearchsink" title="Permalink to this headline">¶</a></h5>
+<p>This sink writes data to ElasticSearch. A class implementing
+ElasticSearchEventSerializer, specified in the configuration, converts each event into an
+XContentBuilder that details the fields and mappings to be indexed. These are then written
+to ElasticSearch. The sink generates one index per day, which is easier to manage than
+a single large index.
+The type is the FQCN: org.apache.flume.sink.elasticsearch.ElasticSearchSink.
+Required properties are in <strong>bold</strong>.</p>
+<table border="1" class="docutils">
+<colgroup>
+<col width="9%" />
+<col width="36%" />
+<col width="56%" />
+</colgroup>
+<thead valign="bottom">
+<tr class="row-odd"><th class="head">Property Name</th>
+<th class="head">Default</th>
+<th class="head">Description</th>
+</tr>
+</thead>
+<tbody valign="top">
+<tr class="row-even"><td><strong>channel</strong></td>
+<td>&#8211;</td>
+<td>&nbsp;</td>
+</tr>
+<tr class="row-odd"><td><strong>type</strong></td>
+<td>&#8211;</td>
+<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">org.apache.flume.sink.elasticsearch.ElasticSearchSink</span></tt></td>
+</tr>
+<tr class="row-even"><td><strong>hostNames</strong></td>
+<td>&#8211;</td>
+<td>Comma-separated list of hostname:port; if the port is omitted, the default port &#8216;9300&#8217; is used</td>
+</tr>
+<tr class="row-odd"><td>indexName</td>
+<td>flume</td>
+<td>The name of the index which the date will be appended to. Example &#8216;flume&#8217; -&gt; &#8216;flume-yyyy-MM-dd&#8217;</td>
+</tr>
+<tr class="row-even"><td>indexType</td>
+<td>logs</td>
+<td>The type to index the document to, defaults to &#8216;log&#8217;</td>
+</tr>
+<tr class="row-odd"><td>clusterName</td>
+<td>elasticsearch</td>
+<td>Name of the ElasticSearch cluster to connect to</td>
+</tr>
+<tr class="row-even"><td>batchSize</td>
+<td>100</td>
+<td>Number of events to be written per txn.</td>
+</tr>
+<tr class="row-odd"><td>ttl</td>
+<td>&#8211;</td>
+<td>TTL in days. When set, expired documents are deleted automatically;
+if not set, documents are never deleted automatically.</td>
+</tr>
+<tr class="row-even"><td>serializer</td>
+<td>org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer</td>
+<td>&nbsp;</td>
+</tr>
+<tr class="row-odd"><td>serializer.*</td>
+<td>&#8211;</td>
+<td>Properties to be passed to the serializer.</td>
+</tr>
+</tbody>
+</table>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">org.apache.flume.sink.elasticsearch.ElasticSearchSink</span>
+<span class="na">a1.sinks.k1.hostNames</span> <span class="o">=</span> <span class="s">127.0.0.1:9200,127.0.0.2:9300</span>
+<span class="na">a1.sinks.k1.indexName</span> <span class="o">=</span> <span class="s">foo_index</span>
+<span class="na">a1.sinks.k1.indexType</span> <span class="o">=</span> <span class="s">bar_type</span>
+<span class="na">a1.sinks.k1.clusterName</span> <span class="o">=</span> <span class="s">foobar_cluster</span>
+<span class="na">a1.sinks.k1.batchSize</span> <span class="o">=</span> <span class="s">500</span>
+<span class="na">a1.sinks.k1.ttl</span> <span class="o">=</span> <span class="s">5</span>
+<span class="na">a1.sinks.k1.serializer</span> <span class="o">=</span> <span class="s">org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -1857,11 +2322,11 @@ Required properties are in <strong>bold<
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.sinks</span> <span class="o">=</span> <span class="s">customSink-1</span>
-<span class="na">agent_foo.sinks.customSink-1.type</span> <span class="o">=</span> <span class="s">your.namespace.YourClass</span>
-<span class="na">agent_foo.sinks.customSink-1.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.sinks</span> <span class="o">=</span> <span class="s">k1</span>
+<span class="na">a1.sinks.k1.type</span> <span class="o">=</span> <span class="s">org.example.MySink</span>
+<span class="na">a1.sinks.k1.channel</span> <span class="o">=</span> <span class="s">c1</span>
 </pre></div>
 </div>
 </div>
@@ -1907,10 +2372,10 @@ Required properties are in <strong>bold<
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">memoryChannel-1</span>
-<span class="na">agent_foo.channels.memoryChannel-1.type</span> <span class="o">=</span> <span class="s">memory</span>
-<span class="na">agent_foo.channels.memoryChannel-1.capacity</span> <span class="o">=</span> <span class="s">1000</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.channels.c1.type</span> <span class="o">=</span> <span class="s">memory</span>
+<span class="na">a1.channels.c1.capacity</span> <span class="o">=</span> <span class="s">1000</span>
 </pre></div>
 </div>
 </div>
@@ -1996,9 +2461,9 @@ READ_COMMITTED, SERIALIZABLE, REPEATABLE
 </tr>
 </tbody>
 </table>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">jdbcChannel-1</span>
-<span class="na">agent_foo.channels.jdbcChannel-1.type</span> <span class="o">=</span> <span class="s">jdbc</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.channels.c1.type</span> <span class="o">=</span> <span class="s">jdbc</span>
 </pre></div>
 </div>
 </div>
@@ -2049,6 +2514,18 @@ and performs better than the Recoverable
 <td>(0x20000000)</td>
 <td>Total amt (in bytes) of logs to keep, excluding the current log</td>
 </tr>
+<tr class="row-even"><td>capacity</td>
+<td>100</td>
+<td>&nbsp;</td>
+</tr>
+<tr class="row-odd"><td>transactionCapacity</td>
+<td>100</td>
+<td>&nbsp;</td>
+</tr>
+<tr class="row-even"><td>keep-alive</td>
+<td>3</td>
+<td>&nbsp;</td>
+</tr>
 </tbody>
 </table>
 </div>
@@ -2057,20 +2534,20 @@ and performs better than the Recoverable
 <p>Required properties are in <strong>bold</strong>.</p>
 <table border="1" class="docutils">
 <colgroup>
-<col width="19%" />
-<col width="30%" />
-<col width="52%" />
+<col width="35%" />
+<col width="24%" />
+<col width="41%" />
 </colgroup>
 <thead valign="bottom">
-<tr class="row-odd"><th class="head">Property Name</th>
-<th class="head">Default</th>
+<tr class="row-odd"><th class="head">Property Name</th>
+<th class="head">Default</th>
 <th class="head">Description</th>
 </tr>
 </thead>
 <tbody valign="top">
 <tr class="row-even"><td><strong>type</strong></td>
 <td>&#8211;</td>
-<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">FILE</span></tt>.</td>
+<td>The component type name, needs to be <tt class="docutils literal"><span class="pre">file</span></tt>.</td>
 </tr>
 <tr class="row-odd"><td>checkpointDir</td>
 <td>~/.flume/file-channel/checkpoint</td>
@@ -2104,6 +2581,46 @@ and performs better than the Recoverable
 <td>3</td>
 <td>Amount of time (in sec) to wait for a write operation</td>
 </tr>
+<tr class="row-odd"><td>checkpoint-timeout</td>
+<td>600</td>
+<td>Expert: Amount of time (in sec) to wait for a checkpoint</td>
+</tr>
+<tr class="row-even"><td>use-log-replay-v1</td>
+<td>false</td>
+<td>Expert: Use old replay logic</td>
+</tr>
+<tr class="row-odd"><td>use-fast-replay</td>
+<td>false</td>
+<td>Expert: Replay without using queue</td>
+</tr>
+<tr class="row-even"><td>encryption.activeKey</td>
+<td>&#8211;</td>
+<td>Key name used to encrypt new data</td>
+</tr>
+<tr class="row-odd"><td>encryption.cipherProvider</td>
+<td>&#8211;</td>
+<td>Cipher provider type, supported types: AESCTRNOPADDING</td>
+</tr>
+<tr class="row-even"><td>encryption.keyProvider</td>
+<td>&#8211;</td>
+<td>Key provider type, supported types: JCEKSFILE</td>
+</tr>
+<tr class="row-odd"><td>encryption.keyProvider.keyStoreFile</td>
+<td>&#8211;</td>
+<td>Path to the keystore file</td>
+</tr>
+<tr class="row-even"><td>encryption.keyProvider.keyStorePasswordFile</td>
+<td>&#8211;</td>
+<td>Path to the keystore password file</td>
+</tr>
+<tr class="row-odd"><td>encryption.keyProvider.keys</td>
+<td>&#8211;</td>
+<td>List of all keys (e.g. history of the activeKey setting)</td>
+</tr>
+<tr class="row-even"><td>encryption.keyProvider.keys.*.passwordFile</td>
+<td>&#8211;</td>
+<td>Path to the optional key password file</td>
+</tr>
 </tbody>
 </table>
 <div class="admonition note">
@@ -2120,11 +2637,53 @@ coupling it with a sink/source that batc
 be necessary to provide good performance where multiple disks are
 not available for checkpoint and data directories.</p>
 </div>
-<p>Example for agent named <strong>agent_foo</strong>:</p>
-<div class="highlight-properties"><div class="highlight"><pre><span class="na">agent_foo.channels</span> <span class="o">=</span> <span class="s">fileChannel-1</span>
-<span class="na">agent_foo.channels.fileChannel-1.type</span> <span class="o">=</span> <span class="s">file</span>
-<span class="na">agent_foo.channels.fileChannel-1.checkpointDir</span> <span class="o">=</span> <span class="s">/mnt/flume/checkpoint</span>
-<span class="na">agent_foo.channels.fileChannel-1.dataDirs</span> <span class="o">=</span> <span class="s">/mnt/flume/data</span>
+<p>Example for agent named a1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels</span> <span class="o">=</span> <span class="s">c1</span>
+<span class="na">a1.channels.c1.type</span> <span class="o">=</span> <span class="s">file</span>
+<span class="na">a1.channels.c1.checkpointDir</span> <span class="o">=</span> <span class="s">/mnt/flume/checkpoint</span>
+<span class="na">a1.channels.c1.dataDirs</span> <span class="o">=</span> <span class="s">/mnt/flume/data</span>
+</pre></div>
+</div>
+<p><strong>Encryption</strong></p>
+<p>Below are a few sample configurations:</p>
+<p>Generating a key with a password separate from the key store password:</p>
+<div class="highlight-bash"><div class="highlight"><pre>keytool -genseckey -alias key-0 -keypass keyPassword -keyalg AES <span class="se">\</span>
+  -keysize 128 -validity 9000 -keystore test.keystore <span class="se">\</span>
+  -storetype jceks -storepass keyStorePassword
+</pre></div>
+</div>
+<p>Generating a key with the password the same as the key store password:</p>
+<div class="highlight-bash"><div class="highlight"><pre>keytool -genseckey -alias key-1 -keyalg AES -keysize 128 -validity 9000 <span class="se">\</span>
+  -keystore src/test/resources/test.keystore -storetype jceks <span class="se">\</span>
+  -storepass keyStorePassword
+</pre></div>
+</div>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels.c1.encryption.activeKey</span> <span class="o">=</span> <span class="s">key-0</span>
+<span class="na">a1.channels.c1.encryption.cipherProvider</span> <span class="o">=</span> <span class="s">AESCTRNOPADDING</span>
+<span class="na">a1.channels.c1.encryption.keyProvider</span> <span class="o">=</span> <span class="s">key-provider-0</span>
+<span class="na">a1.channels.c1.encryption.keyProvider</span> <span class="o">=</span> <span class="s">JCEKSFILE</span>
+<span class="na">a1.channels.c1.encryption.keyProvider.keyStoreFile</span> <span class="o">=</span> <span class="s">/path/to/my.keystore</span>
+<span class="na">a1.channels.c1.encryption.keyProvider.keyStorePasswordFile</span> <span class="o">=</span> <span class="s">/path/to/my.keystore.password</span>
+<span class="na">a1.channels.c1.encryption.keyProvider.keys</span> <span class="o">=</span> <span class="s">key-0</span>
+</pre></div>
+</div>
+<p>Let&#8217;s say you have aged key-0 out and new files should be encrypted with key-1:</p>
+<div class="highlight-properties"><div class="highlight"><pre><span class="na">a1.channels.c1.encryption.activeKey</span> <span class="o">=</span> <span class="s">key-1</span>
+<span class="na">a1.channels.c1.encryption.cipherProvider</span> <span class="o">=</span> <span class="s">AESCTRNOPADDING</span>
+<span class="na">a1.channels.c1.encryption.keyProvider</span> <span class="o">=</span> <span class="s">JCEKSFILE</span>
+<span class="na">a1.channels.c1.encryption.keyProvider.keyStoreFile</span> <span class="o">=</span> <span class="s">/path/to/my.keystore</span>

[... 1133 lines stripped ...]
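
Note on the HDFS sink rounding described in the first hunk above (11:54:34 AM, June 12, 2012 becoming /flume/events/2012-06-12/1150/00): the arithmetic can be sketched in a few lines of Python. This is only an illustration of the rounding rule, not Flume's implementation; the function names `round_down` and `bucket_path` and the strftime pattern are assumptions chosen to reproduce the path shown in the guide.

```python
from datetime import datetime


def round_down(ts: datetime, minutes: int = 10) -> datetime:
    """Round ts down to the previous multiple of `minutes` within its hour."""
    return ts.replace(minute=ts.minute - ts.minute % minutes,
                      second=0, microsecond=0)


def bucket_path(ts: datetime,
                pattern: str = "/flume/events/%Y-%m-%d/%H%M/%S") -> str:
    """Hypothetical helper: format the rounded timestamp as a directory path."""
    return round_down(ts).strftime(pattern)


if __name__ == "__main__":
    event_time = datetime(2012, 6, 12, 11, 54, 34)
    print(bucket_path(event_time))  # /flume/events/2012-06-12/1150/00
```

Because the seconds field is zeroed by the rounding, the trailing path component is always 00 when rounding is done at minute granularity.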

