From: Arvind Prabhakar
Date: Fri, 18 May 2012 14:16:23 -0700
Subject: Re: New production setup
To: flume-user@incubator.apache.org

Hi Mahesh,

The concepts of Flume 1.x (NG) are different from Flume 0.9.x.
For a quick primer on the changed concepts, please glance through the blog post we published earlier [2]. Due to these changes, components developed for earlier versions of Flume are not compatible with the new implementation.

Regarding implementing custom sinks in Flume 1.x: it is fairly straightforward. You create an implementation of the interface org.apache.flume.Sink. If your implementation class is com.example.custom.MySink, you can plug it into the system via the following configuration:

agent.channels = c1
agent.sinks = s1

agent.sinks.s1.type = com.example.custom.MySink
agent.sinks.s1.channel = c1
agent.sinks.s1.sink_property = value
...

Any configuration within the agent.sinks.s1 namespace will be passed to the configure() method implemented by your sink before it is start()ed. If the system shuts down, the sink will be stop()ped first.

For an even easier route to implementing custom sinks for Flume 1.x, simply extend an existing sink such as LoggerSink and override the process() method.

Hope this helps.

Thanks,
Arvind Prabhakar

[2] https://blogs.apache.org/flume/entry/flume_ng_architecture

On Fri, May 18, 2012 at 2:04 PM, M@he$h wrote:
> Hello Arvind,
>
> I was using the flume-0.9.x version and had everything working nicely; the
> only issue I had was tailing a specific file, which is under discussion in
> another thread. My question is: I had my own regexAll extractor and
> HBase sink Java programs, so if I upgrade to the flume-NG version, can I
> still use the custom extractor and HBase sink programs with flume-NG?
>
> The flume-NG wiki,
> http://archive.cloudera.com/cdh4/cdh/4/flume-ng-1.1.0-cdh4.0.0b2/FlumeUserGuide.html,
> does not give much explanation or many samples on how to use custom sinks.
> Could you please let me know about it?
>
> I look forward to your response.
>
>
> On Fri, May 18, 2012 at 8:54 AM, Arvind Prabhakar wrote:
>
>> Hi Simon,
>>
>> The wiki page is dated, to say the least.
>> At the moment there are many
>> active deployments of Flume NG that are in staging if not production. I
>> encourage you to look at the performance numbers that were recently
>> published on the wiki [1].
>>
>> The use case you have described seems like something that Flume should be
>> able to handle very easily. I encourage you to look at the log4j appender,
>> the Memory/File channels, and the HDFS event sink. Of course, you could
>> plan on using other components as well if these do not fit well with your
>> application.
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Performance+Measurements
>>
>> Thanks,
>> Arvind Prabhakar
>>
>>
>> On Fri, May 18, 2012 at 4:58 AM, Simon Kelly wrote:
>>
>>> Hi
>>>
>>> I'm interested in using Flume to store audit logs in HDFS, which can then
>>> be queried with Hive. I see that the links on the Flume page point to
>>> Flume NG, which says it's not ready for production use yet. Is that still
>>> the case?
>>>
>>> Our use case would likely look something like this:
>>>
>>> - 15 servers running a Java web server and logging audit data (1-2K
>>> per event, 20-90 events per second per server)
>>> - Hadoop running on a 5-machine cluster (4x2.4GHz processors, 8GB RAM,
>>> 8TB total storage)
>>>
>>> It's important that all data makes it into HDFS.
>>>
>>> I'd appreciate any comments on how to proceed with this.
>>>
>>> Best regards
>>> Simon Kelly
>>
>
> --
> Thanks and Regards,
> Mahesh
> 619-816-7011.
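[Editor's note: the custom-sink steps Arvind describes above can be sketched as a minimal Java class. This is an illustrative example, not code from the thread; the package, class name, and "sink_property" key follow the configuration snippet in the reply, and the body of process() uses the standard Flume 1.x channel-transaction pattern.]

```java
package com.example.custom;

import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;

// Minimal sketch of a custom Flume 1.x sink. AbstractSink supplies the
// channel wiring and the start()/stop() lifecycle hooks.
public class MySink extends AbstractSink implements Configurable {

  private String sinkProperty;

  @Override
  public void configure(Context context) {
    // Receives the agent.sinks.s1.* properties before start() is called.
    // "sink_property" is the hypothetical key from the example config.
    sinkProperty = context.getString("sink_property", "default-value");
  }

  @Override
  public Status process() throws EventDeliveryException {
    Channel channel = getChannel();
    Transaction txn = channel.getTransaction();
    txn.begin();
    try {
      Event event = channel.take();
      if (event == null) {
        // Channel is empty; commit and let the sink runner back off.
        txn.commit();
        return Status.BACKOFF;
      }
      // Deliver event.getBody() to the destination system here.
      txn.commit();
      return Status.READY;
    } catch (Throwable t) {
      txn.rollback();
      throw new EventDeliveryException("Failed to deliver event", t);
    } finally {
      txn.close();
    }
  }
}
```

Compiled against the Flume 1.x libraries and placed on the agent's classpath, the class is then referenced by the agent.sinks.s1.type property shown in the configuration above.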
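[Editor's note: the log4j appender / file channel / HDFS sink pipeline suggested to Simon could be configured roughly along these lines. This is an illustrative sketch only; the component names, bind address, port, and HDFS path are placeholders, not values from the thread.]

```
# Web servers use the Flume Log4jAppender to send events to this agent.
agent.sources = avroSrc
agent.channels = fileCh
agent.sinks = hdfsSink

# Avro source receiving events from the log4j appender.
agent.sources.avroSrc.type = avro
agent.sources.avroSrc.bind = 0.0.0.0
agent.sources.avroSrc.port = 41414
agent.sources.avroSrc.channels = fileCh

# Durable file channel so buffered events survive an agent restart.
agent.channels.fileCh.type = file

# HDFS event sink writing into a date-bucketed path (placeholder).
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = fileCh
agent.sinks.hdfsSink.hdfs.path = hdfs://namenode/flume/audit/%Y-%m-%d
```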