From: Connor Woodson
To: user@flume.apache.org
Date: Thu, 10 Jan 2013 00:10:23 -0800
Subject: Re: AvroSink and LoadBalancingRpcClient

Forgot about sink processors; yes, it will work.

The trick with this method is that you use a separate sink for each endpoint, whereas the RpcClient (once it is exposed) would handle all of that by itself. Your configuration will need to look something like this:

-----------------

<sources>

a1.channels = c1
<channel setup>

a1.sinks = k1 k2

a1.sinks.k1.type = AVRO
< set up centralFlumeE connection >
a1.sinks.k1.channel = c1

a1.sinks.k2.type = AVRO
< set up centralFlumeF connection >
a1.sinks.k2.channel = c1

a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = round_robin

-----------------

Here is the relevant link for the load balancing processor: http://flume.apache.org/FlumeUserGuide.html#load-balancing-sink-processor
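
For the two connection placeholders in the configuration, a minimal sketch of what each Avro sink might carry is shown here. The hostnames and port 4545 are made-up values for illustration only; `hostname` and `port` are the Avro sink's connection properties as described in the user guide.

```properties
# Hypothetical endpoints; substitute the real collector addresses.
a1.sinks.k1.hostname = centralFlumeE.example.com
a1.sinks.k1.port = 4545

a1.sinks.k2.hostname = centralFlumeF.example.com
a1.sinks.k2.port = 4545
```

Each collector would need an Avro source listening on the matching port for the sink to connect.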

Remember that all sinks in a sink group must share the same channel. This is load balancing, which is what you are seeking in your scenario; the load balancer is not for failover (in the setup of primary and backup servers), although there is a FailoverSinkProcessor if that's needed.
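
If a primary/backup arrangement is wanted instead, the sink group would look roughly like the sketch here. The priorities and the maxpenalty value are arbitrary example numbers, not recommendations; the sink with the higher priority is tried first.

```properties
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
# Higher number = preferred sink; k2 only takes over while k1 is failing.
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 5
# Upper bound (in milliseconds) on the backoff for a failed sink; example value.
a1.sinkgroups.g1.processor.maxpenalty = 10000
```

With this in place events are not spread across the two collectors; they all go to k1 until it fails.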

- Connor


On Wed, Jan 9, 2013 at 11:55 PM, Denny Ye <dennyy99@gmail.com> wrote:

> Hi Hari,
>     I am not sure whether the method you raised fits my situation, so I would like to explain my case and ask for your comments. Thanks a lot!
>     What I need is load balancing during event transfer. Assume that I have a single local Flume server (co-located with the application) named 'localFlumeA', configured with a single AvroSink and channel. Meanwhile, there are two central Flume servers (collectors) named 'centralFlumeE' and 'centralFlumeF'. In this case, I would like to configure load balancing between 'centralFlumeE' and 'centralFlumeF' for events coming from 'localFlumeA', so that the load is distributed evenly across the two central Flume servers.
>     Do you think this can be configured with the LoadBalancingSinkProcessor? I would appreciate your advice.
>
> -Regards
> Denny Ye


> 2013/1/10 Hari Shreedharan <hshreedharan@cloudera.com>
>
>> Load-balancing capability similar to that of the LoadBalancingRpcClient can be configured for multiple Avro sinks using a LoadBalancingSinkProcessor, if you are looking for that functionality.
>>
>> Hari
>>
>> --
>> Hari Shreedharan

>> On Wednesday, January 9, 2013 at 11:05 PM, Connor Woodson wrote:
>>
>> Short answer: there is no way in the current AvroSink to configure the RpcClient, which limits you to a single host connection (I'm not sure how well it recovers if that host goes down).
>>
>> The AvroSink is greatly simplified compared to what the RpcClient can do and exposes none of the underlying functionality. Right now, the only way around that is to create a custom sink based on the AvroSink source code and, instead of setting the RpcClient up the way it currently is, pass a set of user-supplied properties into RpcClientFactory.getInstance(). Implementing this in an unsafe way (not checking any of the user's values) would only take a couple of lines of code, I believe. It is a workaround, but it would enable all of the various RpcClient capabilities, such as failover or load-balancing mode, and allow the sink to connect to multiple hosts.
>>
>> I think there is a JIRA filed for this; if not, it would be very helpful to have it implemented in the actual AvroSink (a related change would be for RpcClientFactory.getInstance to accept a Context object, simply for ease of use).
>>
>> - Connor


>> On Wed, Jan 9, 2013 at 10:55 PM, Denny Ye <dennyy99@gmail.com> wrote:
>>
>> Hi all,
>>     I couldn't find the relationship between AvroSink and the other types of RpcClient, including LoadBalancingRpcClient. I would expect the user to be able to set the RpcClient type for the AvroSink, along with the various strategies and host selectors. I could not find this information in the source code or the user guide. Did I miss something?
>>     I hope someone can help, thanks!
>>
>> -Regards
>> Denny Ye
