chukwa-user mailing list archives

From "T. A. Smooth" <catdaaa...@gmail.com>
Subject Re: Agent and collector
Date Sat, 30 Jul 2011 05:36:47 GMT
Thanks for the feedback. We will have to try out the end-to-end features.

I also read: http://www.usenix.org/events/lisa10/tech/full_papers/Rabkin.pdf
This is some good informative stuff.

I have a follow-up question:

Concerning question 2 about the load balancer (lb): it seems the end-to-end
features could cause trouble when agents sit behind an lb. If each agent
polls the collector to see how much of its file has been saved to the
datastore, that could be problematic for an lb setup. In that case an agent
sends its chunks to what it "thinks" is a single collector, but in reality
it is sending to many collectors, and each chunk may end up in a different
file on a different collector.

When the agent polls the collector (really the load balancer) about the
length of the file it has been writing to the datastore, it seems the
collectors would not handle that situation very well. The request could get
round-robined or randomly sent to a backend collector that has never written
a file for the agent, or that has written only a portion of the file the
agent wants to know about. So on one poll the agent would get one set of
file length info from one collector, and on the next poll it might get a
different set from a different collector.
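
To make the concern concrete, here is a toy sketch of it. This is only a model of the scenario as I understand it, not Chukwa's actual protocol: the Collector and RoundRobinBalancer classes, their method names, and the per-agent byte counters are all made up for illustration.

```python
from itertools import cycle


class Collector:
    """Hypothetical stand-in for a collector: counts bytes written per agent."""

    def __init__(self, name):
        self.name = name
        self.bytes_written = {}

    def post_chunk(self, agent_id, chunk):
        # Pretend the chunk was appended to this collector's own sink file.
        self.bytes_written[agent_id] = self.bytes_written.get(agent_id, 0) + len(chunk)

    def file_length(self, agent_id):
        # A collector only knows about the data it wrote itself.
        return self.bytes_written.get(agent_id, 0)


class RoundRobinBalancer:
    """Hypothetical lb: every request goes to the next backend in turn."""

    def __init__(self, collectors):
        self._next = cycle(collectors)

    def post_chunk(self, agent_id, chunk):
        next(self._next).post_chunk(agent_id, chunk)

    def file_length(self, agent_id):
        return next(self._next).file_length(agent_id)


collectors = [Collector("c1"), Collector("c2"), Collector("c3")]
lb = RoundRobinBalancer(collectors)

# The agent sends four 100-byte chunks, believing there is one collector.
total_sent = 0
for _ in range(4):
    chunk = b"x" * 100
    lb.post_chunk("agent-1", chunk)
    total_sent += len(chunk)

# Three successive polls land on three different backends.
polls = [lb.file_length("agent-1") for _ in range(3)]
print(total_sent, polls)
```

In this model no single poll ever reports the full amount the agent sent, and successive polls disagree with each other, which is the inconsistency I am worried about.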

This could be mitigated with connection persistence (stickiness) on the lb
side, so that each agent stays pinned to one backend collector unless that
collector has an availability problem. What other issues do you foresee?
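
A minimal sketch of what I mean by stickiness, assuming the lb can see some stable agent identifier (a source IP, for example). The pick_collector function and the collector names are hypothetical; a real lb would use its own persistence mechanism.

```python
import hashlib


def pick_collector(agent_id, collectors, healthy):
    """Pin an agent to one backend by hashing its id over the healthy set.

    collectors: ordered list of backend names (hypothetical).
    healthy: set of backend names currently passing health checks.
    """
    candidates = [c for c in collectors if c in healthy]
    if not candidates:
        raise RuntimeError("no healthy collectors")
    # Use a stable hash; Python's built-in hash() is salted per process.
    digest = hashlib.sha256(agent_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]


collectors = ["c1", "c2", "c3"]
pinned = pick_collector("agent-1", collectors, set(collectors))
# If the pinned collector goes down, the agent fails over to another one.
fallback = pick_collector("agent-1", collectors, set(collectors) - {pinned})
print(pinned, fallback)
```

One caveat with plain modulo hashing like this: whenever the healthy set changes, many agents get remapped at once, and an agent that fails over would again be asking a collector about data that a different collector wrote.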

The main reason I love the lb idea, in theory, is that we would not have to
update any of the agents when we add collectors to the cluster. We would
only have to update one place, the lb. In the default setup, when we add a
collector we have to update the config files on all the agents and have
them reread the collector list. For a few boxes this isn't a big deal, but
it is if you have a ton of boxes.

Thanks for everyone's time! For real.

-tp-


> Every few minutes, each agent process polls a collector to find the length
> of each file to which data is being written. The length of the file is then
> compared with the offset at which each chunk was to be written. If the file
> length exceeds this value, then the data has been committed and the agent
> process advances its checkpoint accordingly. (Note that the length returned
> by the filesystem is the amount of data that has been successfully
> replicated.)

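My reading of the quoted passage, as a rough sketch. The PendingChunk fields and the advance_checkpoint function are names I made up, not Chukwa code; I am taking "exceeds" to mean strictly greater, and the excerpt does not say whether the recorded offset is the start or the end of the chunk.

```python
from typing import NamedTuple


class PendingChunk(NamedTuple):
    sink_file: str     # file the collector reported it would write to
    write_offset: int  # offset in that file at which the chunk was to be written
    source_pos: int    # position in the agent's source stream after this chunk


def advance_checkpoint(checkpoint, pending, file_lengths):
    """Advance past chunks whose sink file has grown beyond their offset.

    pending is in send order; file_lengths maps sink file name to the
    length returned by the latest poll. Stops at the first chunk that is
    not yet committed so the checkpoint never skips over unwritten data.
    """
    remaining = list(pending)
    while remaining:
        chunk = remaining[0]
        if file_lengths.get(chunk.sink_file, 0) <= chunk.write_offset:
            break  # not committed yet; keep it and everything after it
        checkpoint = chunk.source_pos
        remaining.pop(0)
    return checkpoint, remaining


pending = [
    PendingChunk("a.sink", 0, 100),
    PendingChunk("a.sink", 100, 200),
    PendingChunk("b.sink", 0, 300),
]
# The poll reports a.sink at 100 bytes and knows nothing about b.sink.
checkpoint, still_pending = advance_checkpoint(0, pending, {"a.sink": 100})
print(checkpoint, len(still_pending))
```

If the poll is answered by a collector that never heard of a given file, that file's length looks like zero here and the checkpoint simply does not move, which is why the lb question matters to me.
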
On Fri, Jul 29, 2011 at 4:37 PM, Eric Yang <eric818@gmail.com> wrote:

> Hi Tp,
>
> 1) Yes, Chukwa communicates over HTTP.  By default, the collector listens
> on port 8080.
>
> 2) If an agent has only one collector defined in its collector list, it
> will retry the same collector after a pause of a few seconds.
>
> 3) There are 2 additional features for improving end-to-end reliability.
> In the Chukwa collector, you can turn on httpConnector.asyncAcks=true.  This
> ensures that the agent resends data that has not been committed.  A second
> method is to use localWriter to buffer the data on the local disk of the
> collector and periodically upload the data to HDFS.  Both options can be
> configured in chukwa-collector-conf.xml.
>
> Hope this helps.
>
> regards,
> Eric
>
> On Jul 29, 2011, at 11:04 AM, T. A. Smooth wrote:
>
> > Hello, I am checking out Chukwa. I have a few questions I was hoping the
> > mailing list could answer :-)
> >
> > 1) Do Chukwa agents communicate with collectors over HTTP? Or some other
> > protocol?
> >
> > The agent configuration makes me believe that:
> > http://incubator.apache.org/chukwa/docs/r0.4.0/admin.html#Configuration
> >
> > 2) From the docs it seems an agent will pick a collector at random and
> > then use that collector until there is a problem communicating with it.
> > How do you think the agent/collector would act if they had a load
> > balancer between them? For example, the agent configuration would have
> > just one URL: http://collector-loadbalancer.example.com:8080/
> >
> > The load balancer would have 1 or more collectors behind it saving the
> > chunks it receives to disk or Hadoop.
> >
> > 3) Does Chukwa have any “end-to-end” reliability features for message
> > delivery? For example, a collector may receive the chunk from the agent
> > but have a problem writing it to the data store (e.g. disk full,
> > connection to Hadoop down). Will the agent be notified that the chunk was
> > not processed for a certain reason, and told to cache the missed message
> > to disk?
> >
> > Thanks for the info!
> >
> > -tp-
> >
>
>


-- 
*Splat*! <http://goog_843711221>
<http://www.webeclubbin.com/blog/2011/05/typoe-%E2%80%93-confetti-death-hypebeast/>

@CatDaaaady <https://twitter.com/#!/CatDaaaady>
