incubator-chukwa-user mailing list archives

From Ariel Rabkin
Subject Re: speeding up demux?
Date Mon, 10 May 2010 23:39:07 GMT
Can you say a bit about where your bottleneck is?  Is there one reduce
that's taking a very long time? Can you check the logs and see which
datatype that reducer is dealing with?  There was some discussion of
this on JIRA recently; consensus is that our current partitioner works
well if you have a wide variety of datatypes, none of which is too
big, and badly if you have one or two datatypes with lots of data in
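
To illustrate the skew described above, here is a minimal sketch in Python. It is not Chukwa's partitioner: it assumes a hypothetical scheme in which every record of a given datatype is hashed to a single reducer, and the names `partition`, `reducer_load`, `balanced`, and `skewed`, along with the record counts, are invented for the example.

```python
import zlib


def partition(datatype: str, num_reducers: int) -> int:
    """Map a datatype name to a reducer index (hypothetical scheme).

    crc32 is used instead of the built-in hash() so the assignment is
    the same on every run.
    """
    return zlib.crc32(datatype.encode("utf-8")) % num_reducers


def reducer_load(record_counts: dict, num_reducers: int) -> list:
    """Return the number of records each reducer would receive."""
    load = [0] * num_reducers
    for datatype, count in record_counts.items():
        # Every record of one datatype lands on the same reducer.
        load[partition(datatype, num_reducers)] += count
    return load


# Made-up workloads, both totalling 40,000 records.
# Many small datatypes: the load can spread across reducers.
balanced = {f"type{i}": 1000 for i in range(40)}
# One dominant datatype: a single reducer must process all of it.
skewed = {"BigLog": 36000, **{f"type{i}": 1000 for i in range(4)}}

if __name__ == "__main__":
    for name, workload in (("balanced", balanced), ("skewed", skewed)):
        load = reducer_load(workload, num_reducers=4)
        print(name, load, "slowest reducer handles", max(load), "records")
```

Under this assumption the job cannot finish before the reducer holding the dominant datatype does, no matter how many reducers are configured, which is why identifying the datatype behind the slow reducer is a useful first step.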

On Mon, May 10, 2010 at 3:07 PM, Corbin Hoenes wrote:
> Is it possible to tune the time or size interval on demux to lower the amount of time
> it takes to get demuxed data into the Hadoop cluster?
> (Or some other way?)   Currently there is about a 20-30 minute lag on our setup.  Wondering
> also if this is a wise thing to even try--maybe some side effects?

Ari Rabkin
UC Berkeley Computer Science Department
