chukwa-dev mailing list archives

From "Ari Rabkin (JIRA)" <>
Subject [jira] Commented: (CHUKWA-311) Re-implement Hourly & Daily rolling
Date Fri, 19 Jun 2009 20:48:07 GMT


Ari Rabkin commented on CHUKWA-311:

Are we talking only about rolling the post-demux records, or also the raw chunks?
Being able to archive chunks is a fairly high priority for me.  See CHUKWA-317.

> Re-implement Hourly & Daily rolling
> -----------------------------------
>                 Key: CHUKWA-311
>                 URL:
>             Project: Hadoop Chukwa
>          Issue Type: Improvement
>            Reporter: Jerome Boulon
> Hourly and Daily rolling are currently done using an M/R job, but all spill files are already
> sorted, so it's just a merge sort.
> Doing that from a standalone application will be more efficient than using an M/R job.
> Another way to implement this would be to take advantage of the latest version of Pig
> (multiple query optimization) and do the rolling once a day at the same time as we are computing
> daily metrics (since the data has already been loaded by Pig).
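
The point in the issue description, that rolling already-sorted spill files is only a merge, can be sketched as follows. This is a minimal illustration in Python, not Chukwa code: the record shape (a timestamp plus a payload), the function name roll_sorted_spills, and the in-memory lists standing in for spill files are all assumptions made for the example.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical record shape: (timestamp, payload). Not Chukwa's real format.
Record = Tuple[int, str]


def roll_sorted_spills(spills: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """Lazily merge already-sorted spill streams into one sorted stream.

    Each input must already be sorted by timestamp. heapq.merge keeps only
    one pending record per input in memory, so no full re-sort is needed.
    """
    return heapq.merge(*spills, key=lambda rec: rec[0])


# Three hypothetical spill files, each already sorted by timestamp.
spill_a = [(1, "a1"), (4, "a2"), (9, "a3")]
spill_b = [(2, "b1"), (3, "b2")]
spill_c = [(5, "c1"), (8, "c2")]

rolled = list(roll_sorted_spills([spill_a, spill_b, spill_c]))
```

Because the merge is lazy, a standalone process could stream records from each spill file and write the rolled output as it goes, without holding a whole hour or day of data in memory.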

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
