chukwa-dev mailing list archives

From "Ari Rabkin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CHUKWA-311) Re-implement Hourly & Daily rolling
Date Fri, 19 Jun 2009 20:48:07 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721998#action_12721998 ]

Ari Rabkin commented on CHUKWA-311:
-----------------------------------

Are we talking only about rolling the post-demux records, or also the raw chunks?
Being able to archive chunks is a fairly high priority for me.  See CHUKWA-317.

> Re-implement Hourly & Daily rolling
> -----------------------------------
>
>                 Key: CHUKWA-311
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-311
>             Project: Hadoop Chukwa
>          Issue Type: Improvement
>            Reporter: Jerome Boulon
>
> Hourly and daily rolling are currently done with an M/R job, but all spill files are already sorted, so the rolling is just a merge sort.
> Doing that from a standalone application will be more efficient than using M/R.
> Another way to implement this would be to take advantage of the latest version of Pig (multiple-query optimization) and do the rolling once a day, at the same time as we compute the daily metrics (since the data has already been loaded by Pig).
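As an illustration of why a standalone merge could be enough when every spill file is already sorted, here is a minimal Python sketch of a k-way merge. It is not Chukwa's code: the function name `roll_sorted_spills`, the `(sort key, payload)` record shape, and the in-memory lists standing in for spill files are all assumptions made for the example.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical record shape: (sort key, payload). Real spill files would be
# read from HDFS; plain lists stand in for them here.
Record = Tuple[str, str]


def roll_sorted_spills(spills: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """Lazily merge already-sorted spill streams into one sorted stream.

    heapq.merge keeps only one pending record per input in memory, so the
    cost is a single sequential pass over each input, with no re-sorting.
    """
    return heapq.merge(*spills, key=lambda rec: rec[0])


# Two made-up hourly spills, each already sorted by timestamp key.
hourly_spills = [
    [("2009-06-19T20:01", "a"), ("2009-06-19T20:30", "c")],
    [("2009-06-19T20:15", "b"), ("2009-06-19T20:45", "d")],
]

rolled = list(roll_sorted_spills(hourly_spills))
for key, payload in rolled:
    print(key, payload)
```

Because the inputs are consumed lazily, the same pattern would apply to file-backed iterators, which is the property that makes a single-process merge plausible as a replacement for a full M/R pass.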

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

