chukwa-dev mailing list archives

From "Jerome Boulon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CHUKWA-155) Job History status arrive out of order causing the status to update incorrectly.
Date Mon, 20 Apr 2009 21:18:47 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700938#action_12700938 ]

Jerome Boulon commented on CHUKWA-155:
--------------------------------------

+1 on asking the Hadoop team to add a time stamp, since we want to do some time-based analytics.

Demux is able to deal with any kind of data, as long as some rules are followed.
It is the parser's responsibility to provide:
- a time stamp: the one from the data if there is any, otherwise the default one provided by the Collector at the Chunk level
- a key that will group information together according to the data usage
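
A minimal sketch of these two rules, for illustration only. The class and method names (ParserRules, chooseTimestamp, groupingKey) and the "type/id" key layout are assumptions made for this example; they are not Chukwa classes or the actual Demux API.

```java
import java.util.OptionalLong;

// Hypothetical helper, not part of Chukwa.
final class ParserRules {
    private ParserRules() {}

    // Rule 1: prefer the time stamp found in the data; if there is none,
    // fall back to the default one set by the Collector at the Chunk level.
    static long chooseTimestamp(OptionalLong fromData, long collectorChunkTime) {
        return fromData.orElse(collectorChunkTime);
    }

    // Rule 2: build a key that groups related records together.
    // The "type/id" layout is an assumption; the right grouping depends
    // on how the data will be used.
    static String groupingKey(String dataType, String entityId) {
        return dataType + "/" + entityId;
    }
}
```

For example, a job history line without FINISH_TIME would get the collector time stamp, and every line for job_200903310541_1747 would share one grouping key.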

Regarding the case where the data does not contain any time stamp: the system will make a best
effort to partition the data based on the collector time stamp, but the parser could/should guarantee
the order by specifying a key that contains the SeqId + offset within the same chunk.
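
A rough sketch of such an ordering key, under stated assumptions: OrderedRecord and OrderBySeqId are hypothetical names, the zero-padded "seqId-offset" layout is one possible encoding, and seqId is assumed to be non-negative. None of this is the actual Chukwa implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical record type, not a Chukwa class.
final class OrderedRecord {
    final long seqId;    // sequence ID of the chunk the record came from
    final int offset;    // position of the record within that chunk
    final String status; // e.g. the JOB_STATUS value

    OrderedRecord(long seqId, int offset, String status) {
        this.seqId = seqId;
        this.offset = offset;
        this.status = status;
    }

    // Zero-padded so that plain string comparison matches numeric order.
    String orderKey() {
        return String.format(Locale.ROOT, "%020d-%010d", seqId, offset);
    }
}

final class OrderBySeqId {
    private OrderBySeqId() {}

    // Sorts records by their order key and returns the status of the
    // last one, regardless of the order in which they arrived.
    static String latestStatus(List<OrderedRecord> arrived) {
        if (arrived.isEmpty()) {
            throw new IllegalArgumentException("no records");
        }
        List<OrderedRecord> sorted = new ArrayList<>(arrived);
        sorted.sort(Comparator.comparing(OrderedRecord::orderKey));
        return sorted.get(sorted.size() - 1).status;
    }
}
```

With this key, a "SUCCESS" record that arrives before the earlier "RUNNING" record still sorts last, so the final status written would be "SUCCESS".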



> Job History status arrive out of order causing the status to update incorrectly.
> --------------------------------------------------------------------------------
>
>                 Key: CHUKWA-155
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-155
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection, Data Processors
>         Environment: Redhat 5.1, Java 6
>            Reporter: Eric Yang
>            Assignee: Jerome Boulon
>            Priority: Critical
>
> Job history contains lines like:
> Job JOBID="job_200903310541_1747" JOB_STATUS="RUNNING" .
> ...
> Job JOBID="job_200903310541_1747" FINISH_TIME="1238542231308" JOB_STATUS="SUCCESS" FINISHED_MAPS="1338" FINISHED_REDUCES="760" FAILED_MAPS="78" FAILED_REDUCES="43" COUNTERS="..." .
> When pushing the data through collectors and demux, the data can arrive out of order. The database is updated with status "RUNNING" instead of "SUCCESS".
> Chukwa Sequence ID can be used to sort out of order data before the data is pumped to database.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

