chukwa-dev mailing list archives

From "Jiaqi Tan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CHUKWA-349) State-machine generation across split files
Date Tue, 14 Jul 2009 22:49:14 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731179#action_12731179 ]

Jiaqi Tan commented on CHUKWA-349:
----------------------------------

This won't be a problem when a user tries to visualize an ongoing job; that case is fine. The
problem is that if you have, say, an ongoing map whose start MapAttempt is in the current Demux
boundary and whose end MapAttempt is in the next Demux, that map will be completely dropped.

In the current scheme of things, if the map (or reduce) isn't complete, you won't even see
it in the Swimlanes visualization. Supporting visualization of in-progress tasks would require
a complete rewrite and was never the intention.
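
To make the dropped-map case concrete, here is a minimal sketch in Python. It is not the
FSMBuilder code; the function name, the event tuples, and the task IDs are all hypothetical,
and it only assumes that each run pairs start and end events from its own input.

```python
def pair_within_run(events):
    """Pair start/end events seen in a single run.

    `events` is a list of (task_id, kind, timestamp) tuples, a made-up
    stand-in for MapAttempt start/end records. Starts left unmatched at
    the end of the run are simply discarded.
    """
    starts = {}
    completed = []
    for task_id, kind, ts in events:
        if kind == "start":
            starts[task_id] = ts
        elif kind == "end" and task_id in starts:
            completed.append((task_id, starts.pop(task_id), ts))
    return completed


# map_2 starts inside the first Demux boundary and ends in the next one.
run1 = [("map_1", "start", 0), ("map_1", "end", 5), ("map_2", "start", 4)]
run2 = [("map_2", "end", 9)]

print(pair_within_run(run1))  # only map_1 is paired
print(pair_within_run(run2))  # map_2's end has no start in this run
```

Neither run ever emits map_2, which is the dropped map described above.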

> State-machine generation across split files
> -------------------------------------------
>
>                 Key: CHUKWA-349
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-349
>             Project: Hadoop Chukwa
>          Issue Type: Improvement
>          Components: Data Processors
>    Affects Versions: 0.3.0
>            Reporter: Jiaqi Tan
>            Assignee: Jiaqi Tan
>             Fix For: 0.3.0
>
>
> Current SALSA state-machine generation assumes input files contain all starts and ends
> of all states; this may not be the case if the input data is sliced across Demux boundaries.
> There is a need to track incomplete data across multiple runs of the FSMBuilder and to expire
> and purge state once it has been kept past a certain duration.
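
One way to read the requirement in the issue description is sketched below in Python. This is
an illustration under stated assumptions, not the actual patch: the class name PendingStates,
the max_age parameter, and the event tuples are hypothetical, and the pending table is kept in
memory here, whereas separate FSMBuilder runs would need to persist it somewhere between runs.

```python
class PendingStates:
    """Carry unmatched start events from one run to the next (hypothetical)."""

    def __init__(self, max_age):
        self.max_age = max_age  # how long an unmatched start may be kept
        self.pending = {}       # task_id -> start timestamp

    def process_run(self, events, now):
        """Pair events from one run against this run's and earlier starts.

        `events` is a list of (task_id, kind, timestamp) tuples and `now`
        is the time of the run, in the same units as the timestamps.
        """
        completed = []
        for task_id, kind, ts in events:
            if kind == "start":
                self.pending[task_id] = ts
            elif kind == "end" and task_id in self.pending:
                completed.append((task_id, self.pending.pop(task_id), ts))
        # Expire and purge starts that have waited longer than max_age.
        expired = [t for t, start in self.pending.items()
                   if now - start > self.max_age]
        for t in expired:
            del self.pending[t]
        return completed


tracker = PendingStates(max_age=100)
first = tracker.process_run([("map_2", "start", 4)], now=10)   # start is kept
second = tracker.process_run([("map_2", "end", 9)], now=20)    # end arrives next run
```

Here the second run completes map_2 because its start was carried over, and a start whose end
never arrives is purged once it is older than max_age rather than being kept forever.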

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

