chukwa-dev mailing list archives

From "Jiaqi Tan (JIRA)" <>
Subject [jira] Commented: (CHUKWA-410) Does the BackfillingLoader return only after HDFS blocks are committed?
Date Fri, 06 Nov 2009 04:44:32 GMT


Jiaqi Tan commented on CHUKWA-410:

 > What do you mean by: "the raw log files are complete"?
 >  --> the datasink file from the collector is complete? 

Actually, let me check on that. I was just wondering if the semantics of WAIT_TILL_FINISHED
could result in any races, i.e. blocks closed without the file being fully written, and the
Demux hitting an incomplete file and processing only the blocks that had been closed so far.
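
To make the suspected race concrete, here is a toy model in Java. It does not use the
Hadoop or Chukwa APIs; the class ToyFile, the method writeRecords, and the rule that a
reader sees only records in closed blocks are all assumptions made up for illustration.
It only models the hypothesis above and is not confirmed HDFS behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (NOT the HDFS API): readers see only records in blocks that were closed. */
public class BlockVisibilitySketch {

    /** Hypothetical stand-in for a file that is written block by block. */
    static final class ToyFile {
        private final int blockSize;                                   // records per block
        private final List<String> closedRecords = new ArrayList<>();  // visible to readers
        private final List<String> openBlock = new ArrayList<>();      // not yet visible
        private boolean closed = false;

        ToyFile(int blockSize) {
            if (blockSize <= 0) throw new IllegalArgumentException("blockSize must be > 0");
            this.blockSize = blockSize;
        }

        /** Buffer one record; seal the current block once it is full. */
        void append(String record) {
            if (closed) throw new IllegalStateException("file already closed");
            openBlock.add(record);
            if (openBlock.size() == blockSize) sealBlock();
        }

        /** Seal any partial last block and mark the whole file as closed. */
        void close() {
            if (!openBlock.isEmpty()) sealBlock();
            closed = true;
        }

        private void sealBlock() {
            closedRecords.addAll(openBlock);
            openBlock.clear();
        }

        boolean isClosed() { return closed; }

        /** What a reader (the Demux, in this analogy) would see right now. */
        List<String> readVisible() { return new ArrayList<>(closedRecords); }
    }

    /** Write 'total' records; optionally return before the file is closed. */
    static ToyFile writeRecords(int total, int blockSize, boolean closeAtEnd) {
        ToyFile file = new ToyFile(blockSize);
        for (int i = 0; i < total; i++) file.append("record-" + i);
        if (closeAtEnd) file.close();
        return file;
    }

    public static void main(String[] args) {
        // The loader "returns" after its last write, but the file is still open.
        ToyFile file = writeRecords(10, 4, false);
        System.out.println("closed=" + file.isClosed()
                + " visible=" + file.readVisible().size());
        // Only after close() does a reader see every record.
        file.close();
        System.out.println("closed=" + file.isClosed()
                + " visible=" + file.readVisible().size());
    }
}
```

In this model a reader that starts before close() sees 8 of the 10 records, which has the
same shape as the Demux processing only part of a log file. Whether the real
BackfillingLoader can return in that state is exactly the open question.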

> Does the BackfillingLoader return only after HDFS blocks are committed?
> -----------------------------------------------------------------------
>                 Key: CHUKWA-410
>                 URL:
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>    Affects Versions: 0.3.0
>         Environment: Hadoop 0.20.0, Debian 4 (Etch), Chukwa rev 817532
>            Reporter: Jiaqi Tan
> I see that the BackfillingLoader is set to AdaptorShutdownPolicy.WAIT_TILL_FINISHED,
> what are the semantics of this? Does this mean that the BackfillingLoader returns after the
> last HDFS write request is made, but the DFSClient could continue to be flushing blocks to
> the DataNodes in the background? Or does that mean that the entire file has been written/flushed
> to HDFS and closed and fully available?
> I'm running the Demux immediately after the BackfillingLoader is complete; the raw log
> files are complete, but the Demux picks up only half of the entries in those log files. Could
> this be because some blocks are not closed yet?

