hadoop-hdfs-issues mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-732) HDFS files are ending up truncated
Date Wed, 28 Oct 2009 00:27:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12770722#action_12770722 ]

Hairong Kuang commented on HDFS-732:

First of all, dfs supports non-blocking writes. Although an application may successfully write
18654752 bytes and return, the bytes may still be buffered on the client side and not yet
pushed to the datanodes in the pipeline.

From the logs that you provided, it seems to me that the packet starting at byte 17825793
was not successfully pushed to all datanodes. For some reason, three datanodes failed in a
row. The dfs client tried to resend the packet twice; as a result, the generation stamp of
the block was bumped from 76799972 to 76840998 and then to 76840999, and the replica length
was truncated to 17825792. Eventually the dfs client failed with the error "All datanodes
xxx.yyy.zzz.44:uuu10 are bad. Aborting..". Afterwards the NN tried to recover this unclosed
file. Since the only valid replica, at xxx.yyy.zzz.44:uuu10, had 17825792 bytes, the block
ended up with 17825792 bytes.

Basically dfs does not provide any guarantee on the file length if a dfs client goes away
and the file is left unclosed. But in 0.21, if an application calls hflush(), then dfs
guarantees that hflushed bytes will not be truncated on error recovery.
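To illustrate the semantics described above, here is a toy model of the client-side write
path (this is an illustration I wrote for clarity, not Hadoop's actual DFSClient code): a
write() returns as soon as bytes are buffered locally, and the bytes only count toward the
durable, ack'd length once a flush pushes them through the pipeline and the datanodes
acknowledge them.

```java
import java.io.ByteArrayOutputStream;

// Toy model: write() buffers bytes on the client; they become part of the
// ack'd replica length only after hflush() pushes them and collects acks.
class ToyDfsOutputStream {
    private final ByteArrayOutputStream clientBuffer = new ByteArrayOutputStream();
    private long ackedLength = 0; // bytes acknowledged by all datanodes in the pipeline

    // Non-blocking from the application's point of view: returns once the
    // bytes are buffered locally, before any datanode has seen them.
    void write(byte[] data) {
        clientBuffer.write(data, 0, data.length);
    }

    // Simulates pushing all buffered packets to the pipeline and waiting
    // for acks; after this, the bytes are safe against truncation on recovery.
    void hflush() {
        ackedLength += clientBuffer.size();
        clientBuffer.reset();
    }

    long ackedLength()    { return ackedLength; }
    long bufferedLength() { return clientBuffer.size(); }
}
```

If the client dies before hflush(), only ackedLength bytes are guaranteed to survive block
recovery, which matches the truncation observed in this issue.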

> HDFS files are ending up truncated
> ----------------------------------
>                 Key: HDFS-732
>                 URL: https://issues.apache.org/jira/browse/HDFS-732
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20.1
>            Reporter: Christian Kunz
> We recently started to use hadoop-0.20.1 in our production environment (less than 2 weeks
> ago) and already had 3 instances of truncated files, more than we had for months using hadoop-0.18.3.
> Writing is done using libhdfs, although it rather seems to be a problem on the server.
> I will post some relevant logs (they are too large to be put into the description).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
