hadoop-hdfs-issues mailing list archives

From "Kevin Beyer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-196) File length not reported correctly after application crash
Date Wed, 10 Jun 2015 06:16:01 GMT

    [ https://issues.apache.org/jira/browse/HDFS-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580084#comment-14580084 ]

Kevin Beyer commented on HDFS-196:
----------------------------------

I've learned about the soft and hard limits on the write lease.  After the hard limit expired,
the file length was corrected to the same number of bytes found by reading the file.  So this
is not a bug.
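
For what it's worth, readers don't have to wait out the hard limit.  Here is a minimal sketch
(assuming the DistributedFileSystem.recoverLease() call available in Hadoop 2.x) of forcing the
NameNode to reconcile the length right away:

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    // Ask the NameNode to recover the dead writer's lease, then poll until
    // the file is closed and its length is reconciled with the datanodes.
    static long recoverAndGetLength(FileSystem fs, Path path) throws Exception {
        DistributedFileSystem dfs = (DistributedFileSystem) fs;
        // recoverLease() returns true once the file is closed; block
        // recovery itself runs asynchronously on the datanodes.
        while (!dfs.recoverLease(path)) {
            Thread.sleep(1000);
        }
        return fs.getFileStatus(path).getLen();  // now trustworthy
    }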


However, I have a few ideas that might help:

1. The file stats could update when the soft limit expires.  This would shrink the window
of inconsistency from 1 hour to 1 minute.

2. Allow the writing application to control the "safe" length and limit readers to the safe
length.  Readers could set an option to read the unsafe bytes (or the default could be to read
the full length, which is backwards compatible with current behavior but seems more dangerous).
 If the lease is not recovered before the hard limit expires, the unsafe bytes are discarded
(a writer option could control this as well).  This would let applications avoid partial
record reads.

3. A simple way for readers to detect that there is an active soft/hard lease on a file,
probably exposed in the FileStatus (see the sketch after this list).

4. The hard limit duration should be an option when opening for write.  The default should
be zero.

5. A simple way to terminate a hard lease.
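
For idea 3, a rough sketch of the kind of check I mean, using the
DistributedFileSystem.isFileClosed() call in recent Hadoop 2.x releases (it answers "is this
file still under construction?" rather than exposing the lease in FileStatus):

    import java.io.IOException;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    // Readers that must not see partial records can skip files that are
    // still under construction, i.e. still covered by an active lease.
    static boolean safeToRead(DistributedFileSystem dfs, Path path)
            throws IOException {
        // True once the last block is finalized, so the length in
        // FileStatus is exact and no writer holds a lease on the file.
        return dfs.isFileClosed(path);
    }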


> File length not reported correctly after application crash
> ----------------------------------------------------------
>
>                 Key: HDFS-196
>                 URL: https://issues.apache.org/jira/browse/HDFS-196
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Doug Judd
>
> Our application (Hypertable) creates a transaction log in HDFS.  This log is written
> with the following pattern:
> out_stream.write(header, 0, 7);
> out_stream.sync();
> out_stream.write(data, 0, amount);
> out_stream.sync();
> [...]
> However, if the application crashes and then comes back up again, the following statement
> length = mFilesystem.getFileStatus(new Path(fileName)).getLen();
> returns the wrong length.  Apparently this is because the method fetches length information
> from the NameNode, which is stale.  Ideally, a call to getFileStatus() would return the
> accurate file length by fetching the size of the last block from the primary datanode.
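
A reader-side workaround for the quoted report: a sketch, assuming the Hadoop 2.x
HdfsDataInputStream client class, that asks the datanode pipeline for the visible length of
the last block instead of trusting the NameNode's cached metadata:

    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.client.HdfsDataInputStream;

    // Length a reader can actually see, including bytes in the last,
    // still-being-written block that the NameNode does not yet count.
    static long visibleLength(FileSystem fs, Path path) throws IOException {
        FSDataInputStream in = fs.open(path);
        try {
            if (in instanceof HdfsDataInputStream) {
                return ((HdfsDataInputStream) in).getVisibleLength();
            }
            return fs.getFileStatus(path).getLen();  // non-HDFS fallback
        } finally {
            in.close();
        }
    }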



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
