hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2034) length in getBlockRange becomes -ve when reading only from currently being written blk
Date Mon, 06 Jun 2011 18:26:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045003#comment-13045003

Daryn Sharp commented on HDFS-2034:

Looking good!

bq. The reason I am hesitant about adding an assert for reading past the last block is because
it has to go out of its way to calculate the length of all the blocks (which is already being
done by the caller of this function) and that does not seem to be the job of this function.

Sure, I agree that it's usually ok to omit sanity checks in private methods.  Especially if
the method will blow up with an exception if the arg is insane.  Ex. not checking for nulls.
 In this case, omitting a sanity check of the offset creates another latent defect similar
to ones being fixed.  Let's imagine that in the future a caller passes an offset beyond the
length of the file.  The caller will incorrectly get the last block, and some poor dev will
have to debug it.

I took a peek, and only 1 assert is needed, and it's really cheap because all the values are
pre-computed and it's just a couple field accesses.  {{DFSInputStream}}'s {{getFileLength()}}
returns the full size of the located blocks and the bytes in the last unwritten block.  Perfect.
 The first line of the method could be:

{{assert(offset < getFileLength(), "offset exceeds file length");}}

I'll defer to Todd if wishes to contradict my opinion.

> length in getBlockRange becomes -ve when reading only from currently being written blk
> --------------------------------------------------------------------------------------
>                 Key: HDFS-2034
>                 URL: https://issues.apache.org/jira/browse/HDFS-2034
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: John George
>            Assignee: John George
>            Priority: Minor
>         Attachments: HDFS-2034-1.patch, HDFS-2034-1.patch, HDFS-2034.patch
> This came up during HDFS-1907. Posting an example that Todd posted in HDFS-1907 that
brought out this issue.
> {quote}
> Here's an example sequence to describe what I mean:
> 1. open file, write one and a half blocks
> 2. call hflush
> 3. another reader asks for the first byte of the second block
> {quote}
> In this case since offset is greater than the completed block length, the math in getBlockRange()
of DFSInputStreamer.java will set "length" to negative.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message