hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4213) When the client calls hsync, allows the client to update the file length in the NameNode
Date Wed, 28 Nov 2012 01:20:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505137#comment-13505137
] 

Aaron T. Myers commented on HDFS-4213:
--------------------------------------

Hey Jing, I don't think this patch actually works. On the line in DFSClient#flushOrSync where
you set the value of lastBlockLength equal to {{this.streamer.block.getNumBytes()}}, {{this.streamer.block}}
may be null. When I applied the patch to trunk and ran TestHFlush, I noticed that the test actually
prints a stack trace caused by an NPE in the DFSOutputStream. Unfortunately, the test appears
to pass in spite of this because TestHFlush#doTheJob wraps everything in a try/catch with
the following catch block:
{code}
} catch (Exception e) {
  e.printStackTrace();
}
{code}

This will obviously need to be fixed before this can be committed. :)
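
To illustrate why a catch block like the one above hides the failure, here is a minimal, self-contained sketch. It is not the actual TestHFlush code: the class and method names are hypothetical, and the null dereference only stands in for {{this.streamer.block}} being null.

```java
// Hypothetical stand-in for the test body; none of these names come from the patch.
class SwallowedExceptionSketch {

    /** Simulates the write path hitting a null block, as described above. */
    static void doWork() {
        Object block = null; // assumption: stands in for this.streamer.block being null
        block.hashCode();    // throws NullPointerException
    }

    /** Mirrors the current pattern: the failure is printed and then lost. */
    static boolean runSwallowing() {
        try {
            doWork();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return true; // the caller still sees a "pass"
    }

    /** Alternative: let the exception propagate so the test framework reports a failure. */
    static void runPropagating() {
        doWork();
    }
}
```

With the first variant the stack trace shows up in the log but the caller cannot tell that anything went wrong. With the second, the NPE reaches the test runner and the test fails.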

I also have a few other little comments on the code:

# Is this expected to ever legitimately happen from a client's perspective? Or would it just
indicate a bug in HDFS? If the latter, we should probably change this to an AssertionError:
{code}
if (lastBlock == null) {
  throw new IOException("The last block for path " + this.getFullPathName()
      + " is null when updating its length");
}
{code}
# Similarly, it seems that this case should only occur in the case of an HDFS bug.
If so, we should probably change this to an AssertionError:
{code}
if (!(lastBlock instanceof BlockInfoUnderConstruction)) {
  throw new IOException("The last block for path " + this.getFullPathName()
      + " is not a BlockInfoUnderConstruction when updating its length");
}
{code}
# I recommend you rename the enum value "SYNC_UPDATELENGTH" to just "UPDATE_LENGTH". That
it's referring to sync should already be clear from the fact that it's a member of the SyncType
enum.
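
A rough sketch of what the first two suggestions could look like if both conditions are treated as internal invariants. The block classes here are empty hypothetical stand-ins, not the real HDFS types, and the helper method name is made up for illustration.

```java
// Hypothetical stand-ins; the real classes live in HDFS and are not reproduced here.
class BlockInfo {}
class BlockInfoUnderConstruction extends BlockInfo {}

class LastBlockCheckSketch {

    /**
     * Returns the last block as a BlockInfoUnderConstruction, or throws
     * AssertionError if an internal invariant is violated. An AssertionError
     * signals a bug in HDFS rather than an I/O condition a client could handle.
     */
    static BlockInfoUnderConstruction requireUnderConstruction(
            BlockInfo lastBlock, String path) {
        if (lastBlock == null) {
            throw new AssertionError("The last block for path " + path
                + " is null when updating its length");
        }
        if (!(lastBlock instanceof BlockInfoUnderConstruction)) {
            throw new AssertionError("The last block for path " + path
                + " is not a BlockInfoUnderConstruction when updating its length");
        }
        return (BlockInfoUnderConstruction) lastBlock;
    }
}
```

If either case can legitimately be triggered by a client, IOException remains the better choice, since AssertionError is an Error and is not normally caught.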

Regarding the interface, I'm a little concerned with how the introduction of this new hsync()
method to HDFS relates to the hsync() methods in various classes in Common. In particular,
most clients that I'm aware of don't actually directly use HdfsDataOutputStream, but instead
use FSDataOutputStream and call {{hsync()}} directly on that. However, this patch doesn't
add an {{hsync(EnumSet<SyncType>)}} method to the FSDataOutputStream class or the Syncable
interface. I think we should consider doing so, and if we do, then we would need to move the
SyncType enum out of HDFS and into Common. What do you think?
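
For discussion, the shape of that interface change might look roughly like the sketch below. Everything in it is an assumption: {{SyncableSketch}} and {{RecordingStream}} are hypothetical names, the enum shows only the one member discussed above (using the proposed UPDATE_LENGTH name), and the real Syncable interface has other methods that are omitted here.

```java
import java.io.IOException;
import java.util.EnumSet;

// Assumption: the enum moved out of HDFS, with only the member discussed above.
enum SyncType { UPDATE_LENGTH }

// Hypothetical interface; not the actual Syncable from Common.
interface SyncableSketch {
    /** Existing-style call with no extra behavior requested. */
    void hsync() throws IOException;

    /** Proposed overload that lets callers request extra sync behavior. */
    void hsync(EnumSet<SyncType> syncTypes) throws IOException;
}

// Toy implementation that only records what was requested.
class RecordingStream implements SyncableSketch {
    EnumSet<SyncType> lastSyncTypes;

    @Override
    public void hsync() throws IOException {
        // The no-argument form delegates with an empty set of sync types.
        hsync(EnumSet.noneOf(SyncType.class));
    }

    @Override
    public void hsync(EnumSet<SyncType> syncTypes) throws IOException {
        lastSyncTypes = EnumSet.copyOf(syncTypes);
    }
}
```

A caller holding only the generic stream type could then write {{out.hsync(EnumSet.of(SyncType.UPDATE_LENGTH))}} without casting to an HDFS-specific class.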
                
> When the client calls hsync, allows the client to update the file length in the NameNode
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-4213
>                 URL: https://issues.apache.org/jira/browse/HDFS-4213
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 3.0.0
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-4213.001.patch, HDFS-4213.002.patch, HDFS-4213.003.patch, HDFS-4213.004.patch,
HDFS-4213.005.patch, HDFS-4213.006.patch
>
>
> As per discussion in HDFS-3960 and HDFS-2802, clients that need strong consistency must
> update the file length at the NameNode, so a special sync/flush is required to get the length
> of files that are being written when snapshots are taken of them. This jira implements
> this sync-with-updating-length by 1) calling ClientProtocol#fsync(), and 2) adding a new field
> to ClientProtocol#fsync() to indicate the length information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
