hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-877) Client-driven checksum verification not functioning
Date Thu, 07 Jan 2010 21:41:16 GMT

    [ https://issues.apache.org/jira/browse/HDFS-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797796#action_12797796 ]

Todd Lipcon commented on HDFS-877:

This may turn out to be reasonably tricky to solve. The issue is that the lastPacketInBlock=true
flag arrives in a separate, empty packet after all the data has been read. Consider the following scenario:

# The block is exactly N bytes long.
# The client determines (or already knows) the file length and thus reads exactly up to byte N,
but not past it. This is the case for MapReduce jobs when an input split doesn't cross block
boundaries (e.g., any input file smaller than one block).
# In this case, the server still sends the empty "lastPacketInBlock" packet, but the client
never reads it (since it doesn't read ahead in any way).

Point 2 above is currently being enforced by DFSInputStream, since it calls getFileLength()
before passing a read() call down into the BlockReader.
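
To make the scenario concrete, here is a minimal toy model in Java. It is not the real HDFS client code: Packet, ToyBlockReader, readWithLengthGuard and readUntilEof are hypothetical names invented for illustration, and the only behavior carried over from the description above is that the reader pulls a packet only when its buffer is empty and that the flag arrives in a trailing empty packet.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class LastPacketDemo {
    /** Toy stand-in for a packet sent by the DataNode (not the real wire format). */
    static final class Packet {
        final int dataLen;
        final boolean lastPacketInBlock;
        Packet(int dataLen, boolean lastPacketInBlock) {
            this.dataLen = dataLen;
            this.lastPacketInBlock = lastPacketInBlock;
        }
    }

    /** Hypothetical reader that fetches a packet only when its buffer is empty. */
    static final class ToyBlockReader {
        private final Deque<Packet> wire = new ArrayDeque<>();
        private int buffered = 0;
        boolean gotEOS = false;

        ToyBlockReader(int blockLen, int packetSize) {
            for (int off = 0; off < blockLen; off += packetSize) {
                wire.add(new Packet(Math.min(packetSize, blockLen - off), false));
            }
            wire.add(new Packet(0, true)); // trailing empty packet carrying the flag
        }

        /** Returns the number of bytes consumed, or -1 at end of block. */
        int read(int len) {
            if (buffered == 0) {
                Packet p = gotEOS ? null : wire.poll();
                if (p == null) return -1;
                if (p.lastPacketInBlock) {
                    gotEOS = true; // in this model, the point where checksumOk() would follow
                    return -1;
                }
                buffered = p.dataLen;
            }
            int n = Math.min(len, buffered);
            buffered -= n;
            return n;
        }
    }

    /** Models the caller-side guard: read only while pos < fileLen. */
    static ToyBlockReader readWithLengthGuard(int fileLen, int packetSize, int bufSize) {
        ToyBlockReader r = new ToyBlockReader(fileLen, packetSize);
        long pos = 0;
        while (pos < fileLen) {
            int n = r.read((int) Math.min(bufSize, fileLen - pos));
            if (n < 0) break;
            pos += n;
        }
        return r;
    }

    /** Models a caller that keeps reading until the reader reports end of block. */
    static ToyBlockReader readUntilEof(int fileLen, int packetSize, int bufSize) {
        ToyBlockReader r = new ToyBlockReader(fileLen, packetSize);
        while (r.read(bufSize) >= 0) {
            // keep draining
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println("guarded gotEOS = " + readWithLengthGuard(1024, 256, 100).gotEOS);
        System.out.println("drained gotEOS = " + readUntilEof(1024, 256, 100).gotEOS);
    }
}
```

In this model the guarded reader consumes all 1024 bytes and stops without ever pulling the trailing packet, so gotEOS stays false; only the caller that issues one extra read sees the flag.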

A couple of things to investigate:
# Is the check currently done by DFSInputStream important for limiting the length visible
to a reader for an in-progress block? Or can that limit be satisfied by passing only the visible
length to the OP_READ_BLOCK call? If the length limitation can be ignored in the DFSInputStream
layer, I think that would solve the issue fairly trivially.
# Alternatively, can we invert BlockReader.readChunk so that it reads ahead a packet? That
is to say, if after a read, the internal buffer is emptied, can we read the *next* packet
at this point? I don't really like this solution...
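
The second idea could look roughly like the following Java sketch. Again, this is a toy model with hypothetical names (Packet, ReadAheadReader, readWithLengthGuard), not a patch against BlockReader.readChunk; it only illustrates fetching the next packet as soon as a read empties the internal buffer.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ReadAheadDemo {
    /** Toy stand-in for a packet sent by the DataNode (not the real wire format). */
    static final class Packet {
        final int dataLen;
        final boolean lastPacketInBlock;
        Packet(int dataLen, boolean lastPacketInBlock) {
            this.dataLen = dataLen;
            this.lastPacketInBlock = lastPacketInBlock;
        }
    }

    /** Hypothetical reader that reads one packet ahead whenever its buffer drains. */
    static final class ReadAheadReader {
        private final Deque<Packet> wire = new ArrayDeque<>();
        private int buffered = 0;
        boolean gotEOS = false;

        ReadAheadReader(int blockLen, int packetSize) {
            for (int off = 0; off < blockLen; off += packetSize) {
                wire.add(new Packet(Math.min(packetSize, blockLen - off), false));
            }
            wire.add(new Packet(0, true)); // trailing empty packet carrying the flag
        }

        private void fetchNextPacket() {
            Packet p = wire.poll();
            if (p == null) return;
            if (p.lastPacketInBlock) {
                gotEOS = true;
            } else {
                buffered = p.dataLen;
            }
        }

        /** Returns the number of bytes consumed, or -1 at end of block. */
        int read(int len) {
            if (buffered == 0 && !gotEOS) fetchNextPacket(); // only needed on the first call
            if (buffered == 0) return -1;
            int n = Math.min(len, buffered);
            buffered -= n;
            // Read ahead: if this read emptied the buffer, pull the next packet now,
            // so the trailing empty packet is seen without a further read() call.
            if (buffered == 0 && !gotEOS) fetchNextPacket();
            return n;
        }
    }

    /** Same caller-side guard as in the scenario: read only while pos < fileLen. */
    static ReadAheadReader readWithLengthGuard(int fileLen, int packetSize, int bufSize) {
        ReadAheadReader r = new ReadAheadReader(fileLen, packetSize);
        long pos = 0;
        while (pos < fileLen) {
            int n = r.read((int) Math.min(bufSize, fileLen - pos));
            if (n < 0) break;
            pos += n;
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println("gotEOS = " + readWithLengthGuard(1024, 256, 100).gotEOS);
    }
}
```

With read-ahead, the guarded caller ends with gotEOS set even though it never reads past byte N. One cost visible in the sketch is that a read() that already has the caller's bytes may still wait on the next packet before returning.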

> Client-driven checksum verification not functioning
> ---------------------------------------------------
>                 Key: HDFS-877
>                 URL: https://issues.apache.org/jira/browse/HDFS-877
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
> This is actually the reason for HDFS-734 (TestDatanodeBlockScanner timing out). The issue
> is that DFSInputStream relies on readChunk being called one last time at the end of the file
> in order to receive the lastPacketInBlock=true packet from the DN. However, DFSInputStream.read
> checks pos < getFileLength() before issuing the read. Thus gotEOS never shifts to true
> and checksumOk() is never called.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
