hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Date Thu, 22 Apr 2010 22:56:51 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860058#action_12860058

Hairong Kuang commented on HDFS-1057:

In the trunk I think there is a more elegant way of solving the problem.

In each ReplcaBeingWritten, we could have two more fields to keep track of the last consistent
state: replica length and the last chunk's crc. The last consistent state is defined as the
replica length right after the most recent packet has been flushed to the disk. So when a
read request comes in, if the last byte requested falls into a being-written last chunk, the
datanode returns the last chunk upto the last consistent state. The crc is read from memory
in stead of reading from the disk. 

This will also make the replica recovery problem filed in HDFS-1103 easier to solve.

> Concurrent readers hit ChecksumExceptions if following a writer to very end of file
> -----------------------------------------------------------------------------------
>                 Key: HDFS-1057
>                 URL: https://issues.apache.org/jira/browse/HDFS-1057
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Priority: Critical
> In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush().
Therefore, if there is a concurrent reader, it's possible to race here - the reader will see
the new length while those bytes are still in the buffers of BlockReceiver. Thus the client
will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the
file is made accessible to readers even though it is not stable.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message