hadoop-hdfs-issues mailing list archives

From "sam rash (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Date Thu, 08 Apr 2010 22:30:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855189#action_12855189 ]

sam rash commented on HDFS-1057:

re: recovery code
I will check with Dhruba about how 0.20 handles the recovery (I am not familiar with that part
of the code yet).

re: tests
great, thanks. I was able to construct a case that deterministically fails:

1. writer opens a file, writes and syncs some number of bytes that includes a partial chunk
2. reader opens that stream and reads some bytes (to make the datanode open the metadata and
block data streams)
3. writer writes additional bytes that fill out the partial chunk
4. reader continues reading up to what was the end of the file when it opened the stream
5. reader throws a CRC error
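
The steps above can be modelled without a cluster. The following is only a toy sketch, not
HDFS code: the class PartialChunkRace, the 512-byte CHUNK_SIZE, and the single stored
checksum for the last chunk are all assumptions made for illustration, and the real datanode
mechanism may differ in detail. It shows how a reader that still believes in the old end of
file can fail verification once the checksum of the partial chunk has been rewritten.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

// Toy model (hypothetical, not HDFS code): block data plus one stored
// checksum covering the last, possibly partial, chunk.
public class PartialChunkRace {
    static final int CHUNK_SIZE = 512; // assumed bytes per checksum

    private byte[] blockData = new byte[0];
    private long lastChunkCrc; // stands in for the last entry of the meta file

    static long crcOf(byte[] buf, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(buf, off, len);
        return crc.getValue();
    }

    // Writer appends bytes and rewrites the checksum of the last chunk.
    void writeAndSync(byte[] more) {
        byte[] grown = Arrays.copyOf(blockData, blockData.length + more.length);
        System.arraycopy(more, 0, grown, blockData.length, more.length);
        blockData = grown;
        int start = ((blockData.length - 1) / CHUNK_SIZE) * CHUNK_SIZE;
        lastChunkCrc = crcOf(blockData, start, blockData.length - start);
    }

    int visibleLength() {
        return blockData.length;
    }

    // Reader checks the last chunk as it looked at open time against
    // whatever checksum is stored right now.
    boolean readerVerifies(int lengthAtOpen) {
        int start = ((lengthAtOpen - 1) / CHUNK_SIZE) * CHUNK_SIZE;
        return crcOf(blockData, start, lengthAtOpen - start) == lastChunkCrc;
    }

    // Returns {verification before step 3, verification after step 3}.
    static boolean[] runScenario() {
        PartialChunkRace replica = new PartialChunkRace();
        byte[] first = new byte[CHUNK_SIZE + 100]; // full chunk + partial chunk
        Arrays.fill(first, (byte) 1);
        replica.writeAndSync(first);                 // step 1
        int lengthAtOpen = replica.visibleLength();  // step 2
        boolean before = replica.readerVerifies(lengthAtOpen);
        byte[] rest = new byte[CHUNK_SIZE - 100];    // fills out the partial chunk
        Arrays.fill(rest, (byte) 2);
        replica.writeAndSync(rest);                  // step 3
        boolean after = replica.readerVerifies(lengthAtOpen); // steps 4 and 5
        return new boolean[] { before, after };
    }

    public static void main(String[] args) {
        boolean[] r = runScenario();
        System.out.println("verifies before writer continues: " + r[0]);
        System.out.println("verifies after chunk is filled:   " + r[1]);
    }
}
```

In this model the first verification should succeed and the second should fail, because the
stored checksum now covers more bytes than the reader is checking.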

I will see if I can construct deterministic test cases for the other ones as well, or use them
directly.

thanks again

> Concurrent readers hit ChecksumExceptions if following a writer to very end of file
> -----------------------------------------------------------------------------------
>                 Key: HDFS-1057
>                 URL: https://issues.apache.org/jira/browse/HDFS-1057
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Priority: Critical
> BlockReceiver.receivePacket calls replicaInfo.setBytesOnDisk before calling flush().
Therefore, if there is a concurrent reader, it's possible to race here - the reader will see
the new length while those bytes are still in the buffers of BlockReceiver. Thus the client
will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the
file is made accessible to readers even though it is not stable.
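
The ordering problem described above can be illustrated with plain java.io streams. This is
a minimal sketch under stated assumptions, not the BlockReceiver code: the class
BytesOnDiskOrdering and its two methods are hypothetical, a ByteArrayOutputStream stands in
for the block file, and a local variable stands in for the length published through
setBytesOnDisk.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration of "publish length, then flush" versus
// "flush, then publish length". Not taken from the HDFS source.
public class BytesOnDiskOrdering {

    // Returns {published length, bytes a reader could actually read}
    // as observed at the moment the length is published.
    static long[] publishThenFlush(byte[] packet) {
        try {
            ByteArrayOutputStream disk = new ByteArrayOutputStream();
            BufferedOutputStream out = new BufferedOutputStream(disk, 64 * 1024);
            out.write(packet, 0, packet.length); // stays in the buffer
            long published = packet.length;      // length made visible too early
            long readable = disk.size();         // what has really reached "disk"
            out.flush();
            return new long[] { published, readable };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static long[] flushThenPublish(byte[] packet) {
        try {
            ByteArrayOutputStream disk = new ByteArrayOutputStream();
            BufferedOutputStream out = new BufferedOutputStream(disk, 64 * 1024);
            out.write(packet, 0, packet.length);
            out.flush();                         // bytes reach "disk" first
            long published = packet.length;      // only now make the length visible
            long readable = disk.size();
            return new long[] { published, readable };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] early = publishThenFlush(new byte[4096]);
        long[] late = flushThenPublish(new byte[4096]);
        System.out.println("publish then flush: published=" + early[0] + " readable=" + early[1]);
        System.out.println("flush then publish: published=" + late[0] + " readable=" + late[1]);
    }
}
```

With a 4096-byte packet and a 64 KB buffer, the first ordering publishes a length of 4096
while nothing is readable yet, which is the window in which a concurrent reader could see an
EOF or a checksum error. The second ordering closes that window in this model.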

