hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Date Wed, 18 May 2011 23:46:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035861#comment-13035861
] 

Tsz Wo (Nicholas), SZE commented on HDFS-1057:
----------------------------------------------

Hi Sam, I agree that there are bugs in Hadoop or any other software.  However, I don't see
any value of putting a test to fail every build but not fixing the so called "underlying problem".

BTW, it is questionable whether the test is a valid.  Are we supporting that many concurrent
readers?  How was the number chosen?  Machine has finite resource.  It is definitely easy
to create extreme tests and make them fail.

> Concurrent readers hit ChecksumExceptions if following a writer to very end of file
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-1057
>                 URL: https://issues.apache.org/jira/browse/HDFS-1057
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node
>    Affects Versions: 0.20-append, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: sam rash
>            Priority: Blocker
>             Fix For: 0.20-append, 0.21.0, 0.22.0
>
>         Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt,
conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt,
hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt
>
>
> In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush().
Therefore, if there is a concurrent reader, it's possible to race here - the reader will see
the new length while those bytes are still in the buffers of BlockReceiver. Thus the client
will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the
file is made accessible to readers even though it is not stable.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message