hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
Date Tue, 04 May 2010 19:33:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863961#action_12863961 ]

Todd Lipcon commented on HDFS-1057:
-----------------------------------

Some comments:

- IOUtils.readFileChannelFully:
-- IOUtils.readFully spelled "premature" wrong, but we might as well not duplicate the typo here :)
-- no need to increment the 'off' variable, since the ByteBuffer tracks its own position internally
-- can you use ByteBuffer.hasRemaining() as the loop condition instead of updating the toRead variable?
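
The loop shape suggested above might look roughly like the following. This is only a sketch: the patch itself is not quoted in this message, so the signature (a byte array plus offset and length) and the class name ReadFullySketch are assumptions made for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical holder class; the real method would live in IOUtils.
final class ReadFullySketch {
    private ReadFullySketch() {}

    // Reads exactly len bytes from the channel's current position into
    // buf[off .. off+len). The signature is assumed, not taken from the patch.
    static void readFileChannelFully(FileChannel fileCh, byte[] buf, int off, int len)
            throws IOException {
        // The wrapped buffer tracks how much has been filled, so neither
        // 'off' nor a separate 'toRead' counter needs updating in the loop.
        ByteBuffer byteBuffer = ByteBuffer.wrap(buf, off, len);
        while (byteBuffer.hasRemaining()) {
            // FileChannel.read returns -1 once end-of-stream is reached.
            if (fileCh.read(byteBuffer) < 0) {
                throw new EOFException("Premature EOF reading from " + fileCh);
            }
        }
    }
}
```

Each call to read advances both the channel position and the buffer position, which is why no manual bookkeeping is needed.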

- ChecksumUtil:
-- Missing apache license header
-- Since it only contains static methods, either make it an abstract class or give it a private constructor
-- Perhaps add an assert like: assert checksumOff + numChunks * checksumSize < dataOff;
 (to make sure there's enough space in the checksum buffer)
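
For illustration, the private-constructor variant and the suggested assert could be sketched as below. The class name ChecksumUtilSketch and the helper checksumRegionFits are hypothetical, and the layout (checksums stored ahead of the data within one buffer) is an assumption inferred from the variable names in the assert.

```java
// Hypothetical stand-in for ChecksumUtil; only the structure is the point.
final class ChecksumUtilSketch {
    // Static-only utility: a private constructor prevents instantiation.
    private ChecksumUtilSketch() {}

    // True when numChunks checksums starting at checksumOff end before
    // dataOff. Uses the strict '<' from the comment above.
    static boolean checksumRegionFits(int checksumOff, int numChunks,
                                      int checksumSize, int dataOff) {
        return checksumOff + numChunks * checksumSize < dataOff;
    }

    // Placeholder showing where the assert would sit in a real method.
    static void checkLayout(int checksumOff, int numChunks,
                            int checksumSize, int dataOff) {
        // Only evaluated when the JVM runs with assertions enabled (-ea).
        assert checksumRegionFits(checksumOff, numChunks, checksumSize, dataOff)
            : "checksum region overlaps data region";
    }
}
```

Because Java assertions are off by default, a check like this costs nothing in production but catches layout mistakes in test runs started with -ea.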

- Generally, have you verified through strace or other means that transferTo is still being used properly for normal transfers? The logic in BlockSender is getting very complicated, and it's hard to verify just by looking at it.

> Concurrent readers hit ChecksumExceptions if following a writer to very end of file
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-1057
>                 URL: https://issues.apache.org/jira/browse/HDFS-1057
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: sam rash
>            Priority: Blocker
>         Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt
>
>
> In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush().
> Therefore, if there is a concurrent reader, it's possible to race here - the reader will see
> the new length while those bytes are still in the buffers of BlockReceiver. Thus the client
> will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the
> file is made accessible to readers even though it is not stable.
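
The ordering problem described above can be illustrated with a toy model. This is not the BlockReceiver code: the class FlushBeforePublishSketch, its fields, and the in-memory "disk" are all hypothetical stand-ins. It shows only the flush-then-publish ordering and does not address the unstable last checksum chunk.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

// Toy writer: 'disk' stands in for the block file, 'out' for the
// receiver's buffered stream, 'bytesOnDisk' for the length readers trust.
final class FlushBeforePublishSketch {
    private final ByteArrayOutputStream disk = new ByteArrayOutputStream();
    private final BufferedOutputStream out = new BufferedOutputStream(disk);
    private final AtomicLong bytesOnDisk = new AtomicLong();

    void receivePacket(byte[] data) throws IOException {
        out.write(data);
        // Flush first, so the bytes have left the writer's buffer...
        out.flush();
        // ...and only then advertise the new length to concurrent readers.
        bytesOnDisk.addAndGet(data.length);
    }

    long getBytesOnDisk() {
        return bytesOnDisk.get();
    }

    int durableSize() {
        return disk.size();
    }
}
```

With the two steps swapped, a reader could observe a length larger than durableSize() and read past the bytes actually written, which matches the checksum errors and EOFs described in the report.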

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

