Return-Path: Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org Received: (qmail 83777 invoked from network); 22 Jun 2010 22:36:14 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 22 Jun 2010 22:36:14 -0000 Received: (qmail 43211 invoked by uid 500); 22 Jun 2010 22:36:14 -0000 Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org Received: (qmail 43149 invoked by uid 500); 22 Jun 2010 22:36:13 -0000 Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: hdfs-issues@hadoop.apache.org Delivered-To: mailing list hdfs-issues@hadoop.apache.org Received: (qmail 43141 invoked by uid 99); 22 Jun 2010 22:36:13 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Jun 2010 22:36:13 +0000 X-ASF-Spam-Status: No, hits=-1537.9 required=10.0 tests=ALL_TRUSTED,AWL X-Spam-Check-By: apache.org Received: from [140.211.11.22] (HELO thor.apache.org) (140.211.11.22) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Jun 2010 22:36:13 +0000 Received: from thor (localhost [127.0.0.1]) by thor.apache.org (8.13.8+Sun/8.13.8) with ESMTP id o5MMZpJs023077 for ; Tue, 22 Jun 2010 22:35:52 GMT Message-ID: <24448530.2331277246151895.JavaMail.jira@thor> Date: Tue, 22 Jun 2010 18:35:51 -0400 (EDT) From: "sam rash (JIRA)" To: hdfs-issues@hadoop.apache.org Subject: [jira] Commented: (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file In-Reply-To: <843484796.396631269216807264.JavaMail.jira@brutus.apache.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881421#action_12881421 ] sam rash commented on HDFS-1057: -------------------------------- i have an updated patch, but it does not yet handle the missing replicas as 0 sized for under construction. there may be other 20 patches to port to make this happen. > Concurrent readers hit ChecksumExceptions if following a writer to very end of file > ----------------------------------------------------------------------------------- > > Key: HDFS-1057 > URL: https://issues.apache.org/jira/browse/HDFS-1057 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: data-node > Affects Versions: 0.20-append, 0.21.0, 0.22.0 > Reporter: Todd Lipcon > Assignee: sam rash > Priority: Blocker > Fix For: 0.20-append > > Attachments: conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt > > > In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.