Message-ID: <4B5DC0BF.6060702@lifeless.net>
Date: Mon, 25 Jan 2010 11:03:11 -0500
From: Eric Sammer
To: hdfs-user@hadoop.apache.org
Subject: Re: HDFS File read issue
In-Reply-To: <000901ca9d8e$ecb01500$2501120a@china.huawei.com>

On 1/25/10 2:20 AM, MOHAMMED IRFANULLA S wrote:
> Hi,
>
> I'm using hadoop 0.20.1. I would appreciate any help on the following
> issue in HDFS.
>
> User1 has created a file file1.txt and started writing to this
> file (Writer thread).
> User2 and User3 try to read from this file, but cannot read anything
> until at least one of the blocks is complete, and they cannot read any
> block under development. (Reader threads)
>
> Is it possible to block/prevent User2 and User3 from reading
> file1.txt completely until the Writer thread calls close()?
> If possible, how to achieve it?

Mohammed:

This is the documented behavior of HDFS with regard to data visibility.
Currently, there is no way to prevent users 2 and 3 from accessing blocks
in your scenario until user 1 finishes writing; you'd have to implement
that at a layer above HDFS.

In theory, one can force a sync by calling FSDataOutputStream#sync(), but
I think that's still buggy (slated for a fix in 0.21.x? See HDFS-200 [1]).
This would trade performance for visibility. I think the alternative,
forcing the readers to block until the writer triggers some event (after
the file is completely written), is a better plan, though.

Hope this helps.

[1] https://issues.apache.org/jira/browse/HDFS-200

-- 
Eric Sammer
eric@lifeless.net
http://esammer.blogspot.com
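
To make the "block the readers until the writer signals completion" idea above a bit more concrete, here is a minimal sketch of one common way to do it at a layer above the filesystem: the writer writes under a temporary name and renames the file to its final name only once it is done, and readers poll for the final name. This is an assumption about how one might implement it, not something HDFS provides for you. The sketch uses java.nio.file on a local directory as a stand-in so that it runs without a cluster; the class name PublishOnClose, the ".inprogress" suffix, and both method names are hypothetical. On HDFS the analogous calls would presumably be FileSystem#create, FileSystem#rename, and FileSystem#exists, and it is worth confirming the rename guarantees of your Hadoop version before relying on this.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

/** Local-filesystem stand-in for a "publish on close" pattern (hypothetical names). */
public class PublishOnClose {

    /** Writer: write everything under a temporary name, then rename to publish. */
    static void writeAndPublish(Path finalPath, List<String> lines) throws IOException {
        Path tmp = finalPath.resolveSibling(finalPath.getFileName() + ".inprogress");
        Files.write(tmp, lines, StandardCharsets.UTF_8);
        // Readers only ever look for finalPath, so this rename acts as the "close" event.
        Files.move(tmp, finalPath, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Reader: poll until the final name exists, or fail once the timeout expires. */
    static List<String> awaitAndRead(Path finalPath, long timeoutMillis, long pollMillis)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!Files.exists(finalPath)) {
            if (System.currentTimeMillis() >= deadline) {
                throw new IOException("timed out waiting for " + finalPath);
            }
            Thread.sleep(pollMillis);
        }
        return Files.readAllLines(finalPath, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("publish-demo");
        final Path target = dir.resolve("file1.txt");

        // Writer thread: simulates a slow write before publishing.
        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(200);
                writeAndPublish(target, Arrays.asList("line 1", "line 2"));
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        // Reader: blocks here until the writer has renamed the file into place.
        List<String> seen = awaitAndRead(target, 5000, 50);
        writer.join();
        System.out.println(seen.size() + " lines read after publish");
    }
}
```

The readers never open the in-progress file, so they cannot observe a partially written block. The cost is that readers must agree on the naming convention and tolerate polling latency; a marker file or an external coordination service would be alternative ways to deliver the same "done" signal.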