hadoop-hdfs-user mailing list archives

From Eric Sammer <e...@lifeless.net>
Subject Re: HDFS File read issue
Date Mon, 25 Jan 2010 16:03:11 GMT
On 1/25/10 2:20 AM, MOHAMMED IRFANULLA S wrote:
> Hi,
> I'm using hadoop 0.20.1. I would appreciate any help on the following
> issue in HDFS.
> User1 has created a file, file1.txt, and started writing to it
> (Writer thread).
> User2 and User3 try to read from this file, but cannot read anything
> until at least one of the blocks is complete, and they cannot read any
> block still under construction (Reader threads).
> Is it possible to prevent User2 and User3 from reading file1.txt
> at all until the Writer thread calls close()?
> If so, how can this be achieved?


This is the documented behavior of HDFS with regard to data visibility.
Currently, there is no way to prevent users 2 and 3 from reading
completed blocks while user 1 is still writing; you'd have to implement
that at a layer above HDFS.

In theory, one can force a sync by calling FSDataOutputStream#sync(), but
I think that's still buggy (slated for a fix in 0.21.x? See
HDFS-200[1]). This would trade performance for visibility. I think the
alternative, forcing the readers to block until the writer triggers some
event (after the file is completely written), is a better plan.
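
As a rough illustration of the "readers block until the writer signals" idea, here is a minimal sketch for the case where the writer and reader threads live in the same JVM. It uses only java.util.concurrent and does not touch HDFS at all. The class name WriteCompletionGate, its methods, and the StringBuilder standing in for file1.txt are all hypothetical, made up for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: holds readers back until the writer reports close(). */
public class WriteCompletionGate {
    // Opens exactly once, when the writer has finished and closed the file.
    private final CountDownLatch closed = new CountDownLatch(1);

    /** The writer calls this after close() on its output stream has returned. */
    public void markClosed() {
        closed.countDown();
    }

    /** A reader calls this before opening the file; returns false on timeout. */
    public boolean awaitClosed(long timeout, TimeUnit unit) throws InterruptedException {
        return closed.await(timeout, unit);
    }

    /** Demo: one writer thread, with the calling thread acting as the reader. */
    public static String demo() throws InterruptedException {
        WriteCompletionGate gate = new WriteCompletionGate();
        StringBuilder file = new StringBuilder(); // stand-in for file1.txt (assumption)

        Thread writer = new Thread(() -> {
            file.append("line 1\n").append("line 2\n"); // writes in progress
            gate.markClosed();                          // signal only after "close()"
        });
        writer.start();

        // The reader does not look at the file until the gate has opened.
        if (!gate.awaitClosed(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("writer did not finish in time");
        }
        return file.toString();
    }
}
```

This only coordinates threads inside one process. If the readers are separate processes or machines, the same idea needs an external signal, for example having the writer write under a temporary name and rename the file to its final name after close(), with readers opening only the final name.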

Hope this helps.

[1] - https://issues.apache.org/jira/browse/HDFS-200

Eric Sammer
