hadoop-hdfs-user mailing list archives

From Gokulakannan M <gok...@huawei.com>
Subject hadoop 0.20 append - some clarifications
Date Thu, 10 Feb 2011 15:11:34 GMT
Hi All,

I have been running the Hadoop 0.20 append branch. Can someone please clarify the
following behavior?

A writer is writing a file but has neither flushed the data nor closed the
file. Can a parallel reader read this partially written file?

For example,

1. A writer is writing a 10 MB file (block size 2 MB).

2. It has written up to 5 MB of the file (2 finalized blocks + 1
blockBeingWritten). Note that the writer never calls FSDataOutputStream sync().

3. Now a reader tries to read the above partially written file.
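
To make the numbers in this example concrete, here is a minimal sketch in plain Java (no Hadoop dependency) of how 5 MB written at a 2 MB block size splits into full blocks plus one partially filled block. The class and method names (BlockLayout, fullBlocks, bytesInOpenBlock) are made up for illustration and are not part of the HDFS API, and the sketch assumes 1 MB = 1024 * 1024 bytes.

```java
// Hypothetical helper for illustration only; not part of Hadoop.
class BlockLayout {
    static final long MB = 1024L * 1024L; // assumption: 1 MB = 1024 * 1024 bytes

    // Number of completely filled blocks for the bytes written so far.
    static long fullBlocks(long bytesWritten, long blockSize) {
        return bytesWritten / blockSize;
    }

    // Bytes in the last, still-open block (0 if the write ended on a block boundary).
    static long bytesInOpenBlock(long bytesWritten, long blockSize) {
        return bytesWritten % blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 2 * MB;
        long written = 5 * MB;
        // prints "full blocks: 2"
        System.out.println("full blocks: " + fullBlocks(written, blockSize));
        // prints "bytes in open block: 1048576"
        System.out.println("bytes in open block: " + bytesInOpenBlock(written, blockSize));
    }
}
```

So the case in question is 4 MB in two full blocks plus 1 MB in the block still being written, none of it explicitly synced by the writer.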

I can see that the reader is able to read the partially written 5 MB of
data, but I feel the reader should be able to see the data only after the
writer calls the sync() API.

Is this the correct behavior, or is my understanding wrong?



