hadoop-common-user mailing list archives

From "Palleti, Pallavi" <pallavi.pall...@corp.aol.com>
Subject RE: File is closed but data is not visible
Date Tue, 11 Aug 2009 16:52:23 GMT
Hi Jason,

Apologies for omitting the version information in my previous mail. I am
using hadoop-0.18.3. I obtain the FSDataOutputStream via
fs.create(new Path(some_file_name)), where fs is a FileSystem object, and
I close the file with close().
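For concreteness, a minimal sketch of the write path described above, against the hadoop-0.18.3-era FileSystem API. The path name and payload are placeholders, and the final length check is one possible client-side sanity test, not something from the original mails:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsLogWriter {
    public static void main(String[] args) throws Exception {
        // Placeholder path and payload; in the real application the bytes
        // come from logs pulled off the external server.
        Path file = new Path("/tmp/some_file_name");
        byte[] payload = "one log line\n".getBytes();

        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        FSDataOutputStream out = fs.create(file);
        try {
            out.write(payload);
        } finally {
            // The assumption under discussion: that data is visible in
            // HDFS as soon as close() returns.
            out.close();
        }

        // One way to catch silent write failures from the client side:
        // compare the length HDFS reports with what was written.
        long len = fs.getFileStatus(file).getLen();
        if (len != payload.length) {
            throw new IOException("wrote " + payload.length
                    + " bytes but HDFS reports " + len);
        }
    }
}
```

This requires the Hadoop client jars on the classpath and a reachable cluster, so it is a sketch of the pattern rather than a standalone program.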


-----Original Message-----
From: Jason Venner [mailto:jason.hadoop@gmail.com] 
Sent: Tuesday, August 11, 2009 6:24 PM
To: common-user@hadoop.apache.org
Subject: Re: File is closed but data is not visible

Please provide information on which version of hadoop you are using and
the method by which you open and close the file.

On Tue, Aug 11, 2009 at 12:48 AM, Pallavi Palleti <
pallavi.palleti@corp.aol.com> wrote:

> Hi all,
> We have an application where we pull logs from an external server (far
> from the hadoop cluster) to the hadoop cluster. Sometimes we see a huge
> delay (of 1 hour or more) before the data actually appears in HDFS, even
> though the file has been closed and the stream variable set to null on
> the external server. I was under the impression that when I close the
> file, the data gets reflected in the hadoop cluster. In this situation,
> it is even more complicated to handle write failures, as the client gets
> the false impression that the data has been written to HDFS. Kindly
> clarify whether my perception is wrong. If yes, could someone tell me
> what is causing the delay in actually seeing the data? In those cases,
> how can we tackle write failures (due to temporary issues like a data
> node not being available, or a disk being full), as there is no way we
> can figure out the failure at the client side?
> Thanks
> Pallavi

Pro Hadoop, a book to guide you from beginner to hadoop mastery,
www.prohadoopbook.com a community for Hadoop Professionals
