hadoop-hdfs-user mailing list archives

From Laxman <lakshman...@huawei.com>
Subject RE: Reader/Writer problem in HDFS
Date Thu, 28 Jul 2011 11:19:43 GMT
No such API as far as I know.
copyFromLocal is the closest such API, but that may not fit your scenario, I guess.
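To illustrate the temporary-name approach from my earlier reply, here is a minimal sketch. It uses plain java.nio against the local filesystem for illustration only; on HDFS the equivalent rename would be FileSystem.rename(), and your reader thread would skip names ending in ".tmp". The class and file names are made up for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TmpRenameSketch {

    // Write content to "<path>.tmp" first, then rename to the final name.
    // A reader that ignores ".tmp" files never sees a partially written file.
    static void writeThenRename(Path path, String content) throws IOException {
        Path tmp = path.resolveSibling(path.getFileName() + ".tmp");
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, path, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("out");
        Path target = dir.resolve("part-00000");
        writeThenRename(target, "xyz");
        // The final file is complete by the time it appears under its real name.
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

The rename is cheap on HDFS because it is a metadata-only operation on the NameNode, which is why this pattern works well for handing off finished files to a consumer.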


-----Original Message-----
From: Meghana [mailto:meghana.marathe@germinait.com] 
Sent: Thursday, July 28, 2011 4:32 PM
To: hdfs-user@hadoop.apache.org; lakshman_ch@huawei.com
Cc: common-user@hadoop.apache.org
Subject: Re: Reader/Writer problem in HDFS

Thanks Laxman! That would definitely help things. :)

Is there a better FileSystem/other method call to create a file in one go
(i.e. atomically, I guess?), without having to call create() and then write to
the stream?


On 28 July 2011 16:12, Laxman <lakshman_ch@huawei.com> wrote:

> One approach can be to use some ".tmp" extension while writing. Once the
> write is completed, rename back to the original file name. Also, the reducer has to
> filter out ".tmp" files.
> This will ensure the reducer will not pick up partial files.
> We have a similar scenario where the above-mentioned approach resolved the issue.
> -----Original Message-----
> From: Meghana [mailto:meghana.marathe@germinait.com]
> Sent: Thursday, July 28, 2011 1:38 PM
> To: common-user; hdfs-user@hadoop.apache.org
> Subject: Reader/Writer problem in HDFS
> Hi,
> We have a job where the map tasks are given the path to an output folder.
> Each map task writes a single file to that folder. There is no reduce
> phase.
> There is another thread, which constantly looks for new files in the
> folder. If found, it persists the contents to index, and deletes the file.
> We use this code in the map task:
> OutputStream oStream = null;
> try {
>    oStream = fileSystem.create(path);
>    IOUtils.write("xyz", oStream);
> } finally {
>    IOUtils.closeQuietly(oStream);
> }
> The problem: Sometimes the reader thread sees & tries to read a file that
> is not yet fully written to HDFS (or whose checksum is not yet written),
> and throws an error. Is it possible to write an HDFS file in such a way that
> it won't be visible until it is fully written?
> We use Hadoop 0.20.203.
> Thanks,
> Meghana
