hadoop-common-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: Hadoop Question
Date Thu, 28 Jul 2011 12:22:27 GMT
How about having the slave write to a temp file first, then, once the file is closed, move it
to the location the master is monitoring?
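
A minimal sketch of that write-then-rename idea, under stated assumptions. To keep it runnable without a cluster it uses java.nio against the local filesystem; on HDFS the corresponding calls would be FileSystem.create(Path) for the write and FileSystem.rename(Path, Path) for the move. The class name TempThenRename, the ".tmp" suffix, and the choice to keep the temp file in the watched folder (rather than in a separate temp directory) are illustrative assumptions, not something from this thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a local-filesystem stand-in for the HDFS pattern.
final class TempThenRename {
    // Assumed naming convention: files still being written carry this suffix.
    static final String TMP_SUFFIX = ".tmp";

    // Writer (slave) side: write to <name>.tmp, close it, then rename into place.
    static Path publish(Path watchedDir, String name, String content) throws IOException {
        Path tmp = watchedDir.resolve(name + TMP_SUFFIX);
        Path done = watchedDir.resolve(name);
        // Files.write opens, writes, and closes the file before returning.
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        // The rename is the "publish" step; readers never see a half-written final file.
        return Files.move(tmp, done, StandardCopyOption.ATOMIC_MOVE);
    }

    // Reader (master) side: list only files that have already been renamed into place.
    static List<Path> readyFiles(Path watchedDir) throws IOException {
        List<Path> ready = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(watchedDir)) {
            for (Path p : entries) {
                String fileName = p.getFileName().toString();
                if (Files.isRegularFile(p) && !fileName.endsWith(TMP_SUFFIX)) {
                    ready.add(p);
                }
            }
        }
        return ready;
    }
}
```

A writer would call TempThenRename.publish(dir, "part-0001", data), and the monitoring process would poll TempThenRename.readyFiles(dir). If the temp files live in a separate directory instead, the suffix filter on the reader side is not needed.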


On Jul 27, 2011, at 22:51, Nitin Khandelwal <nitin.khandelwal@germinait.com> wrote:

> Hi All,
> How can I determine whether a file is being written to (by any thread) in HDFS? I
> have a continuous process on the master node that tracks a particular
> folder in HDFS for files to process. On the slave nodes, I am creating files
> in the same folder using the following code:
> At the slave node:
> import org.apache.commons.io.IOUtils;
> import org.apache.hadoop.fs.FileSystem;
> import java.io.OutputStream;
> OutputStream oStream = fileSystem.create(path);
> IOUtils.write(<Some String>, oStream);
> IOUtils.closeQuietly(oStream);
> At the master node:
> I pick the earliest-modified file in the folder. At times, when I try
> reading the file, I get nothing back, most likely because the slave is
> still writing to it. Is there any way to tell the master that the slave
> is still writing to the file, so that it checks the file again later for
> the actual content?
> Thanks,
> -- 
> Nitin Khandelwal
