hadoop-common-user mailing list archives

From jagaran das <jagaran_...@yahoo.co.in>
Subject Re: HDFS File Appending URGENT
Date Fri, 17 Jun 2011 04:51:51 GMT
Thanks a lot, Xiaobo.

I tried the code below on HDFS version 0.20.20 and it worked.
Is append not stable yet?

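import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HadoopFileWriter {
    public static void main(String[] args) throws Exception {
        URI uri = new URI("hdfs://localhost:9000/Users/jagarandas/Work-Assignment/Analytics/analytics-poc/hadoop-0.20.203.0/data/test.dat");
        Path pt = new Path(uri);
        // Resolve the file system from the URI so the hdfs:// scheme is honored.
        FileSystem fs = FileSystem.get(uri, new Configuration());
        BufferedWriter br = null;
        try {
            if (fs.isFile(pt)) {
                // The file exists: append to it, starting on a new line.
                br = new BufferedWriter(new OutputStreamWriter(fs.append(pt)));
                br.newLine();
            } else {
                // The file does not exist yet: create it (overwrite = true).
                br = new BufferedWriter(new OutputStreamWriter(fs.create(pt, true)));
            }
            String line = args[0];
            System.out.println(line);
            br.write(line);
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("Write to " + pt + " failed");
        } finally {
            // Close the writer even if the write failed.
            if (br != null) {
                br.close();
            }
        }
    }
}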

Thanks a lot for your help.

Regards,
Jagaran 




________________________________
From: Xiaobo Gu <guxiaobo1982@gmail.com>
To: common-user@hadoop.apache.org
Sent: Thu, 16 June, 2011 8:01:14 PM
Subject: Re: HDFS File Appending URGENT

You can merge multiple files into a new one; there is no way to
append to an existing file.
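
As an illustration of the "merge into a new one" approach, here is a minimal sketch of the local merge step, assuming the small files sit on the local disk before being pushed to HDFS. The class name SmallFileMerger, the merge method, and the newline separator are hypothetical choices for this example. It uses only the standard Java library, does not touch HDFS, and is not FileUtil.copyMerge.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: concatenates small local files into one larger file. */
public class SmallFileMerger {

    /**
     * Appends each source file to the target, in list order, and writes a
     * newline after each one. Returns the number of source bytes copied
     * (separators not counted).
     */
    public static long merge(List<Path> sources, Path target) throws IOException {
        long copied = 0;
        // CREATE + APPEND: the target is created if missing and never truncated.
        try (OutputStream out = Files.newOutputStream(
                target, StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (Path source : sources) {
                // Streams the file contents instead of loading them into memory.
                copied += Files.copy(source, out);
                out.write('\n');
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SmallFileMerger <target> <source1> [<source2> ...]
        if (args.length < 2) {
            System.err.println("Usage: SmallFileMerger <target> <source>...");
            return;
        }
        List<Path> sources = new ArrayList<Path>();
        for (int i = 1; i < args.length; i++) {
            sources.add(Paths.get(args[i]));
        }
        long copied = merge(sources, Paths.get(args[0]));
        System.out.println("Merged " + sources.size() + " files, " + copied + " bytes");
    }
}
```

The merged file can then be uploaded in a single step, for example with hadoop fs -put. Because the target is opened in append mode, running the merge again adds to the same local file instead of replacing it.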

On Fri, Jun 17, 2011 at 10:29 AM, jagaran das <jagaran_das@yahoo.co.in> wrote:
> Is that the Hadoop 0.20.203.0 API?
>
> Does that mean files in HDFS version 0.20.20 are still immutable,
> and that there is no way to append to an existing file in HDFS?
>
> We need to resolve this urgently, as we have to set up the production
> pipeline accordingly.
>
> Regards,
> Jagaran
>
>
>
> ________________________________
> From: Xiaobo Gu <guxiaobo1982@gmail.com>
> To: common-user@hadoop.apache.org
> Sent: Thu, 16 June, 2011 6:26:45 PM
> Subject: Re: HDFS File Appending
>
> Please refer to FileUtil.copyMerge.
>
> On Fri, Jun 17, 2011 at 8:33 AM, jagaran das <jagaran_das@yahoo.co.in> wrote:
>> Hi,
>>
>> We have a requirement where a huge number of small files will be pushed to
>> HDFS and then analyzed with Pig.
>> To get around the classic "Small File Issue", we merge the files and push
>> one bigger file into HDFS.
>> But we are losing time in this merging step of our pipeline.
>>
>> If we could append directly to an existing file in HDFS, we could save this
>> "Merging Files" time.
>>
>> Can you please suggest whether there is a newer stable version of Hadoop that
>> we can move to for appending?
>>
>> Thanks and Regards,
>> Jagaran
>
