hadoop-common-user mailing list archives

From Amar Kamat <ama...@yahoo-inc.com>
Subject Re: Splitting in various files
Date Mon, 21 Apr 2008 06:33:01 GMT
Aayush Garg wrote:
> Could anyone please tell?
>
> On Sat, Apr 19, 2008 at 1:33 PM, Aayush Garg <aayush.garg@gmail.com> wrote:
>
>   
>> Hi,
>>
>> I have written the following code for writing my key,value pairs in the
>> file, and this file is then read by another MR.
>>
>>    Path pth = new Path("./dir1/dir2/filename");
>>    FileSystem fs = pth.getFileSystem(jobconf);
>>    SequenceFile.Writer sqwrite = new
>> SequenceFile.Writer(fs,conf,pth,Text.class,Custom.class);
>>    sqwrite.append(Key,value);
>>    sqwrite.close();
>>
>> The problem is that all my data gets written to one file (filename). How can it be
>> split across a number of files? If I give only the path of the directory in
>>     
What do you mean by splitting a file across multiple files? If you want 
a separate file for each map/reduce task, you can call 
conf.get("mapred.task.id") to get the task id, which is unique to that 
task, and append it to the file name:

Path pth = new Path("./dir1/dir2/" + filename + "-" + conf.get("mapred.task.id"));
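
To make the naming scheme above concrete, here is a minimal sketch in plain
Java, with no Hadoop classes so that it compiles on its own. The class
TaskFileNames, the helper perTaskPath, and the fallback for a missing task id
are hypothetical names and choices made for illustration only. In a real job
the taskId argument would be the value of conf.get("mapred.task.id") and the
returned string would be passed to new Path(...).

```java
// Sketch only: builds the per-task file name as a plain string.
class TaskFileNames {

    // Returns "<dir>/<baseName>-<taskId>". Falling back to the bare base
    // name when taskId is null or empty is an assumption of this sketch.
    static String perTaskPath(String dir, String baseName, String taskId) {
        String name = (taskId == null || taskId.isEmpty())
                ? baseName
                : baseName + "-" + taskId;
        // Avoid a doubled separator when dir already ends with '/'.
        return dir.endsWith("/") ? dir + name : dir + "/" + name;
    }
}
```

With a made-up id such as task_0001_m_000003, the call
perTaskPath("./dir1/dir2", "filename", id) gives
./dir1/dir2/filename-task_0001_m_000003. Every task then writes its own file
under ./dir1/dir2, so a later job that is given only that directory as input
should pick up all of them.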

Amar
>> this program then it does not compile.
>>
>> I give only the path of directory /dir1/dir2 to another Map Reduce and it
>> reads the file.
>>
>> Thanks,
>>
>>
>>     
>
>
>   

