hadoop-common-user mailing list archives

From David Hawthorne <dha...@3crowd.com>
Subject Re: map.input.file in 20.1
Date Tue, 13 Jul 2010 01:01:18 GMT
That was actually one of the things I tried before, but after some more digging I found that
I was using the old FileSplit class (org.apache.hadoop.mapred.FileSplit) instead of the new
one (org.apache.hadoop.mapreduce.lib.input.FileSplit).  Working code, in case other people
need it:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;				// not org.apache.hadoop.mapred.FileSplit
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

public class Foo {

        public static class FooMapper extends Mapper<Object, Text, Text, IntWritable> {

                // Holds the input file for this map task; "default" until setup() runs.
                Text input_file = new Text("default");

                @Override
                public void setup (Context context) {

                        Configuration conf = context.getConfiguration();

                        // Requires the new-API FileSplit (mapreduce.lib.input), not the mapred one.
                        FileSplit fileSplit = (FileSplit) context.getInputSplit();
                        String sFileName = fileSplit.getPath().toString();

                        input_file.set(sFileName);

                        //
                        // Just the filename portion:
                        // String sFileName = fileSplit.getPath().getName();
                        //
                }
        }
}
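
For anyone unsure which of the two forms they want: as far as I know, getPath().toString()
typically gives the whole location, scheme and host included, while getPath().getName() gives
only the last path component. The sketch below mimics that with nothing but the JDK, so it
runs without a cluster. SplitNameDemo, fileName() and the hdfs:// location are made up for
illustration and are not part of Hadoop; it also assumes the location has no trailing slash.

```java
import java.net.URI;

// Stand-alone illustration using only the JDK; no Hadoop classes are involved.
class SplitNameDemo {

    // Returns the last path component of a location string, which is the
    // part Hadoop's Path.getName() is expected to return for the same location.
    // Assumes the location does not end with a slash.
    static String fileName(String location) {
        String path = URI.create(location).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // Hypothetical split location, invented for this example.
        String full = "hdfs://namenode:8020/logs/2010/07/12/part-00000";

        // Analogous to fileSplit.getPath().toString()
        System.out.println(full);

        // Analogous to fileSplit.getPath().getName()
        System.out.println(fileName(full));
    }
}
```

If the reducer only needs to group by file, the short name is usually enough; keep the full
form when the same file name can appear in more than one input directory.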


On Jul 12, 2010, at 4:18 PM, Ted Yu wrote:

> How about:
> FileSplit fileSplit = (FileSplit) context.getInputSplit();
> String sFileName = fileSplit.getPath().getName();
> 
> On Mon, Jul 12, 2010 at 2:56 PM, David Hawthorne <dhawth@3crowd.com> wrote:
> 
>> I'm trying to get the name of the file that the map job is operating on out
>> of the Context passed to the setup function.  It's proving harder than seems
>> proper.
>> 
>> I've found several links via google on this topic, but I've seen no
>> responses to previous questions.
>> 
>> We have this from July 17, 2009:
>> 
>> http://www.mail-archive.com/common-user@hadoop.apache.org/msg00535.html
>> 
>> I attempted that solution and javac complained about using a deprecated
>> API.
>> 
>> It's very clearly spelled out in this doc:
>> 
>> http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html
>> 
>> and yet the example source code for 20.1 is still using the mapred.*
>> (deprecated) API that the prior link used as well.
>> 
>> For the record, here's what I've tried, in the hopes that someone will just
>> paste back a working solution:
>> 
>> import java.io.IOException;
>> 
>> import org.apache.hadoop.fs.Path;
>> import org.apache.hadoop.io.Text;
>> import org.apache.hadoop.io.IntWritable;
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.util.GenericOptionsParser;
>> import org.apache.hadoop.mapreduce.Job;
>> import org.apache.hadoop.mapreduce.Mapper;
>> import org.apache.hadoop.mapreduce.Reducer;
>> import org.apache.hadoop.mapreduce.RecordWriter;
>> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
>> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
>> import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;
>> import org.apache.hadoop.mapred.FileSplit;
>> 
>> public class Foo {
>>       public static class FooMapper extends Mapper<Object, Text, Text, IntWritable> {
>> 
>>               private org.apache.hadoop.io.Text input_file;
>> 
>>               public void setup (Context context) {
>>                       Configuration conf = context.getConfiguration();
>> 
>>                       //
>>                       // fails to compile due to use of deprecated mapred API:
>>                       //
>>                       FileSplit fileSplit = (FileSplit)context.getInputSplit();
>>                       String input_fname = fileSplit.getPath().toString();
>>                       input_file.set(input_fname);
>> 
>>                       //
>>                       // results in null pointer exception because conf.get returns null:
>>                       //
>>                       // input_file.set(conf.get("map.input.file"));
>>               }
>>       }
>> }
>> 
>> 

