hadoop-mapreduce-user mailing list archives

From Kunal Gupta <ku...@techlead-india.com>
Subject Re: How to write a custom input format and record reader to read multiple lines of text from files
Date Tue, 01 Dec 2009 06:57:49 GMT
Can you kindly guide me on what initialisation I need to do in the
constructor of the implemented class, MultiLineFileInputFormat?

I was following the sample provided on this Yahoo page:

http://developer.yahoo.com/hadoop/tutorial/module5.html#fileformat
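
[Editor's note] The stack trace quoted below names the class as
CustomRecordReader$MultiLineFileInputFormat, which means it is declared
inside CustomRecordReader. If it is declared without the static keyword
(an assumption, since the declaration is elided in the message), then it
is an inner class, and its constructor implicitly takes the enclosing
instance as a parameter. The trace shows ReflectionUtils.newInstance
calling Class.getDeclaredConstructor, which would then fail with
NoSuchMethodException even though no constructor was written by hand.
The sketch below reproduces that effect with plain Java reflection and
no Hadoop dependency; the class names are hypothetical stand-ins.

```java
import java.lang.reflect.Constructor;

public class NoArgConstructorDemo {

    // Inner (non-static) class: its implicit constructor takes the
    // enclosing NoArgConstructorDemo instance as a hidden parameter.
    class InnerFormat {}

    // Static nested class: its implicit constructor takes no parameters.
    static class StaticFormat {}

    // Tries to instantiate clazz through a zero-parameter constructor,
    // which is what the quoted trace shows happening via
    // Class.getDeclaredConstructor. Returns false if no such constructor exists.
    static boolean hasNoArgConstructor(Class<?> clazz) {
        try {
            Constructor<?> ctor = clazz.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("inner:  " + hasNoArgConstructor(InnerFormat.class));  // prints "inner:  false"
        System.out.println("static: " + hasNoArgConstructor(StaticFormat.class)); // prints "static: true"
    }
}
```

Under that assumption, no initialisation is needed in the constructor at
all: declaring the nested class as "public static class", or moving it
to its own top-level file, should be enough for it to be instantiated
reflectively.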




On Tue, 2009-12-01 at 06:45 +0000, Sean Owen wrote:
> It sounds like you have not provided a no-arg constructor in
> MultiLineFileInputFormat.
> 
> On Tue, Dec 1, 2009 at 6:17 AM, Kunal Gupta <kunal@techlead-india.com> wrote:
> > Can someone explain how to override the "FileInputFormat" and
> > "RecordReader" in order to be able to read multiple lines of text from
> > input files in a single map task?
> >
> > Here the key will be the offset of the first line of text and value will
> > be the N lines of text.
> >
> > I have overridden the class FileInputFormat:
> >
> > public class MultiLineFileInputFormat
> >        extends FileInputFormat<LongWritable, Text>{
> > ...
> > }
> >
> > and implemented the abstract method:
> >
> > public RecordReader<LongWritable, Text> createRecordReader(
> >                InputSplit split, TaskAttemptContext context)
> >         throws IOException, InterruptedException {...}
> >
> > I have also overridden the recordreader class:
> >
> > public class MultiLineFileRecordReader extends
> > RecordReader<LongWritable, Text>
> > {...}
> >
> > and in the job configuration, specified this new InputFormat class:
> >
> > job.setInputFormatClass(MultiLineFileInputFormat.class);
> >
> > --------------------------------------------------------------------------
> > When I run this new map/reduce program, I get the following Java error:
> > --------------------------------------------------------------------------
> > Exception in thread "main" java.lang.RuntimeException: java.lang.NoSuchMethodException: CustomRecordReader$MultiLineFileInputFormat.<init>()
> >        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)
> >        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
> >        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
> >        at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
> >        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
> >        at CustomRecordReader.main(CustomRecordReader.java:257)
> > Caused by: java.lang.NoSuchMethodException: CustomRecordReader$MultiLineFileInputFormat.<init>()
> >        at java.lang.Class.getConstructor0(Class.java:2706)
> >        at java.lang.Class.getDeclaredConstructor(Class.java:1985)
> >        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:109)
> >        ... 5 more
> >
> >
> 
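
[Editor's note] The record layout described in the original question
(key = offset of the first line, value = the N lines of text) can be
sketched independently of Hadoop. The example below is only an
illustration of that grouping logic over an in-memory byte array; the
class and method names are hypothetical, offsets are taken to be byte
offsets, and it does not address input splits that begin or end in the
middle of a record, which a real RecordReader would have to handle.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MultiLineGrouping {

    // Groups newline-terminated lines into records of linesPerRecord lines.
    // Each entry's key is the byte offset of the record's first line and
    // its value is the record text without the final newline.
    static List<Map.Entry<Long, String>> groupLines(byte[] data, int linesPerRecord) {
        if (linesPerRecord < 1) {
            throw new IllegalArgumentException("linesPerRecord must be >= 1");
        }
        List<Map.Entry<Long, String>> records = new ArrayList<>();
        int recordStart = 0;   // byte offset where the current record begins
        int linesInRecord = 0; // complete lines seen so far in the current record
        for (int pos = 0; pos < data.length; pos++) {
            if (data[pos] == '\n') {
                linesInRecord++;
                if (linesInRecord == linesPerRecord) {
                    records.add(entry(data, recordStart, pos));
                    recordStart = pos + 1;
                    linesInRecord = 0;
                }
            }
        }
        // Emit a shorter final record if the input does not divide evenly.
        if (recordStart < data.length) {
            int end = data.length;
            if (data[end - 1] == '\n') {
                end--;
            }
            records.add(entry(data, recordStart, end));
        }
        return records;
    }

    private static Map.Entry<Long, String> entry(byte[] data, int start, int end) {
        String text = new String(data, start, end - start, StandardCharsets.UTF_8);
        return new SimpleEntry<>((long) start, text);
    }

    public static void main(String[] args) {
        byte[] input = "a\nbb\nccc\ndddd\ne\n".getBytes(StandardCharsets.UTF_8);
        for (Map.Entry<Long, String> record : groupLines(input, 2)) {
            // Keys printed for this input: 0, 5, 14
            System.out.println(record.getKey() + " -> " + record.getValue().replace("\n", "\\n"));
        }
    }
}
```

In a Hadoop job the same bookkeeping would presumably live in the
nextKeyValue() method of MultiLineFileRecordReader, with the key stored
in a LongWritable and the value in a Text.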

