hadoop-mapreduce-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: How to write a custom input format and record reader to read multiple lines of text from files
Date Tue, 01 Dec 2009 06:45:24 GMT
It sounds like you have not provided a no-arg constructor in
MultiLineFileInputFormat.
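
One possible cause, judging only from the class name in the trace (CustomRecordReader$MultiLineFileInputFormat): the input format may be declared as a non-static inner class of CustomRecordReader. That is an assumption, since the full source is not shown. If it holds, the compiler gives the constructor a hidden outer-instance parameter, so the getDeclaredConstructor() lookup seen in the trace finds no no-arg constructor. The sketch below uses plain Java without Hadoop; InnerFormat, NestedFormat and hasNoArgConstructor are made-up names for illustration.

```java
public class NoArgConstructorDemo {

    // Non-static inner class: its implicit constructor takes the enclosing
    // instance as a hidden parameter, so it has no true no-arg constructor.
    class InnerFormat {}

    // Static nested class: its implicit constructor takes no parameters.
    static class NestedFormat {}

    // Returns true if cls declares a constructor with an empty parameter
    // list, using the same reflective lookup that appears in the stack trace.
    static boolean hasNoArgConstructor(Class<?> cls) {
        try {
            cls.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("InnerFormat:  " + hasNoArgConstructor(InnerFormat.class));   // false
        System.out.println("NestedFormat: " + hasNoArgConstructor(NestedFormat.class));  // true
    }
}
```

If that is what is happening here, declaring the nested class as "public static class MultiLineFileInputFormat", or moving it to its own top-level file, should let it be instantiated reflectively.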

On Tue, Dec 1, 2009 at 6:17 AM, Kunal Gupta <kunal@techlead-india.com> wrote:
> Can someone explain how to override the "FileInputFormat" and
> "RecordReader" in order to be able to read multiple lines of text from
> input files in a single map task?
>
> Here the key will be the offset of the first line of text and value will
> be the N lines of text.
>
> I have overridden the class FileInputFormat:
>
> public class MultiLineFileInputFormat
>        extends FileInputFormat<LongWritable, Text>{
> ...
> }
>
> and implemented the abstract method:
>
> public RecordReader<LongWritable, Text> createRecordReader(InputSplit split,
>                TaskAttemptContext context)
>         throws IOException, InterruptedException {...}
>
> I have also overridden the recordreader class:
>
> public class MultiLineFileRecordReader extends
> RecordReader<LongWritable, Text>
> {...}
>
> and in the job configuration, specified this new InputFormat class:
>
> job.setInputFormatClass(MultiLineFileInputFormat.class);
>
> --------------------------------------------------------------------------
> When I run this new map/reduce program, I get the following Java error:
> --------------------------------------------------------------------------
> Exception in thread "main" java.lang.RuntimeException: java.lang.NoSuchMethodException: CustomRecordReader$MultiLineFileInputFormat.<init>()
>        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)
>        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
>        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
>        at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>        at CustomRecordReader.main(CustomRecordReader.java:257)
> Caused by: java.lang.NoSuchMethodException: CustomRecordReader$MultiLineFileInputFormat.<init>()
>        at java.lang.Class.getConstructor0(Class.java:2706)
>        at java.lang.Class.getDeclaredConstructor(Class.java:1985)
>        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:109)
>        ... 5 more
>
>
