hadoop-mapreduce-user mailing list archives

From Aaron Kimball <aa...@cloudera.com>
Subject Re: streaming with custom input format
Date Mon, 19 Oct 2009 20:09:21 GMT
Keith,

In 0.20, a new MapReduce API was introduced alongside the older one.
It seems as though streaming requires the old interface
(org.apache.hadoop.mapred.InputFormat), and doesn't work with the new one
(org.apache.hadoop.mapreduce.InputFormat). You should file a bug report on
http://issues.apache.org/jira/browse/MAPREDUCE
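
To illustrate why the error appears, here is a minimal sketch of the kind of
type check that is probably behind the message. The two nested types below are
stand-ins I made up for this example, not the real Hadoop classes: one plays
the role of the old org.apache.hadoop.mapred.InputFormat interface, the other
the new org.apache.hadoop.mapreduce.InputFormat abstract class. The point is
only that the two types are unrelated, so a class built on one can never
satisfy a check against the other.

```java
public class ApiCheckDemo {
    // Stand-in for the old-API interface (NOT the real Hadoop type).
    public interface OldInputFormat {}

    // Stand-in for the new-API abstract class (NOT the real Hadoop type).
    public static abstract class NewInputFormat {}

    // A format written against the new API, like the one in the question.
    public static class NewApiFormat extends NewInputFormat {}

    // A format written against the old API.
    public static class OldApiFormat implements OldInputFormat {}

    // Assumed shape of the check: is the candidate assignable to the old type?
    public static boolean acceptedByOldApi(Class<?> candidate) {
        return OldInputFormat.class.isAssignableFrom(candidate);
    }

    public static void main(String[] args) {
        System.out.println("new-API format accepted: "
                + acceptedByOldApi(NewApiFormat.class));
        System.out.println("old-API format accepted: "
                + acceptedByOldApi(OldApiFormat.class));
    }
}
```

Sharing a simple name (InputFormat, TextInputFormat) does not help here; only
the inheritance chain matters.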

As a workaround in the meantime, you could implement the
org.apache.hadoop.mapred.InputFormat interface by extending
org.apache.hadoop.mapred.TextInputFormat. You'd also need to implement the
old-API RecordReader interface (org.apache.hadoop.mapred.RecordReader). I'm
not sure how much change this would really require in your code.
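
For a rough idea of what the port involves: in the old API the method to
override is getRecordReader(InputSplit, JobConf, Reporter) rather than
createRecordReader, and the reader fills caller-supplied key and value objects
through next(key, value) instead of handing back its own. The sketch below
shows only that reader contract. It runs without Hadoop on the classpath, so
OldStyleRecordReader is a stand-in that mirrors a subset of the old interface,
and the FASTA parsing (one record per '>' header plus its sequence lines) is
my guess at what your FASTARecordReader does, not your actual code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class OldStyleReaderSketch {
    // Stand-in mirroring part of the old-API RecordReader (NOT the Hadoop type).
    public interface OldStyleRecordReader<K, V> {
        K createKey();
        V createValue();
        boolean next(K key, V value) throws IOException;
    }

    // Hypothetical reader: key = header text, value = concatenated sequence.
    public static class FastaReader
            implements OldStyleRecordReader<StringBuilder, StringBuilder> {
        private final BufferedReader in;
        private String pendingHeader; // next '>' line already read, or null

        public FastaReader(BufferedReader in) throws IOException {
            this.in = in;
            String line = in.readLine();
            // Skip anything before the first header line.
            while (line != null && !line.startsWith(">")) {
                line = in.readLine();
            }
            pendingHeader = line;
        }

        public StringBuilder createKey() { return new StringBuilder(); }

        public StringBuilder createValue() { return new StringBuilder(); }

        // Fills the caller's objects; returns false once input is exhausted.
        public boolean next(StringBuilder key, StringBuilder value)
                throws IOException {
            if (pendingHeader == null) {
                return false;
            }
            key.setLength(0);
            value.setLength(0);
            key.append(pendingHeader.substring(1));
            pendingHeader = null;
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith(">")) {
                    pendingHeader = line; // start of the following record
                    break;
                }
                value.append(line.trim());
            }
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        FastaReader reader = new FastaReader(
                new BufferedReader(new StringReader(">a\nAC\nGT\n>b\nTT\n")));
        StringBuilder key = reader.createKey();
        StringBuilder value = reader.createValue();
        while (reader.next(key, value)) {
            System.out.println(key + "\t" + value);
        }
    }
}
```

The real interface also declares getPos(), getProgress() and close(), and
TextInputFormat's key type is LongWritable, so a real port would carry those
over as well.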

The incompatibilities between the APIs are subtle and unfortunate; sorry for
the confusion you're having.
- Aaron

On Sat, Oct 17, 2009 at 7:27 PM, Keith Jackson <krjackson@lbl.gov> wrote:

> Hi,
> I've written a custom input format to use with streaming, but I'm having
> trouble making it work. I pass in -inputformat <input format class> and I
> get the following error:
> Exception in thread "main" java.lang.RuntimeException: class
> gov.lbl.acs.FASTAInputFormat not org.apache.hadoop.mapred.InputFormat
>
> I'm using hadoop-0.20.1.
>
> My input format class looks like:
> package gov.lbl.acs;
>
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapreduce.InputSplit;
> import org.apache.hadoop.mapreduce.RecordReader;
> import org.apache.hadoop.mapreduce.TaskAttemptContext;
> import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
>
> public class FASTAInputFormat extends TextInputFormat {
>
>    @Override
>    public RecordReader<LongWritable, Text> createRecordReader(
>            InputSplit inputSplit, TaskAttemptContext taskAttemptContext) {
>        return new FASTARecordReader();
>    }
> }
>
> I'm puzzled as to what I'm doing wrong. Any help would be greatly
> appreciated.
> thx,
> --keith
>
> --------------------------------------------------------------------------------------------------------
> Keith R. Jackson                                     email:
> KRJackson@lbl.gov
> MS: 50B-2239                                         phone: 510-486-4401
> Lawrence Berkeley National Lab        url:
> http://www-itg.lbl.gov/~kjackson/ <http://www-itg.lbl.gov/%7Ekjackson/>
>
> ----------------------------------------------------------------------------------------------------------
>
>
>
