avro-user mailing list archives

From Something Something <mailinglist...@gmail.com>
Subject Re: Reading AVRO files in Java MapReduce
Date Thu, 17 Nov 2011 00:00:12 GMT
Why might this not be working?  Any ideas?  I even added a 'throw new
RuntimeException' to check whether execution reaches the Mapper, but it
doesn't.  Thanks for the help.

Code snippet:

    public static class MapImpl extends AvroMapper<Utf8, Pair<Utf8, Long>> {
        public void map(Utf8 text, AvroCollector<Pair<Utf8, Long>> collector,
                        Reporter reporter) throws IOException {
            // Temporary check: fail loudly if the mapper is ever invoked.
            throw new RuntimeException("my text: " + text.toString());
//            System.out.println("my text" + text);
//            collector.collect(new Pair<Utf8, Long>(text, 1L));
        }
    }

    private static class NonAvroReducer
            extends MapReduceBase
            implements Reducer<AvroKey<Utf8>, AvroValue<Long>, Text, Text> {
        public void reduce(AvroKey<Utf8> key, Iterator<AvroValue<Long>> values,
                           OutputCollector<Text, Text> out,
                           Reporter reporter) throws IOException {
            out.collect(new Text(key.toString()),
                        new Text("Testing"));
            // Emit one text line per value received for this key.
            while (values.hasNext()) {
                AvroValue<Long> value = values.next();
                out.collect(new Text(key.toString()),
                            new Text(value.datum().toString()));
            }
        }
    }


    public static void main(String[] args) throws Exception {
        String dir = "/user/mydir";
        JobConf job = new JobConf(new Configuration(), TestAvroProcessor.
        Path outputPath = new Path(dir + "/out");
        AvroJob.setInputSchema(job, Schema.parse(new File(
        AvroJob.setOutputSchema(job, SCHEMA);
        AvroJob.setMapperClass(job, MapImpl.class);
        FileInputFormat.setInputPaths(job, new Path(dir + "/data"));
        FileOutputFormat.setOutputPath(job, outputPath);
        FileOutputFormat.setCompressOutput(job, false);

On Tue, Nov 15, 2011 at 5:12 PM, Doug Cutting <cutting@apache.org> wrote:

> On 11/15/2011 03:16 PM, Something Something wrote:
> > Quick question.  I want the output from AvroJob.setReducerClass to be in
> > regular Text files - not in AVRO format.  Can I do that?  Any examples?
> >  Sorry, kinda short on time to do research.  Thanks.
> On the previously cited documentation page:
> http://avro.apache.org/docs/current/api/java/org/apache/avro/mapred/package-summary.html
> Look for the text, "For jobs whose input is an Avro data file and which
> use an AvroMapper, but whose reducer is a non-Avro Reducer and whose
> output is a non-Avro format".
> A sample of a job that does this is at:
>  http://s.apache.org/MsG
> Just use TextOutputFormat instead of SequenceFileOutputFormat.
> Doug
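
For reference, the setup described in the reply above (Avro input, an AvroMapper, a plain Hadoop Reducer, text output) might be wired up roughly as follows. This is only a sketch under assumptions, not the linked sample: the class name AvroToTextJob, the reducer name TextReducer, the job name, the use of args[0] and args[1] for paths, and the assumption that each input record is a plain Avro string are all hypothetical. It uses AvroJob.setMapOutputSchema for the mapper's Pair output and sets the reducer, output types, and TextOutputFormat directly on the JobConf.

```java
import java.io.IOException;
import java.util.Iterator;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroMapper;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapred.Pair;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextOutputFormat;

// Hypothetical driver class; all names here are illustrative.
public class AvroToTextJob {

    // Assumption: the input file holds plain Avro strings.
    static final Schema INPUT_SCHEMA = Schema.create(Schema.Type.STRING);

    // Schema of the (string, long) pairs the mapper emits.
    static final Schema MAP_OUTPUT_SCHEMA = Pair.getPairSchema(
            Schema.create(Schema.Type.STRING),
            Schema.create(Schema.Type.LONG));

    public static class MapImpl extends AvroMapper<Utf8, Pair<Utf8, Long>> {
        @Override
        public void map(Utf8 text, AvroCollector<Pair<Utf8, Long>> collector,
                        Reporter reporter) throws IOException {
            collector.collect(new Pair<Utf8, Long>(text, 1L));
        }
    }

    // A regular Hadoop reducer: Avro-wrapped input, Text output.
    public static class TextReducer extends MapReduceBase
            implements Reducer<AvroKey<Utf8>, AvroValue<Long>, Text, Text> {
        @Override
        public void reduce(AvroKey<Utf8> key, Iterator<AvroValue<Long>> values,
                           OutputCollector<Text, Text> out,
                           Reporter reporter) throws IOException {
            long sum = 0;
            while (values.hasNext()) {
                sum += values.next().datum();
            }
            out.collect(new Text(key.datum().toString()),
                        new Text(Long.toString(sum)));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf job = new JobConf(AvroToTextJob.class);
        job.setJobName("avro-to-text");

        // Avro side: input schema, mapper, and the map output (shuffle) schema.
        AvroJob.setInputSchema(job, INPUT_SCHEMA);
        AvroJob.setMapperClass(job, MapImpl.class);
        AvroJob.setMapOutputSchema(job, MAP_OUTPUT_SCHEMA);

        // Non-Avro side: set the reducer and output format on the JobConf itself.
        job.setReducerClass(TextReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        JobClient.runJob(job);
    }
}
```

Compiling and running this needs the Avro mapred and Hadoop jars on the classpath. With a real input schema, INPUT_SCHEMA would be replaced by the parsed schema and the mapper's input type adjusted to match.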
