avro-user mailing list archives

From Something Something <mailinglist...@gmail.com>
Subject Re: Reading AVRO files in Java MapReduce
Date Thu, 17 Nov 2011 07:20:32 GMT
Sorry.  Don't worry about this for now.  Made progress.  The input AVRO
file was bad.  Thanks.

On Wed, Nov 16, 2011 at 11:13 PM, Sudharsan Sampath <sudhan65@gmail.com> wrote:

> What do the output logs from your job show?
>
>
> On Thu, Nov 17, 2011 at 5:30 AM, Something Something <
> mailinglists19@gmail.com> wrote:
>
>> Why could this not be working?  Any ideas?  I even put a 'throw new
>> RuntimeException' to see if it's coming to the Mapper, but it isn't.
>>  Thanks for the help.
>>
>> Code snippet:
>>
>>
>>
>>     public static class MapImpl extends AvroMapper<Utf8, Pair<Utf8, Long>> {
>>         @Override
>>         public void map(Utf8 text, AvroCollector<Pair<Utf8, Long>> collector,
>>                         Reporter reporter) throws IOException {
>>             throw new RuntimeException("my text: " + text.toString());
>> //            System.out.println("my text" + text);
>> //            collector.collect(new Pair<Utf8, Long>(text, 1L));
>>         }
>>     }
>>
>>
>>
>>     private static class NonAvroReducer
>>             extends MapReduceBase
>>             implements Reducer<AvroKey<Utf8>, AvroValue<Long>, Text, Text> {
>>
>>         public void reduce(AvroKey<Utf8> key, Iterator<AvroValue<Long>> values,
>>                            OutputCollector<Text, Text> out,
>>                            Reporter reporter) throws IOException {
>>             out.collect(new Text(key.toString()),
>>                         new Text("Testing"));
>>             while (values.hasNext()) {
>>                 AvroValue<Long> value = values.next();
>>                 out.collect(new Text(key.toString()),
>>                             new Text(value.datum().toString()));
>>             }
>>         }
>>     }
>>
>>
>>     public static void main(String[] args) throws Exception {
>>         String dir = "/user/mydir";
>>
>>         JobConf job = new JobConf(new Configuration(), TestAvroProcessor.class);
>>         job.setJobName(TestAvroProcessor.class.getName());
>>
>>         Path outputPath = new Path(dir + "/out");
>>         outputPath.getFileSystem(job).delete(outputPath);
>>
>>         AvroJob.setInputSchema(job, Schema.parse(new File("/Users/mydir/profiles.json")));
>>         AvroJob.setOutputSchema(job, SCHEMA);
>>         AvroJob.setMapperClass(job, MapImpl.class);
>>
>>         FileInputFormat.setInputPaths(job, new Path(dir + "/data"));
>>         FileOutputFormat.setOutputPath(job, outputPath);
>>         FileOutputFormat.setCompressOutput(job, false);
>>
>>         job.setReducerClass(NonAvroReducer.class);
>>         job.setOutputFormat(TextOutputFormat.class);
>>         job.setOutputKeyClass(Text.class);
>>         job.setOutputValueClass(Text.class);
>>
>>         JobClient.runJob(job);
>>     }
>>
>>
>>
>>
>> On Tue, Nov 15, 2011 at 5:12 PM, Doug Cutting <cutting@apache.org> wrote:
>>
>>> On 11/15/2011 03:16 PM, Something Something wrote:
>>> > Quick question.  I want the output from AvroJob.setReducerClass to be
>>> in
>>> > regular Text files - not in AVRO format.  Can I do that?  Any examples?
>>> >  Sorry, kinda short on time to do research.  Thanks.
>>>
>>> On the previously cited documentation page:
>>>
>>>
>>> http://avro.apache.org/docs/current/api/java/org/apache/avro/mapred/package-summary.html
>>>
>>> Look for the text, "For jobs whose input is an Avro data file and which
>>> use an AvroMapper, but whose reducer is a non-Avro Reducer and whose
>>> output is a non-Avro format".
>>>
>>> A sample of a job that does this is at:
>>>
>>>  http://s.apache.org/MsG
>>>
>>> Just use TextOutputFormat instead of SequenceFileOutputFormat.
>>>
>>> Doug
>>>
>>
>>
>
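
Since the problem in this thread turned out to be a bad input file, a cheap pre-flight check on the input may save a debugging round. The following is only a minimal sketch, not something posted in the thread: it assumes the input has been copied to the local filesystem, and the class name AvroMagicCheck and method looksLikeAvroContainer are hypothetical names chosen for illustration. It relies on the fact that an Avro object container file begins with the four bytes 'O', 'b', 'j', 0x01. It will only catch files that are not Avro container files at all; it does not detect a wrong schema or corrupted data blocks.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not part of Avro or Hadoop): checks the file header only.
final class AvroMagicCheck {
    // Avro object container files start with the bytes 'O', 'b', 'j', 0x01.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // Returns true if the first four bytes of the local file match the Avro magic.
    public static boolean looksLikeAvroContainer(Path file) throws IOException {
        byte[] header = new byte[MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (filled < header.length) {
                int n = in.read(header, filled, header.length - filled);
                if (n < 0) {
                    break; // file is shorter than the magic header
                }
                filled += n;
            }
        }
        return filled == MAGIC.length && Arrays.equals(header, MAGIC);
    }

    // Usage: java AvroMagicCheck <local-file> [<local-file> ...]
    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            Path p = Paths.get(arg);
            String verdict = looksLikeAvroContainer(p)
                    ? "has Avro magic bytes"
                    : "is NOT an Avro container file";
            System.out.println(p + " " + verdict);
        }
    }
}
```

A file that passes this check can still be unreadable by the job, so a failure here is informative but a pass is not a guarantee.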
