avro-dev mailing list archives

From Iván de Prado (JIRA) <j...@apache.org>
Subject [jira] Commented: (AVRO-493) hadoop mapreduce support for avro data
Date Mon, 21 Jun 2010 08:14:26 GMT

    [ https://issues.apache.org/jira/browse/AVRO-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880743#action_12880743 ]

Iván de Prado commented on AVRO-493:
------------------------------------

Writing a custom DeserializerComparator is needed if this patch is to be useful in many
projects. Otherwise you would need a different Avro schema, with a different sort order, for
each kind of grouping you want to do in the reducer. I have not managed to create a custom DeserializerComparator:

{code:java}
public static class CustomComparator extends DeserializerComparator<AvroWrapper<GenericRecord>> {

    public CustomComparator() throws IOException {
        super(new AvroKeySerialization().getDeserializer(AvroWrapper.class));
    }

    // Orders keys by the second character of their "word" field.
    @Override
    public int compare(AvroWrapper<GenericRecord> o1, AvroWrapper<GenericRecord> o2) {
        return o1.datum().get("word").toString().charAt(1) - o2.datum().get("word").toString().charAt(1);
    }
}
 {code}

It raises the following exception:

{noformat}
Caused by: java.lang.NullPointerException
	at org.apache.avro.mapred.AvroJob.getMapOutputSchema(AvroJob.java:98)
	at org.apache.avro.mapred.AvroKeySerialization.getDeserializer(AvroKeySerialization.java:55)
        ....
{noformat}

The problem is in this line:

{code:java}
    Schema schema = AvroJob.getMapOutputSchema(getConf());
{code}

It looks for the datum schema in the job configuration, but unsurprisingly it is not there: the AvroKeySerialization created in the constructor was never given a configuration.
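
One direction that might avoid this is to keep the constructor free of any configuration lookup and read the schema only once a configuration has been handed over. Hadoop's ReflectionUtils.newInstance does call setConf on objects that implement Configurable right after constructing them; whether the comparator from this patch can be wired that way is an assumption here. The sketch below shows only the ordering of the two steps, in plain Java with no Hadoop or Avro dependency. The Configurable interface, the SCHEMA_KEY name, the newInstance helper and the use of plain strings as keys are all hypothetical stand-ins.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class LazyComparatorSketch {

    // Hypothetical stand-in for Hadoop's Configurable: the framework calls
    // setConf() right after invoking the no-argument constructor.
    interface Configurable {
        void setConf(Map<String, String> conf);
    }

    // Hypothetical key under which the job would store the map output schema.
    static final String SCHEMA_KEY = "example.map.output.schema";

    // Compares strings by their second character, like the comparator above.
    static class SecondCharComparator implements Comparator<String>, Configurable {
        private String schema; // stays null until setConf() runs

        // The constructor deliberately touches no configuration.
        public SecondCharComparator() {}

        @Override
        public void setConf(Map<String, String> conf) {
            schema = conf.get(SCHEMA_KEY);
            if (schema == null) {
                throw new IllegalStateException("missing " + SCHEMA_KEY);
            }
        }

        public String getSchema() {
            return schema;
        }

        @Override
        public int compare(String a, String b) {
            return Character.compare(a.charAt(1), b.charAt(1));
        }
    }

    // Mimics the framework: construct first, then inject the configuration.
    static <T extends Configurable> T newInstance(Supplier<T> ctor, Map<String, String> conf) {
        T instance = ctor.get();
        instance.setConf(conf);
        return instance;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SCHEMA_KEY, "\"string\"");
        SecondCharComparator cmp = newInstance(SecondCharComparator::new, conf);
        System.out.println(cmp.compare("cat", "dog") < 0); // prints true: 'a' sorts before 'o'
    }
}
```

With this ordering, a missing schema fails with an explicit message when the configuration is injected, rather than as a NullPointerException inside the constructor.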

Any ideas or workarounds for creating custom comparators for Avro?

> hadoop mapreduce support for avro data
> --------------------------------------
>
>                 Key: AVRO-493
>                 URL: https://issues.apache.org/jira/browse/AVRO-493
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Doug Cutting
>            Assignee: Doug Cutting
>             Fix For: 1.4.0
>
>         Attachments: AVRO-493.patch, AVRO-493.patch
>
>
> Avro should provide support for using Hadoop MapReduce over Avro data files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

