hadoop-mapreduce-issues mailing list archives

From "Aaron Kimball (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-815) Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro Serialization
Date Thu, 14 Jan 2010 19:11:54 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800309#action_12800309 ]

Aaron Kimball commented on MAPREDUCE-815:
-----------------------------------------

The only reason I could think of to use the position would be to build some sort of index
over an Avro file. I think this probably doesn't make much sense here. That said,
we can't use null or we'll break the identity mapper. (The MapOutputBuffer expects non-null
keys and values only. A {{context.write(k, null)}} from the mapper will throw a NullPointerException.)
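
To illustrate the failure mode, here is a minimal sketch of a collector that rejects null values. SketchCollector and NullValueDemo are hypothetical names, and the getClass()-based type check is an assumption made for illustration; this is not Hadoop's actual MapOutputBuffer code.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical stand-in for a map-side collector.
// It is NOT Hadoop's MapOutputBuffer; it only mimics a non-null contract.
final class SketchCollector<K, V> {
    private final Class<K> keyClass;
    private final Class<V> valueClass;
    private final List<String> records = new ArrayList<>();

    SketchCollector(Class<K> keyClass, Class<V> valueClass) {
        this.keyClass = keyClass;
        this.valueClass = valueClass;
    }

    void collect(K key, V value) {
        // Assumed check: calling getClass() on a null reference
        // throws NullPointerException before any record is stored.
        if (key.getClass() != keyClass) {
            throw new IllegalArgumentException("Type mismatch in key");
        }
        if (value.getClass() != valueClass) {
            throw new IllegalArgumentException("Type mismatch in value");
        }
        records.add(key + "\t" + value);
    }

    int size() {
        return records.size();
    }
}

final class NullValueDemo {
    // Returns true if collecting a null value raises NullPointerException.
    static boolean throwsOnNullValue() {
        SketchCollector<String, String> c =
            new SketchCollector<>(String.class, String.class);
        try {
            c.collect("k", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("null value rejected: " + throwsOnNullValue());
    }
}
```

An identity mapper passes its input pair straight through, so an input format that hands it a null key or value would hit this kind of check on the first record.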


This is why Writables include NullWritable, I think. We could add a type, e.g. "Empty", which
implements AvroReflectSerializable and whose toString method returns the empty string; I think
this would work fairly transparently and be entirely Avro-based.
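
A rough sketch of what such an "Empty" type might look like follows. The marker interface is declared locally as a stand-in so the example compiles without Hadoop on the classpath; the get() accessor, the EmptyDemo class, and formatRecord are hypothetical names used only for illustration, not part of any patch.

```java
// Local stand-in for the marker interface named in the comment above.
// Assumption: the real interface carries no methods of its own.
interface AvroReflectSerializable {}

// Hypothetical placeholder type playing the role NullWritable plays for Writables.
final class Empty implements AvroReflectSerializable {
    private static final Empty INSTANCE = new Empty();

    // No-argument constructor; reflection-based serializers typically need one.
    private Empty() {}

    static Empty get() {
        return INSTANCE;
    }

    // Printing an Empty contributes nothing to text output.
    @Override
    public String toString() {
        return "";
    }

    // Instances carry no state, so any two are equal. This matters if a
    // deserializer creates fresh instances instead of reusing INSTANCE.
    @Override
    public boolean equals(Object o) {
        return o instanceof Empty;
    }

    @Override
    public int hashCode() {
        return 0;
    }
}

final class EmptyDemo {
    // Formats a record the way a tab-separated text writer might (assumed format).
    static String formatRecord(Object key, Object value) {
        return key + "\t" + value;
    }

    public static void main(String[] args) {
        // The value slot is non-null, yet adds no characters to the line.
        System.out.println("[" + formatRecord("k", Empty.get()) + "]");
    }
}
```

Because the placeholder is a real object, it satisfies a non-null contract while still rendering as nothing when written as text.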


> Add AvroInputFormat and AvroOutputFormat so that hadoop can use Avro Serialization
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-815
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-815
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Ravi Gummadi
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-815.patch
>
>
> MapReduce needs an AvroInputFormat, similar to other InputFormats like TextInputFormat, to
> be able to use Avro serialization in Hadoop. Similarly, an AvroOutputFormat is needed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

