hadoop-common-issues mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6165) Add metadata to Serializations
Date Thu, 13 Aug 2009 21:27:14 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742981#action_12742981 ]

Doug Cutting commented on HADOOP-6165:
--------------------------------------

> So there'd just be a single AvroSerialization?

Perhaps.  An alternative might be to have three: reflect, specific and generic.  Each could
accept records if they have the right base class.  But if you read a file that was written
with, e.g., specific and you don't have that class, or data written by Python, which names
no class, then you'd be unable to read that data.  Also, with Avro, the schema need not be
a record: values could be a union, a map, or an array.

If the data was written with reflect or specific, and you have the class used to write it
loaded, then it's probably best to use that.  But in all other cases generic is probably your
best bet.  I guess this could be implemented by placing generic last on the list, so that
it accepts anything that has an Avro schema, with specific and reflect picking off things
that have classes loaded.  Is that better?  I don't have a strong feeling.
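
To make the ordering idea concrete, here is a rough sketch in Java.  It is not the
Hadoop Serialization API and not what the attached patches do: MetaSerialization,
SerializationChain, SpecificMarker, LoadedSpecificRecord and the "schema" and "class"
metadata keys are all hypothetical names made up for illustration.  It assumes metadata
is a simple string map and that "specific" classes share a marker base type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical stand-in for the base type of generated ("specific") classes. */
interface SpecificMarker {}

/** Hypothetical generated-style class that is loadable in this JVM. */
class LoadedSpecificRecord implements SpecificMarker {}

/** Hypothetical metadata-driven serialization; NOT Hadoop's Serialization interface. */
interface MetaSerialization {
    String name();
    boolean accept(Map<String, String> metadata);
}

final class SerializationChain {
    // Hypothetical metadata keys, chosen only for this sketch.
    static final String SCHEMA_KEY = "schema";
    static final String CLASS_KEY = "class";

    // Order matters: specific and reflect pick off data whose class is loaded,
    // and generic comes last so it accepts anything else that has a schema.
    static final List<MetaSerialization> ORDERED = Arrays.asList(
        named("specific", m -> {
            Class<?> c = loadOrNull(m);
            return c != null && SpecificMarker.class.isAssignableFrom(c);
        }),
        named("reflect", m -> loadOrNull(m) != null),
        named("generic", m -> true));

    /** Builds a serialization that requires a schema plus an extra condition. */
    static MetaSerialization named(String label, Predicate<Map<String, String>> extra) {
        return new MetaSerialization() {
            public String name() { return label; }
            public boolean accept(Map<String, String> metadata) {
                return metadata.containsKey(SCHEMA_KEY) && extra.test(metadata);
            }
        };
    }

    /** Returns the class named in the metadata, or null if absent or not loadable. */
    static Class<?> loadOrNull(Map<String, String> metadata) {
        String className = metadata.get(CLASS_KEY);
        if (className == null) {
            return null;
        }
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    /** Returns the first serialization that accepts the metadata, or null if none does. */
    static MetaSerialization select(Map<String, String> metadata) {
        for (MetaSerialization s : ORDERED) {
            if (s.accept(metadata)) {
                return s;
            }
        }
        return null;
    }
}
```

In this sketch, metadata that carries a schema but names no class (the Python case), or
names a class that cannot be loaded, falls through to "generic".  A loadable class that
implements the marker type is picked off by "specific", and any other loadable class by
"reflect".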

> Add metadata to Serializations
> ------------------------------
>
>                 Key: HADOOP-6165
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6165
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: contrib/serialization
>            Reporter: Tom White
>            Assignee: Tom White
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-6165-v2.patch, HADOOP-6165.patch
>
>
> The Serialization framework only allows a class to be passed as metadata. This assumes
> there is a one-to-one mapping between types and Serializations, which is overly restrictive.
> Permitting applications to pass arbitrary metadata to Serializations would give them more
> control over which Serialization is used, and would also allow one, for example, to pass an
> Avro schema to an Avro Serialization.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

