hadoop-common-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6685) Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration
Date Fri, 12 Nov 2010 19:15:17 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931496#action_12931496 ]

Chris Douglas commented on HADOOP-6685:
---------------------------------------

{quote}Much of this patch seems like it could help
implement these, but parts of it (e.g., the metadata serialization,
enhancements to SequenceFile, etc.) don't seem relevant to these
goals. I don't see supporting multiple Java serialization APIs as a
goal in and of itself.{quote}

If one's records don't implement the {{Writable}} interface, then there's no reasonable binary
container in Hadoop. Adding these capabilities to {{SequenceFile}} and {{TFile}} is an improvement.
It's not about cross-language capability.

{quote}It would be useful if the shuffle could process things besides
Writable (MAPREDUCE-1126) and it would be useful to have InputFormats
and OutputFormats for language-independent file formats like Avro's
(MAPREDUCE-815).{quote}

Importing past issues will not move this forward. You recognize that this makes progress toward
pushing other serializations through the data pipeline. That is obviously the next step.

> Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6685
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: serial.patch
>
>
> Currently, the generic serialization framework uses Map<String,String> for the
> serialization-specific configuration. Since this data is really internal to the specific serialization,
> I think we should change it to be an opaque binary blob. This will simplify the interface
> for defining specific serializations for different contexts (MAPREDUCE-1462). It will also
> move us toward having serialized objects for Mappers, Reducers, etc. (MAPREDUCE-1183).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

