hadoop-common-issues mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6685) Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration
Date Wed, 17 Nov 2010 00:05:17 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932723#action_12932723 ]

Owen O'Malley commented on HADOOP-6685:
---------------------------------------

{quote}
Indeed they aren't optional dependencies in the patch you have posted. But most are not Hadoop
dependencies at all before this patch.
{quote}

Avro is already a dependency. Thrift is already a dependency for HDFS (see HDFS-1484). I'm
only adding ProtocolBuffers, which is a commonly used serialization format that many users,
including me, find extremely useful.

{quote}
only introduce a serialization framework, not to add a number of implementations of it,
{quote}

The implementations are relatively small and are required for showing that the system actually
works.

{quote}
nor I would add, to extend existing container file formats to incorporate new serializations
{quote}

SequenceFile was updated to use the new io.serial interface. The other containers had never
been updated to use anything but Writable. Bringing them up to the new interface was part
of the work. Again, updating SequenceFile shows that this code actually works. Without such
a demonstration, this patch would be incomplete.



> Change the generic serialization framework API to use serialization-specific bytes instead of Map<String,String> for configuration
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6685
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.22.0
>
>         Attachments: libthrift.jar, serial.patch, serial4.patch, serial6.patch, SerializationAtSummit.pdf
>
>
> Currently, the generic serialization framework uses Map<String,String> for the
> serialization-specific configuration. Since this data is really internal to the specific
> serialization, I think we should change it to be an opaque binary blob. This will simplify
> the interface for defining specific serializations for different contexts (MAPREDUCE-1462).
> It will also move us toward having serialized objects for Mappers, Reducers, etc.
> (MAPREDUCE-1183).
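
To make the idea in the description concrete, here is a minimal Java sketch of a serialization that encodes its own configuration as an opaque byte blob, which a container stores and hands back without interpreting. Every name in it (OpaqueMetadataSketch, Serialization, SchemaSerialization, writeMetadata, readMetadata, roundTripThroughContainer) is hypothetical and chosen for illustration; none of it is taken from the attached patches or from the io.serial interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: these types are NOT the API from the HADOOP-6685 patch.
class OpaqueMetadataSketch {

    /** A serialization that owns the encoding of its own configuration. */
    interface Serialization {
        String name();
        byte[] writeMetadata();            // configuration as an opaque blob
        void readMetadata(byte[] blob);    // restore configuration from the blob
    }

    /** Example serialization whose configuration is a version and a schema string. */
    static final class SchemaSerialization implements Serialization {
        int version = 0;
        String schema = "";

        public String name() {
            return "schema-sketch";
        }

        public byte[] writeMetadata() {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buf)) {
                out.writeInt(version);
                out.writeUTF(schema);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return buf.toByteArray();
        }

        public void readMetadata(byte[] blob) {
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(blob))) {
                version = in.readInt();
                schema = in.readUTF();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Stands in for a container file: it copies the bytes and never parses them. */
    static byte[] roundTripThroughContainer(byte[] blob) {
        return blob.clone();
    }

    public static void main(String[] args) {
        SchemaSerialization writer = new SchemaSerialization();
        writer.version = 2;
        writer.schema = "{\"type\":\"string\"}";

        byte[] stored = roundTripThroughContainer(writer.writeMetadata());

        SchemaSerialization reader = new SchemaSerialization();
        reader.readMetadata(stored);
        System.out.println(reader.version + " " + reader.schema);
    }
}
```

The point of the sketch is that only SchemaSerialization knows the layout of the blob. With a Map<String,String>, the keys and value encodings would instead become part of the shared interface between the framework and each serialization.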

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

