hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1986) Add support for a general serialization mechanism for Map Reduce
Date Fri, 19 Oct 2007 08:58:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536148 ]

Tom White commented on HADOOP-1986:


These changes generally look good - I'll try to work them into a new patch.

In the current patch, Serializers and Deserializers are stateful, with open/close methods,
which is why I separated them. We could combine them in a single object,
but only at the expense of muddying the method names (e.g. closeSerializer and closeDeserializer),
so I'm reluctant to do that. I would stick with Doug's first SerializationFactory proposal
(plus the accept method).
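
To make the shape under discussion concrete, here is a minimal Java sketch of separate,
stateful Serializer and Deserializer interfaces handed out by a factory with an accept method.
All names and signatures here are assumptions for illustration (including the toy
StringSerializationFactory); they are not the interfaces in the attached patch or in Doug's proposal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch; names and signatures are illustrative assumptions.
final class SerializationSketch {

    /** Stateful writer: bound to one stream between open() and close(). */
    interface Serializer<T> {
        void open(OutputStream out) throws IOException;
        void serialize(T t) throws IOException;
        void close() throws IOException;
    }

    /** Stateful reader, kept separate so both halves can use plain open/close names. */
    interface Deserializer<T> {
        void open(InputStream in) throws IOException;
        T deserialize() throws IOException;
        void close() throws IOException;
    }

    /** Factory that reports which classes it handles and hands out both halves. */
    interface SerializationFactory<T> {
        boolean accept(Class<?> c);
        Serializer<T> getSerializer();
        Deserializer<T> getDeserializer();
    }

    /** Toy factory for String values, for illustration only. */
    static final class StringSerializationFactory implements SerializationFactory<String> {
        @Override public boolean accept(Class<?> c) { return String.class.equals(c); }

        @Override public Serializer<String> getSerializer() {
            return new Serializer<String>() {
                private DataOutputStream out;
                @Override public void open(OutputStream o) { out = new DataOutputStream(o); }
                @Override public void serialize(String s) throws IOException { out.writeUTF(s); }
                @Override public void close() throws IOException { out.close(); }
            };
        }

        @Override public Deserializer<String> getDeserializer() {
            return new Deserializer<String>() {
                private DataInputStream in;
                @Override public void open(InputStream i) { in = new DataInputStream(i); }
                @Override public String deserialize() throws IOException { return in.readUTF(); }
                @Override public void close() throws IOException { in.close(); }
            };
        }
    }

    /** Writes a value through the serializer and reads it back through the deserializer. */
    static String roundTrip(String value) {
        SerializationFactory<String> factory = new StringSerializationFactory();
        if (!factory.accept(value.getClass())) {
            throw new IllegalArgumentException("unsupported type: " + value.getClass());
        }
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            Serializer<String> serializer = factory.getSerializer();
            serializer.open(buffer);
            serializer.serialize(value);
            serializer.close();

            Deserializer<String> deserializer = factory.getDeserializer();
            deserializer.open(new ByteArrayInputStream(buffer.toByteArray()));
            String result = deserializer.deserialize();
            deserializer.close();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Keeping the two halves apart means each object owns exactly one stream, so open and close
need no qualifying suffix.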

Another aspect that the current patch doesn't address is who instantiates objects during deserialization.
(Doug - I think you're alluding to this with the "reuse" object in the Serializer class above?)
For Writables and Thrift the serialization framework does not instantiate objects - it merely
populates the supplied object with the representation from the stream. For Java Serialization
the serialization framework reads the type from the stream and instantiates an object of
that type. To cater for this difference we need to make the Deserializer expose whether it
can reuse objects, so that the client (for example ReduceTask) knows whether to hand it an object
or not. This is needed for efficiency (so the client doesn't needlessly create objects that
aren't used) and also because some serialization frameworks don't require classes to have no-arg
constructors (so the client would not be able to create the required object in any case).
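
As a rough illustration of the point above, the following Java sketch shows a Deserializer
that advertises whether it populates a caller-supplied object, and a client that allocates
only when that is the case. The canReuse method, the Counter type and both toy deserializers
are hypothetical names invented for this example; none of them comes from the patch.

```java
import java.util.function.Supplier;

// Hypothetical sketch; names and signatures are illustrative assumptions.
final class ReuseSketch {

    /** Deserializer that tells the client whether it fills in a supplied object. */
    interface Deserializer<T> {
        boolean canReuse();      // hypothetical method name
        T deserialize(T reuse);  // 'reuse' is ignored when canReuse() is false
    }

    /** Mutable record, standing in for a Writable-style type. */
    static final class Counter {
        int value;
    }

    /** Populates the object it is given (the Writable/Thrift style described above). */
    static final class PopulatingDeserializer implements Deserializer<Counter> {
        private final int[] source;
        private int pos;
        PopulatingDeserializer(int... source) { this.source = source; }
        @Override public boolean canReuse() { return true; }
        @Override public Counter deserialize(Counter reuse) {
            reuse.value = source[pos++];
            return reuse;
        }
    }

    /** Creates its own result each time (the Java Serialization style described above). */
    static final class InstantiatingDeserializer implements Deserializer<Integer> {
        private final int[] source;
        private int pos;
        InstantiatingDeserializer(int... source) { this.source = source; }
        @Override public boolean canReuse() { return false; }
        @Override public Integer deserialize(Integer ignored) {
            return Integer.valueOf(source[pos++]);
        }
    }

    /** Client side: create an object only if the deserializer will actually use it. */
    static <T> T readNext(Deserializer<T> deserializer, Supplier<T> factory) {
        T reuse = deserializer.canReuse() ? factory.get() : null;
        return deserializer.deserialize(reuse);
    }

    public static void main(String[] args) {
        Counter counter = readNext(new PopulatingDeserializer(7), Counter::new);
        System.out.println(counter.value);
        Integer boxed = readNext(new InstantiatingDeserializer(9), () -> null);
        System.out.println(boxed);
    }
}
```

In this sketch the client never calls its factory for the instantiating deserializer, which
covers both concerns: no wasted allocation, and no need for a no-arg constructor.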

> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>                 Key: HADOOP-1986
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1986
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Tom White
>            Assignee: Tom White
>             Fix For: 0.16.0
>         Attachments: SerializableWritable.java, serializer-v1.patch
>
> Currently Map Reduce programs have to use WritableComparable-Writable key-value pairs.
> While it's possible to write Writable wrappers for other serialization frameworks (such as
> Thrift), this is not very convenient: it would be nicer to be able to use arbitrary types
> directly, without explicit wrapping and unwrapping.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
