hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1986) Add support for a general serialization mechanism for Map Reduce
Date Tue, 06 Nov 2007 18:22:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540516 ]

Doug Cutting commented on HADOOP-1986:

> X should allow creation of multiple instances of its serializer [ ...]

I don't see how this is required by anything I've proposed.  If class names are serialized
with class data, then a single serializer instance could be returned for a large number of
different classes.  If class names are not serialized with class data, then a different serializer
instance could be returned for each class, but these could be cached, so that no more than
a single instance is created per serialized class.  If a factory creates multiple instances
of its serializer, and those instances share state, then yes, they are responsible for coordinating
their state.  That seems reasonable and expected.
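
To make the two cases concrete, here is a minimal Java sketch. Every name in it (Serializer, SharedInstanceFactory, CachingFactory, and so on) is hypothetical and chosen for illustration; none is taken from the attached patch. It assumes a factory that is asked for a serializer by class, and shows one shared instance when class names travel with the data, and one cached instance per class when they do not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// All names here are hypothetical; none are taken from the HADOOP-1986 patch.
interface Serializer<T> {
    byte[] serialize(T value);
}

// Case 1: the class name is written with the data, so one instance serves all classes.
final class SelfDescribingSerializer implements Serializer<Object> {
    @Override
    public byte[] serialize(Object value) {
        String framed = value.getClass().getName() + "\n" + value;
        return framed.getBytes(StandardCharsets.UTF_8);
    }
}

final class SharedInstanceFactory {
    private final Serializer<Object> shared = new SelfDescribingSerializer();

    // Returns the same instance regardless of the requested class.
    @SuppressWarnings("unchecked")
    <T> Serializer<T> getSerializer(Class<T> type) {
        return (Serializer<T>) (Serializer<?>) shared;
    }
}

// Case 2: no class name in the data, so each class gets its own serializer.
final class PerClassSerializer<T> implements Serializer<T> {
    private final Class<T> type;

    PerClassSerializer(Class<T> type) {
        this.type = type;
    }

    @Override
    public byte[] serialize(T value) {
        return String.valueOf(type.cast(value)).getBytes(StandardCharsets.UTF_8);
    }
}

final class CachingFactory {
    private final Map<Class<?>, Serializer<?>> cache = new ConcurrentHashMap<>();

    // computeIfAbsent is atomic on ConcurrentHashMap, so at most one
    // serializer is created per class even under concurrent calls.
    @SuppressWarnings("unchecked")
    <T> Serializer<T> getSerializer(Class<T> type) {
        return (Serializer<T>) cache.computeIfAbsent(type, t -> new PerClassSerializer<T>(type));
    }
}

final class SerializerFactoryDemo {
    public static void main(String[] args) {
        CachingFactory factory = new CachingFactory();
        Object first = factory.getSerializer(String.class);
        Object second = factory.getSerializer(String.class);
        System.out.println("same instance reused: " + (first == second));
    }
}
```

In neither case does the factory have to hand out multiple instances for one class. A factory that chose to do so anyway would own the coordination of any state those instances share.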

> X needs to be able to both create objects before deserializing them and take in a reference
> to an object and initialize its member variables with deserialized data.

No.  It must be able to create instances, but it need not use a passed-in reference.  As an
optimization, it may reuse the reference when optimized client code passes in a non-null one.
Both implementing reference reuse and passing in references are optional.
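
A rough Java sketch of that contract follows. The names (Point, Deserializer, AllocatingPointDeserializer, ReusingPointDeserializer) and the use of ByteBuffer as the input are assumptions made for illustration, not the API in the attached patch. Both implementations satisfy the same interface: one always creates a new instance and ignores the reference, the other reuses it when it is non-null.

```java
import java.nio.ByteBuffer;

// Hypothetical types for illustration; not taken from the HADOOP-1986 patch.
final class Point {
    int x;
    int y;
}

interface Deserializer<T> {
    // 'reuse' may be null, and an implementation is free to ignore it.
    // Callers must always use the returned object, not 'reuse'.
    T deserialize(ByteBuffer in, T reuse);
}

// Minimal implementation: always creates a new instance.
final class AllocatingPointDeserializer implements Deserializer<Point> {
    @Override
    public Point deserialize(ByteBuffer in, Point reuse) {
        Point p = new Point();
        p.x = in.getInt();
        p.y = in.getInt();
        return p;
    }
}

// Optimized implementation: fills in the passed-in object when there is one.
final class ReusingPointDeserializer implements Deserializer<Point> {
    @Override
    public Point deserialize(ByteBuffer in, Point reuse) {
        Point p = (reuse != null) ? reuse : new Point();
        p.x = in.getInt();
        p.y = in.getInt();
        return p;
    }
}

final class DeserializerDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putInt(1).putInt(2).putInt(3).putInt(4);
        buf.flip();

        Deserializer<Point> d = new ReusingPointDeserializer();
        Point scratch = new Point();
        Point first = d.deserialize(buf, scratch);  // optimized caller passes a reference
        System.out.println("reused: " + (first == scratch));
        Point second = d.deserialize(buf, null);    // plain caller passes null
        System.out.println("fresh: " + (second != scratch));
    }
}
```

Client code written against the returned value works with either implementation, which is why reuse can stay an optimization and not a requirement.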

> At this point, I think it's a gut call. If we feel that having clients not replicate
> platform logic is more important than the restrictions we're providing on serialization platforms,
> that's fine.

Yes, I think having clients not replicate platform logic is a mandate.  The framework should
maximally encapsulate serialization logic.  But I still don't see what onerous restrictions
this inflicts on serialization platforms.

> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>                 Key: HADOOP-1986
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1986
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Tom White
>            Assignee: Tom White
>             Fix For: 0.16.0
>         Attachments: SerializableWritable.java, serializer-v1.patch
>
> Currently Map Reduce programs have to use WritableComparable-Writable key-value pairs.
> While it's possible to write Writable wrappers for other serialization frameworks (such as
> Thrift), this is not very convenient: it would be nicer to be able to use arbitrary types
> directly, without explicit wrapping and unwrapping.

