hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1986) Add support for a general serialization mechanism for Map Reduce
Date Tue, 06 Nov 2007 18:22:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540516 ]

Doug Cutting commented on HADOOP-1986:
--------------------------------------

> X should allow creation of multiple instances of its serializer [ ...]

I don't see how this is required by anything I've proposed.  If class names are serialized
with class data, then a single serializer instance could be returned for a large number of
different classes.  If class names are not serialized with class data, then a different serializer
instance could be returned for each class, but these could be cached, so that no more than
a single instance is created per serialized class.  If a factory creates multiple instances
of its serializer, and those instances share state, then yes, they are responsible for coordinating
their state.  That seems reasonable and expected.
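
The per-class caching case can be sketched as follows. This is a minimal illustration, not the API in serializer-v1.patch: the names Serializer, ToStringSerializer, and CachingSerializerFactory are hypothetical, and the stand-in serializer just writes a value's string form.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface, for illustration only (not from the patch).
interface Serializer<T> {
    byte[] serialize(T value);
}

// Stand-in serializer: writes the value's string form as UTF-8 bytes.
final class ToStringSerializer<T> implements Serializer<T> {
    @Override
    public byte[] serialize(T value) {
        return String.valueOf(value).getBytes(StandardCharsets.UTF_8);
    }
}

// Hypothetical factory that creates at most one serializer per class.
final class CachingSerializerFactory {
    private final Map<Class<?>, Serializer<?>> cache = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    <T> Serializer<T> getSerializer(Class<T> type) {
        // computeIfAbsent is atomic on ConcurrentHashMap, so concurrent
        // callers asking for the same class share a single instance.
        return (Serializer<T>) cache.computeIfAbsent(
                type, t -> new ToStringSerializer<Object>());
    }
}
```

Calling getSerializer(String.class) twice returns the same instance, so the factory never needs to hand out multiple serializers for one class. Any state shared between instances stays inside the factory.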

> X needs to be able to both create objects before deserializing them and take in a reference
> to an object and initialize its member variables with deserialized data.

No.  It must be able to create instances, but it need not use a passed-in reference.  As an
optimization, it may reuse the reference when optimized client code passes in a non-null
one.  Both implementing and using reference reuse are optional.
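
One way to picture this contract, as a sketch under assumed names (Deserializer, IntPair, IntPairDeserializer, and StringDeserializer are hypothetical and not taken from the patch): the deserializer always knows how to create an instance, and treats a non-null argument only as a reuse hint.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical interface: 'reuse' may be null, and implementations
// are free to ignore it.
interface Deserializer<T> {
    T deserialize(byte[] data, T reuse);
}

// Simple mutable value type used for the example.
final class IntPair {
    int first;
    int second;
}

// Reuses the passed-in object when there is one, otherwise creates a new one.
final class IntPairDeserializer implements Deserializer<IntPair> {
    @Override
    public IntPair deserialize(byte[] data, IntPair reuse) {
        IntPair pair = (reuse != null) ? reuse : new IntPair();
        ByteBuffer buf = ByteBuffer.wrap(data);
        pair.first = buf.getInt();
        pair.second = buf.getInt();
        return pair;
    }
}

// Ignores the reference entirely: String is immutable, so reuse is impossible.
final class StringDeserializer implements Deserializer<String> {
    @Override
    public String deserialize(byte[] data, String reuse) {
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

Client code that passes null still gets a correct object back, and client code that passes an existing object gets the reuse optimization only where the platform chooses to support it.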

> At this point, I think it's a gut call. If we feel that having clients not replicate
> platform logic is more important than the restrictions we're providing on serialization platforms,
> that's fine.

Yes, I think having clients not replicate platform logic is a mandate.  The framework should
maximally encapsulate serialization logic.  But I still don't see what onerous restrictions
this inflicts on serialization platforms.

> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>
>                 Key: HADOOP-1986
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1986
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Tom White
>            Assignee: Tom White
>             Fix For: 0.16.0
>
>         Attachments: SerializableWritable.java, serializer-v1.patch
>
>
> Currently Map Reduce programs have to use WritableComparable-Writable key-value pairs.
> While it's possible to write Writable wrappers for other serialization frameworks (such as
> Thrift), this is not very convenient: it would be nicer to be able to use arbitrary types
> directly, without explicit wrapping and unwrapping.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

