hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1986) Add support for a general serialization mechanism for Map Reduce
Date Fri, 19 Oct 2007 19:28:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536322 ]

Tom White commented on HADOOP-1986:

> Clients which wish to reuse objects can, the first time, pass null.

Except there might not be enough type information to construct an object. For example, if a
WritableSerializer were deserializing a LongWritable, how would it know to create a LongWritable?
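
To make the type-information problem concrete, here is a minimal sketch in plain Java. Writable, LongBox and WritableDeserializer are hypothetical stand-ins written for this example, not the Hadoop classes or the API proposed in the patch. The serialized bytes of a Writable-style object hold only field values, so when the caller passes null the deserializer can create an instance only if it was told the class beforehand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class TypeInfoSketch {
    /** Minimal stand-in for Hadoop's Writable contract (not the real interface). */
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Minimal stand-in for LongWritable. */
    static final class LongBox implements Writable {
        long value;
        public LongBox() {}
        public void write(DataOutput out) throws IOException { out.writeLong(value); }
        public void readFields(DataInput in) throws IOException { value = in.readLong(); }
    }

    /** Hypothetical deserializer: the Class given up front is its only type information. */
    static final class WritableDeserializer<T extends Writable> {
        private final Class<T> type;
        WritableDeserializer(Class<T> type) { this.type = type; }

        T deserialize(T reuse, DataInput in) throws IOException {
            T target = reuse;
            if (target == null) {
                // The bytes carry no class name, so without 'type' there is nothing to instantiate.
                try {
                    target = type.getDeclaredConstructor().newInstance();
                } catch (ReflectiveOperationException e) {
                    throw new IOException("cannot instantiate " + type.getName(), e);
                }
            }
            target.readFields(in);
            return target;
        }
    }

    /** Writes one long, then reads it back while passing null as the reuse object. */
    static long roundTrip(long v) {
        try {
            LongBox src = new LongBox();
            src.value = v;
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            src.write(new DataOutputStream(buf));
            DataInput in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
            WritableDeserializer<LongBox> d = new WritableDeserializer<>(LongBox.class);
            return d.deserialize(null, in).value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L));
    }
}
```

Passing the Class to the constructor is only one possible way to supply the missing information; the point is that it has to come from somewhere other than the byte stream.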

> In the case of Writable, setOutput would just set the protected 'out' field of a DataOutputStream, and this
> would all work fine. That could be instead done on each call to a 'serialize(Object, OutputStream)' method,
> but perhaps it's better to factor it out of inner loops. Is that the intent?

From an API point of view I prefer serialize(Object, OutputStream), but it's not clear
to me that you can implement this efficiently for every serialization framework. For example,
I don't think the technique you describe of setting the 'out' field would work for Java Serialization.
And creating a new ObjectOutputStream for every call to the serialize(Object, OutputStream)
method would be prohibitively expensive. So unless there's another way of getting round this, I think
we're stuck with stateful serializers. (I'd love to be proved wrong on this!)
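
One visible part of that cost can be sketched with the standard java.io classes. The Serializer interface and JavaSerializer class below are hypothetical names chosen for illustration, not the API in serializer-v1.patch. A fresh ObjectOutputStream writes its stream header and the full class descriptor for every record, whereas a serializer that keeps one stream open writes them once and refers back to them afterwards.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class StatefulSerializerSketch {
    /** Hypothetical stateful serializer interface (illustrative names only). */
    interface Serializer<T> {
        void open(OutputStream out) throws IOException;
        void serialize(T t) throws IOException;
        void close() throws IOException;
    }

    /** Keeps a single ObjectOutputStream alive between open() and close(). */
    static final class JavaSerializer<T extends Serializable> implements Serializer<T> {
        private ObjectOutputStream oos;
        public void open(OutputStream out) throws IOException { oos = new ObjectOutputStream(out); }
        // Note: the stream remembers every object written; long-running use may need oos.reset().
        public void serialize(T t) throws IOException { oos.writeObject(t); }
        public void close() throws IOException { oos.close(); }
    }

    /** Stateless style: a new ObjectOutputStream for every record. */
    static int bytesStateless(int records) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            for (int i = 0; i < records; i++) {
                ObjectOutputStream oos = new ObjectOutputStream(buf);
                oos.writeObject(Long.valueOf(i));
                oos.flush(); // push the header and object bytes into buf
            }
            return buf.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stateful style: open once, serialize many records, close once. */
    static int bytesStateful(int records) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            Serializer<Long> s = new JavaSerializer<>();
            s.open(buf);
            for (int i = 0; i < records; i++) {
                s.serialize(Long.valueOf(i));
            }
            s.close();
            return buf.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("stateless bytes: " + bytesStateless(1000));
        System.out.println("stateful bytes:  " + bytesStateful(1000));
    }
}
```

Output size is only a proxy here; the per-call construction also costs CPU time and allocations, which this sketch does not measure.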

> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>                 Key: HADOOP-1986
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1986
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Tom White
>            Assignee: Tom White
>             Fix For: 0.16.0
>         Attachments: SerializableWritable.java, serializer-v1.patch
> Currently Map Reduce programs have to use WritableComparable-Writable key-value pairs.
> While it's possible to write Writable wrappers for other serialization frameworks (such as
> Thrift), this is not very convenient: it would be nicer to be able to use arbitrary types
> directly, without explicit wrapping and unwrapping.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
