hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1986) Add support for a general serialization mechanism for Map Reduce
Date Fri, 19 Oct 2007 19:28:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536322 ]

Tom White commented on HADOOP-1986:
-----------------------------------

> Clients which wish to reuse objects can, the first time, pass null.

Except there might not be enough type information to construct an object. For example, if a
WritableSerializer were deserializing a LongWritable and the client passed null, how would it
know to create a LongWritable object?
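
To make the problem concrete, here is a minimal sketch, assuming a hypothetical
TypedDeserializer that is handed the target class when it is constructed. Readable and
LongBox are stand-ins for Writable and LongWritable so that the example runs without Hadoop
on the classpath; none of these names come from the patch. Without the Class argument, the
null branch would have nothing to instantiate.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class TypedDeserializerSketch {

    /** Stand-in for the read side of Writable (hypothetical, not the Hadoop interface). */
    interface Readable {
        void readFields(DataInput in) throws IOException;
    }

    /** Stand-in for LongWritable. */
    static class LongBox implements Readable {
        long value;

        public LongBox() {}

        @Override
        public void readFields(DataInput in) throws IOException {
            value = in.readLong();
        }
    }

    /** Hypothetical deserializer that knows its target class up front. */
    static class TypedDeserializer<T extends Readable> {
        private final Class<T> type;

        TypedDeserializer(Class<T> type) {
            this.type = type;
        }

        /** Fills 'reuse' if given; otherwise creates a new instance from the known class. */
        T deserialize(T reuse, DataInput in) throws IOException {
            T target = reuse;
            if (target == null) {
                try {
                    // Only possible because the class was supplied at construction time.
                    target = type.getDeclaredConstructor().newInstance();
                } catch (ReflectiveOperationException e) {
                    throw new IOException("cannot instantiate " + type.getName(), e);
                }
            }
            target.readFields(in);
            return target;
        }
    }

    /** Writes one long, then reads it back through the deserializer. */
    static LongBox roundTrip(long v, LongBox reuse) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new DataOutputStream(bytes).writeLong(v);
            DataInput in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return new TypedDeserializer<LongBox>(LongBox.class).deserialize(reuse, in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        LongBox fresh = roundTrip(42L, null);   // first call: no object to reuse
        LongBox again = roundTrip(7L, fresh);   // later calls: reuse the same object
        System.out.println(fresh == again);
    }
}
```

Passing the class to the deserializer is only one way to supply the missing type information;
the stream could also carry a class name, at the cost of extra bytes per record.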

> In the case of Writable, setOutput would just set the protected 'out' field of a
> DataOutputStream, and this would all work fine. That could be instead done on each call to
> a 'serialize(Object, OutputStream)' method, but perhaps its better to factor it out of
> inner loops. Is that the intent?

From an API point of view I prefer serialize(Object, OutputStream), but it's not clear
to me that you can implement this efficiently for every serialization framework. For example,
I don't think the technique you describe of setting the 'out' field would work for Java
Serialization, and creating a new ObjectOutputStream for every call to the
serialize(Object, OutputStream) method would be prohibitive. So unless there's another way of
getting round this, I think we're stuck with stateful serializers. (I'd love to be proved
wrong on this!)
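
A rough sketch of the trade-off, not the API in the patch: the Serializer interface with
open/serialize/close below is hypothetical, as are the helper names. It binds one
ObjectOutputStream to the output once and reuses it, and compares the bytes written against
creating a new ObjectOutputStream per record, which repeats the stream header and class
descriptor every time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class StatefulSerializerSketch {

    /** Hypothetical stateful contract: bind a stream once, then serialize many objects. */
    interface Serializer<T> {
        void open(OutputStream out) throws IOException;
        void serialize(T t) throws IOException;
        void close() throws IOException;
    }

    /** Java Serialization behind the stateful contract: one ObjectOutputStream per open(). */
    static class JavaSerializer<T extends Serializable> implements Serializer<T> {
        private ObjectOutputStream oos;

        @Override
        public void open(OutputStream out) throws IOException {
            oos = new ObjectOutputStream(out); // writes the stream header once
        }

        @Override
        public void serialize(T t) throws IOException {
            oos.writeObject(t);
        }

        @Override
        public void close() throws IOException {
            oos.close();
        }
    }

    /** Total bytes when all records share one ObjectOutputStream. */
    static int bytesWithSharedStream(int records) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            JavaSerializer<Long> serializer = new JavaSerializer<>();
            serializer.open(sink);
            for (int i = 0; i < records; i++) {
                serializer.serialize(Long.valueOf(1000L + i));
            }
            serializer.close();
            return sink.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Total bytes when every record gets a fresh ObjectOutputStream. */
    static int bytesWithStreamPerCall(int records) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            for (int i = 0; i < records; i++) {
                ObjectOutputStream oos = new ObjectOutputStream(sink);
                oos.writeObject(Long.valueOf(1000L + i));
                oos.flush(); // push buffered data into the sink before measuring
            }
            return sink.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("shared stream:   " + bytesWithSharedStream(1000) + " bytes");
        System.out.println("stream per call: " + bytesWithStreamPerCall(1000) + " bytes");
    }
}
```

The sketch measures size only; the per-call variant also pays for allocating and initializing
a stream each time. A long-lived ObjectOutputStream additionally remembers every object it has
written, so a real implementation would likely need to call reset() periodically.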

> Add support for a general serialization mechanism for Map Reduce
> ----------------------------------------------------------------
>
>                 Key: HADOOP-1986
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1986
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Tom White
>            Assignee: Tom White
>             Fix For: 0.16.0
>
>         Attachments: SerializableWritable.java, serializer-v1.patch
>
>
> Currently Map Reduce programs have to use WritableComparable-Writable key-value pairs.
> While it's possible to write Writable wrappers for other serialization frameworks (such as
> Thrift), this is not very convenient: it would be nicer to be able to use arbitrary types
> directly, without explicit wrapping and unwrapping.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

