hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3788) Add serialization for Protocol Buffers
Date Thu, 18 Sep 2008 14:44:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632229#action_12632229 ]

Tom White commented on HADOOP-3788:

bq. the OutputStream when serializing would need meta data included in it

I don't think we want to invent a new format here - this issue is to make serialization work
with existing formats, such as SequenceFile (or the new TFile, or HADOOP-4065).

As an experiment, I modified PBDeserializer to have a deserialize method that takes a length
(+in+ is now a CodedInputStream):

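  public T deserialize(T t, int length) throws IOException {
    t = (t == null) ? (T) newInstance() : t;
    int limit = in.pushLimit(length);  // read at most this record's bytes
    // The rest of this statement was cut off in the archive; the parse call and
    // popLimit below are an assumed reconstruction, not the original text.
    Message result = t.newBuilderForType().mergeFrom(in).build();
    in.popLimit(limit);  // restore the previous limit
    return (T) result;
  }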

I then modified TestPBSerializationIsolated to serialize two strings to the stream. With the
deserialize method that doesn't take a length, the test failed; when I passed the length,
the test succeeded.
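
To illustrate why the length matters, here is a minimal sketch that uses only the JDK as a stand-in for Protocol Buffers. It assumes two records written back to back with no delimiter, which is the situation a container format creates when it stores record lengths itself. The class BoundedReadDemo and its methods are hypothetical names, and the exact failure seen in TestPBSerializationIsolated is not shown in this message, so this only demonstrates the general problem of an unbounded reader consuming bytes that belong to the next record.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class BoundedReadDemo {

  // Stand-in for a decoder whose encoding is not self-delimiting:
  // with no length to stop it, it consumes everything up to end of stream.
  public static String readGreedy(InputStream in) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    byte[] chunk = new byte[64];
    int n;
    while ((n = in.read(chunk)) != -1) {
      buf.write(chunk, 0, n);
    }
    return new String(buf.toByteArray(), StandardCharsets.UTF_8);
  }

  // Reads exactly `length` bytes and leaves the stream positioned
  // at the start of the next record.
  public static String readBounded(InputStream in, int length) throws IOException {
    byte[] record = new byte[length];
    new DataInputStream(in).readFully(record);
    return new String(record, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    byte[] first = "hello".getBytes(StandardCharsets.UTF_8);
    byte[] second = "world".getBytes(StandardCharsets.UTF_8);

    // Two records written back to back, with no delimiter between them.
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(first);
    out.write(second);
    byte[] stream = out.toByteArray();

    // Without a length, the first read swallows both records.
    System.out.println(readGreedy(new ByteArrayInputStream(stream)));

    // With the lengths supplied by the caller, each record comes back intact.
    InputStream in = new ByteArrayInputStream(stream);
    System.out.println(readBounded(in, first.length));
    System.out.println(readBounded(in, second.length));
  }
}
```

In the real deserializer, the bounded read corresponds to the pushLimit call on the CodedInputStream rather than to readFully.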

So, I think we can do this without modifying Protocol Buffers. The change needed is the new
method on Deserializer (and Serializer?) that takes a length, and then changes in the framework
to call the new method when appropriate.
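
One possible shape for that change, as a rough sketch only. The nested Deserializer interface below is a local stand-in with the same three methods as Hadoop's Deserializer, so that the example compiles on its own. LengthAwareDeserializer, readRecord, StringDeserializer and readTwo are hypothetical names invented for illustration, and the proposal above does not say whether the new method would live on the existing interface or on a sub-interface as assumed here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class LengthAwareSketch {

  // Local stand-in mirroring the shape of Hadoop's Deserializer<T>.
  interface Deserializer<T> {
    void open(InputStream in) throws IOException;
    T deserialize(T t) throws IOException;
    void close() throws IOException;
  }

  // Hypothetical extension: the caller supplies the record length.
  interface LengthAwareDeserializer<T> extends Deserializer<T> {
    T deserialize(T t, int length) throws IOException;
  }

  // Framework-side dispatch: use the length-taking method when it is offered,
  // otherwise fall back to the existing one.
  public static <T> T readRecord(Deserializer<T> d, T reuse, int length)
      throws IOException {
    if (d instanceof LengthAwareDeserializer) {
      return ((LengthAwareDeserializer<T>) d).deserialize(reuse, length);
    }
    return d.deserialize(reuse);
  }

  // Toy implementation for raw UTF-8 strings, used only to exercise the dispatch.
  static class StringDeserializer implements LengthAwareDeserializer<String> {
    private DataInputStream in;

    public void open(InputStream in) {
      this.in = new DataInputStream(in);
    }

    public String deserialize(String t) throws IOException {
      // Raw strings carry no delimiter, so a length is required.
      throw new IOException("record length required");
    }

    public String deserialize(String t, int length) throws IOException {
      byte[] record = new byte[length];
      in.readFully(record);
      return new String(record, StandardCharsets.UTF_8);
    }

    public void close() throws IOException {
      in.close();
    }
  }

  // Reads two back-to-back records whose lengths the caller already knows,
  // as a container format that stores record lengths would.
  public static String[] readTwo(byte[] stream, int firstLen, int secondLen)
      throws IOException {
    StringDeserializer d = new StringDeserializer();
    d.open(new ByteArrayInputStream(stream));
    try {
      String first = readRecord(d, null, firstLen);
      String second = readRecord(d, null, secondLen);
      return new String[] { first, second };
    } finally {
      d.close();
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] stream = "helloworld".getBytes(StandardCharsets.UTF_8);
    for (String s : readTwo(stream, 5, 5)) {
      System.out.println(s);
    }
  }
}
```

Dispatching on a sub-interface keeps existing Deserializer implementations working unchanged; adding the method to Deserializer itself would instead require every implementation to be updated.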

> Add serialization for Protocol Buffers
> --------------------------------------
>                 Key: HADOOP-3788
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3788
>             Project: Hadoop Core
>          Issue Type: Wish
>          Components: examples, mapred
>    Affects Versions: 0.19.0
>            Reporter: Tom White
>            Assignee: Alex Loddengaard
>             Fix For: 0.19.0
>         Attachments: hadoop-3788-v1.patch, hadoop-3788-v2.patch, protobuf-java-2.0.1.jar
>
> Protocol Buffers (http://code.google.com/p/protobuf/) are a way of encoding data in a
> compact binary format. This issue is to write a ProtocolBuffersSerialization to support using
> Protocol Buffers types in MapReduce programs, including an example program. This should probably
> go into contrib.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
