hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3788) Add serialization for Protocol Buffers
Date Wed, 03 Sep 2008 16:13:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628035#action_12628035 ]

Tom White commented on HADOOP-3788:

Alex, Thanks for looking at this.

It shouldn't be necessary to create a new Writable implementation for each protoc-generated
class (if that's what you are suggesting). By writing a ProtocolBuffersSerialization it should
be possible to avoid having to use Writables at all.

I imagined that the implementation of ProtocolBuffersSerializer would create a CodedOutputStream
in the open method, then call Message#writeTo with the CodedOutputStream in the serialize
method. ProtocolBuffersDeserializer is a bit trickier. It would find the com.google.protobuf.Descriptors.Descriptor
for the message class being deserialized, then use DynamicMessage#parseFrom to construct a
message from the descriptor and the input stream.
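
A rough sketch of that open/serialize and open/deserialize lifecycle follows. It is
not the proposed implementation: so that it compiles without Hadoop or protobuf on the
classpath, every type in it is a local stand-in. Message, Parser, Serializer, Deserializer
and Greeting are hypothetical names that only mimic the shape of the real classes.
DataOutputStream takes the place of CodedOutputStream, and the Parser takes the place of
the Descriptor lookup plus DynamicMessage#parseFrom.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class PbSerializationSketch {

    // Stand-in for a protoc-generated message: it can write itself to a stream.
    interface Message {
        void writeTo(DataOutputStream out) throws IOException;
    }

    // Stand-in for the per-type knowledge a Descriptor would supply when parsing.
    interface Parser<T extends Message> {
        T parseFrom(DataInputStream in) throws IOException;
    }

    // Hypothetical message type with a single string field.
    static final class Greeting implements Message {
        static final Parser<Greeting> PARSER = in -> new Greeting(in.readUTF());
        final String text;

        Greeting(String text) {
            this.text = text;
        }

        @Override
        public void writeTo(DataOutputStream out) throws IOException {
            out.writeUTF(text); // length-prefixed, so consecutive records can be told apart
        }
    }

    // Wraps the raw stream once in open, then writes one message per serialize call.
    static final class Serializer<T extends Message> {
        private DataOutputStream out;

        void open(OutputStream raw) {
            out = new DataOutputStream(raw);
        }

        void serialize(T message) throws IOException {
            message.writeTo(out);
            out.flush();
        }

        void close() throws IOException {
            out.close();
        }
    }

    // Needs per-type information (here a Parser) to rebuild a message from bytes.
    static final class Deserializer<T extends Message> {
        private final Parser<T> parser;
        private DataInputStream in;

        Deserializer(Parser<T> parser) {
            this.parser = parser;
        }

        void open(InputStream raw) {
            in = new DataInputStream(raw);
        }

        T deserialize() throws IOException {
            return parser.parseFrom(in);
        }

        void close() throws IOException {
            in.close();
        }
    }

    // Writes two messages to an in-memory buffer and reads them back in order.
    static String[] roundTrip(String first, String second) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        Serializer<Greeting> serializer = new Serializer<>();
        serializer.open(buffer);
        serializer.serialize(new Greeting(first));
        serializer.serialize(new Greeting(second));
        serializer.close();

        Deserializer<Greeting> deserializer = new Deserializer<>(Greeting.PARSER);
        deserializer.open(new ByteArrayInputStream(buffer.toByteArray()));
        String[] result = {deserializer.deserialize().text, deserializer.deserialize().text};
        deserializer.close();
        return result;
    }

    public static void main(String[] args) throws IOException {
        String[] result = roundTrip("hello", "world");
        System.out.println(result[0] + " " + result[1]);
    }
}
```

The asymmetry is the point: the serializer only needs the message itself, while the
deserializer has to be told what type it is reading before it can build anything.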

To test this you could write some PB types to a Hadoop sequence file, then write a MapReduce
program to process it and write it out to another sequence file containing PB types. See HADOOP-3787.

> Add serialization for Protocol Buffers
> --------------------------------------
>                 Key: HADOOP-3788
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3788
>             Project: Hadoop Core
>          Issue Type: Wish
>          Components: examples, mapred
>            Reporter: Tom White
>            Assignee: Alex Loddengaard
> Protocol Buffers (http://code.google.com/p/protobuf/) are a way of encoding data in a
> compact binary format. This issue is to write a ProtocolBuffersSerialization to support using
> Protocol Buffers types in MapReduce programs, including an example program. This should probably
> go into contrib.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
