hadoop-common-dev mailing list archives

From "Chris Dyer (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3788) Add serialization for Protocol Buffers
Date Thu, 20 Nov 2008 22:16:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649503#action_12649503 ]

Chris Dyer commented on HADOOP-3788:

Other pieces of my system already use protocol buffers, so I'm stuck with them for the
pieces that have to interact with Hadoop.  I am currently using HadoopStreaming, but the hoops
I jump through are quite extensive: I serialize each protocol buffer to a byte array and then
encode it using base64 so that it can be put into Streaming's text-format key-value
pair encoding, where the key is separated from the value by a tab and the record is terminated
with a newline.  These extra layers aren't really a performance problem, since what I'm computing
is computationally quite expensive (I could serialize to XML and it would be just a drop in the
bucket in terms of running time).  But they do complicate my code in ways I think should be
unnecessary.  There is a paucity of information on how to use Streaming with non-text data, so I
haven't really been able to figure out if there's an easier way to do all of this.
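
For illustration, the workaround described above can be sketched roughly as follows. This is
a minimal sketch, not the code actually in use: the function names encode_record and
decode_record are hypothetical, and the payload is assumed to be the raw bytes of an
already-serialized protocol buffer (for example, the result of calling SerializeToString()
on a message in the Python protobuf library).

```python
import base64
from typing import Tuple


def encode_record(key: str, payload: bytes) -> str:
    """Build one Streaming text record: key, tab, base64(payload), newline.

    `payload` is assumed to be an already-serialized protocol buffer.
    """
    # The key is written as plain text, so it must not contain the separators.
    if "\t" in key or "\n" in key:
        raise ValueError("key must not contain a tab or a newline")
    # Standard base64 output contains neither tabs nor newlines, so the
    # binary payload cannot collide with the record framing.
    value = base64.b64encode(payload).decode("ascii")
    return f"{key}\t{value}\n"


def decode_record(line: str) -> Tuple[str, bytes]:
    """Reverse encode_record: split on the first tab and base64-decode the value."""
    key, value = line.rstrip("\n").split("\t", 1)
    return key, base64.b64decode(value)
```

A mapper or reducer would call decode_record on each line read from stdin, parse the
returned bytes back into a message, and write its output with encode_record. The base64
step is what makes arbitrary bytes safe for the tab- and newline-delimited format, at the
cost of roughly a one-third increase in record size.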

> Add serialization for Protocol Buffers
> --------------------------------------
>                 Key: HADOOP-3788
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3788
>             Project: Hadoop Core
>          Issue Type: Wish
>          Components: contrib/serialization, examples, mapred
>            Reporter: Tom White
>            Assignee: Alex Loddengaard
>             Fix For: 0.20.0
>         Attachments: hadoop-3788-v1.patch, hadoop-3788-v2.patch, hadoop-3788-v3.patch,
> protobuf-java-2.0.1.jar, protobuf-java-2.0.2.jar
>
> Protocol Buffers (http://code.google.com/p/protobuf/) are a way of encoding data in a
> compact binary format. This issue is to write a ProtocolBuffersSerialization to support using
> Protocol Buffers types in MapReduce programs, including an example program. This should probably
> go into contrib.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
