hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3787) Add serialization for Thrift
Date Wed, 03 Sep 2008 09:03:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627937#action_12627937 ]

Tom White commented on HADOOP-3787:
-----------------------------------

This issue, and HADOOP-1986 in general, does not mandate the use of SequenceFile. However, SequenceFiles
are a convenient binary format, so that's what I've used here for the example.

It would be possible to run MapReduce against Thrift records in flat files with a suitable
InputFormat (which would need to be written), but such files would not be splittable (unless
there is some general way to find Thrift record boundaries from an arbitrary position in the
file). Unsplittable files generally do not play well with MapReduce and HDFS. Perhaps one
way to fix this is to insert a special Thrift record every n records, with a unique byte sequence
that a reader can scan for to realign with the record boundaries. Could this work?
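
To make the marker idea concrete, here is a minimal sketch in Python. It is not Thrift's wire
format and not part of the attached patch: the length-prefixed record layout, the 16-byte
SYNC_MARKER value, the interval of 3, and the function names are all assumptions made for
illustration. A reader assigned the byte range [start, end) scans forward to the first marker
at or after start, then reads whole records until it reaches a marker at or beyond end, so
each record is read by exactly one split.

```python
import io
import struct

# Hypothetical 16-byte marker. A real format would need a value that is
# very unlikely to occur inside record data (for example, a random one
# chosen per file and stored in a header).
SYNC_MARKER = bytes.fromhex("a3f1c95e0b7d42688e14d6b0397cf5a1")
SYNC_INTERVAL = 3  # assumed value of "n": one marker per n records


def write_records(records, interval=SYNC_INTERVAL):
    """Write each record as [4-byte big-endian length][payload],
    preceded by a marker before every `interval`-th record."""
    out = io.BytesIO()
    for i, payload in enumerate(records):
        if i % interval == 0:
            out.write(SYNC_MARKER)
        out.write(struct.pack(">I", len(payload)))
        out.write(payload)
    return out.getvalue()


def read_split(data, start, end):
    """Return the records owned by the byte range [start, end).

    A split owns every group of records whose marker begins inside
    the range, even if the group's records run past `end`.
    """
    records = []
    # Realign: find the first marker at or after the arbitrary offset.
    pos = data.find(SYNC_MARKER, start)
    while pos != -1 and pos < end:
        pos += len(SYNC_MARKER)
        # Read records until the next marker or the end of the data.
        while pos < len(data) and not data.startswith(SYNC_MARKER, pos):
            (length,) = struct.unpack_from(">I", data, pos)
            pos += 4
            records.append(data[pos:pos + length])
            pos += length
        if pos >= len(data):
            break
    return records
```

With this layout, cutting the data at any byte offset and reading the two halves independently
yields the original records in order. The sketch ignores the case where the marker bytes happen
to appear inside a payload, which is the main thing a real design would have to rule out or make
sufficiently improbable.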

> Add serialization for Thrift
> ----------------------------
>
>                 Key: HADOOP-3787
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3787
>             Project: Hadoop Core
>          Issue Type: Wish
>          Components: examples, mapred
>            Reporter: Tom White
>         Attachments: hadoop-3787.patch, libthrift.jar
>
>
> Thrift (http://incubator.apache.org/thrift/) is a cross-language serialization and RPC
> framework. This issue is to write a ThriftSerialization to support using Thrift types in MapReduce
> programs, including an example program. This should probably go into contrib.
> (There is a prototype implementation in https://issues.apache.org/jira/secure/attachment/12370464/hadoop-serializer-v2.tar.gz)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

