hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4065) support for reading binary data from flat files
Date Tue, 16 Sep 2008 20:53:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631544#action_12631544 ]

Doug Cutting commented on HADOOP-4065:

Please don't edit descriptions.  It's very difficult to tell what's changed.  The description
should describe the problem.  The discussion below should present solutions.  Editing descriptions
and comments makes it very hard to follow an issue.  This is discussed in the "Jira Guidelines"
section of http://wiki.apache.org/hadoop/HowToContribute.

> support for reading binary data from flat files
> -----------------------------------------------
>                 Key: HADOOP-4065
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4065
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Joydeep Sen Sarma
>         Attachments: HADOOP-4065.0.txt, HADOOP-4065.1.txt, ThriftFlatFile.java
> Implement a generic FlatFileDeserializationRecordReader that assumes a Serialization
> implementation is specified in the JobConf and that, once instantiated, that Serialization
> implementation can figure out from the JobConf the actual class being deserialized.  For example,
> the JobConf specifies RecordIOSerialization, and the specific class is then LogRecordObject.
> Another way to do this is to use the SerializationFactory to look up the Serialization
> implementation; however, this requires all Serialization implementations to be known a priori
> and registered, which goes against the spirit of a very generic FlatFileDeserializeRecordReader
> (see below re: adding Serialization implementations to contrib).
> To ensure it is generic, I propose implementing the following Serialization implementations:
> 1. RecordIOSerialization
> 2. LineReaderSerialization
> 3. ThriftSerialization
> The first 2 should go in io/serialization and the Thrift one in contrib somewhere. 
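
For illustration only, the lookup described in the issue (name the implementation in the job
configuration, instantiate it reflectively, let it pull records off a flat file) might be
sketched as below.  This is not the attached patch and it does not use the Hadoop API: the
RecordDeserializer interface, the LineDeserializer class, the "flatfile.deserializer.class" key
and the plain Map standing in for JobConf are all hypothetical names chosen for the sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class FlatFileSketch {

    /** Hypothetical stand-in for a Serialization implementation's read side. */
    public interface RecordDeserializer<T> {
        /** Lets the implementation read its own settings, e.g. the record class. */
        void configure(Map<String, String> conf);

        /** Returns the next record, or null at end of stream. */
        T next(InputStream in) throws IOException;
    }

    /** Hypothetical line-oriented implementation (ASCII input only, for brevity). */
    public static class LineDeserializer implements RecordDeserializer<String> {
        public LineDeserializer() {}

        @Override
        public void configure(Map<String, String> conf) {
            // Nothing to configure: every record is a String.
        }

        @Override
        public String next(InputStream in) throws IOException {
            StringBuilder sb = new StringBuilder();
            int b;
            while ((b = in.read()) != -1) {
                if (b == '\n') {
                    return sb.toString();
                }
                sb.append((char) b);
            }
            // End of stream: return a trailing unterminated line, if any.
            return sb.length() > 0 ? sb.toString() : null;
        }
    }

    /** Hypothetical configuration key naming the implementation class. */
    public static final String IMPL_KEY = "flatfile.deserializer.class";

    /** Instantiates whatever implementation the configuration names; no registry needed. */
    public static RecordDeserializer<?> fromConf(Map<String, String> conf)
            throws ReflectiveOperationException {
        String name = conf.get(IMPL_KEY);
        if (name == null) {
            throw new IllegalArgumentException(IMPL_KEY + " is not set");
        }
        RecordDeserializer<?> d =
            (RecordDeserializer<?>) Class.forName(name).getDeclaredConstructor().newInstance();
        d.configure(conf);
        return d;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new HashMap<>();
        conf.put(IMPL_KEY, LineDeserializer.class.getName());

        RecordDeserializer<?> d = fromConf(conf);
        InputStream in = new ByteArrayInputStream("a\nb\n".getBytes(StandardCharsets.US_ASCII));
        Object record;
        while ((record = d.next(in)) != null) {
            System.out.println(record);
        }
    }
}
```

The point of the sketch is that the reader only depends on the interface and on one
configuration key, so a new implementation can be used without being registered anywhere in
advance.  A real version would presumably go through Hadoop's own Serialization and JobConf
classes instead of these stand-ins.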

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
