hadoop-common-dev mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-2603) SequenceFileAsBinaryInputFormat
Date Tue, 15 Jan 2008 00:24:34 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-2603:

    Attachment: 2603-1.patch

bq. Unless one knows the Writable serialization, one cannot use this, right ?

Pretty much. There are getKeyClassName and getValueClassName methods on SequenceFileAsBinaryRecordReader
that return the names of the key and value classes. If a reader doesn't have, or doesn't care
about, the classes associated with the bytes it is reading from a SequenceFile, then this
should permit it to read records without interpreting them through Writables or loading
the key/value classes. Sampling records is a good example.
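
To illustrate the idea, here is a minimal, self-contained sketch. It does not use the Hadoop API and does not reproduce the real SequenceFile layout; the container format, the RawRecordSketch class, and the com.example class names are all assumptions made up for this example. It only shows that a reader can report the key/value class names as strings and hand back raw bytes without those classes ever being on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy container, NOT the real SequenceFile layout:
//   header: keyClassName (UTF), valueClassName (UTF), recordCount (int)
//   record: keyLength (int), key bytes, valueLength (int), value bytes
class RawRecordSketch {
    final String keyClassName;
    final String valueClassName;
    final List<byte[][]> records; // each entry is {keyBytes, valueBytes}

    private RawRecordSketch(String k, String v, List<byte[][]> records) {
        this.keyClassName = k;
        this.valueClassName = v;
        this.records = records;
    }

    // Serialize already-encoded key/value pairs into the toy format.
    static byte[] write(String keyClassName, String valueClassName, byte[][]... records) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(keyClassName);
            out.writeUTF(valueClassName);
            out.writeInt(records.length);
            for (byte[][] kv : records) {
                out.writeInt(kv[0].length);
                out.write(kv[0]);
                out.writeInt(kv[1].length);
                out.write(kv[1]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Read class names and raw bytes; no Class.forName call is made anywhere.
    static RawRecordSketch read(byte[] file) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
            String k = in.readUTF();
            String v = in.readUTF();
            int n = in.readInt();
            List<byte[][]> recs = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                byte[] key = new byte[in.readInt()];
                in.readFully(key);
                byte[] value = new byte[in.readInt()];
                in.readFully(value);
                recs.add(new byte[][] {key, value});
            }
            return new RawRecordSketch(k, v, recs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] file = write("com.example.MissingKey", "com.example.MissingValue",
                new byte[][] {"k1".getBytes(StandardCharsets.UTF_8),
                              "value-1".getBytes(StandardCharsets.UTF_8)});
        RawRecordSketch r = read(file);
        System.out.println(r.keyClassName + " / " + r.valueClassName);
        for (byte[][] kv : r.records) {
            System.out.println(kv[0].length + " key bytes, " + kv[1].length + " value bytes");
        }
    }
}
```

A sampler built along these lines could pick records by size or position and copy the bytes through untouched, which is the kind of use the binary input format is meant to allow.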

This updated patch implements that behavior (with changes to SequenceFile.Reader that defer key/value
class loading until required) and adds the documentation missing from the previous patch.
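
As a rough sketch of what deferring class loading can look like (this is not the code in the patch, and LazyClassSketch is a made-up name for illustration), the holder below keeps only the class name it was given and calls Class.forName the first time the Class object is actually requested. Callers that only need the name never trigger a load, so a missing class is not an error for them.

```java
// Hypothetical illustration of deferred class loading; not taken from SequenceFile.Reader.
class LazyClassSketch {
    private final String keyClassName;
    private Class<?> keyClass; // stays null until getKeyClass() is first called

    LazyClassSketch(String keyClassName) {
        this.keyClassName = keyClassName;
    }

    // Cheap: returns the stored name and never touches the class loader.
    String getKeyClassName() {
        return keyClassName;
    }

    // Resolves the class on first use and caches the result.
    synchronized Class<?> getKeyClass() {
        if (keyClass == null) {
            try {
                keyClass = Class.forName(keyClassName);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException("Cannot load " + keyClassName, e);
            }
        }
        return keyClass;
    }

    public static void main(String[] args) {
        // The name is available even though this class does not exist.
        LazyClassSketch missing = new LazyClassSketch("com.example.NotOnClasspath");
        System.out.println(missing.getKeyClassName());

        // Loading happens only here, for a class that does exist.
        LazyClassSketch present = new LazyClassSketch("java.lang.String");
        System.out.println(present.getKeyClass());
    }
}
```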

> SequenceFileAsBinaryInputFormat
> -------------------------------
>                 Key: HADOOP-2603
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2603
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>             Fix For: 0.16.0
>         Attachments: 2603-0.patch, 2603-1.patch
> Add an InputFormat to read the raw bytes as keys, values from a SequenceFile

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.