hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1204) Re-factor InputFormat/RecordReader related classes
Date Tue, 10 Apr 2007 18:15:32 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-1204:

    Status: Patch Available  (was: Open)

A new patch fixing some comments and spurious spaces.

> Re-factor InputFormat/RecordReader related classes
> --------------------------------------------------
>                 Key: HADOOP-1204
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1204
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>         Assigned To: Runping Qi
>         Attachments: patch-1204.txt
> This Jira is the first small step toward unifying the inputformat/record reader code for streaming
> with the Hadoop main framework.
> It does a few things to clean up the related parts of the Hadoop main framework.
> 1. Add a constructor
>        public LineRecordReader(Configuration job, FileSplit split)
> to LineRecordReader. This gives the constructors of SequenceFileRecordReader and LineRecordReader
> the same signature, which makes it easier to write a factory class that creates the various record
> readers when we bring the record reader classes for Hadoop streaming into the main framework.
> 2. Implement the next() method using the following newly added protected method of LineRecordReader:
>      protected long readLine() throws IOException {
>          return LineRecordReader.readLine(in, buffer);
>      }
>     This lets a subclass easily override the readLine logic to use a different line
> terminator (e.g. treat '\r' as part of the data, not as a line break).
> 3. Rename class InputFormatBase to FileInputFormat to better reflect the functionality
> of the class.
> To stay backward compatible, keep the InputFormatBase class, but make it a deprecated
> shallow class that simply inherits from FileInputFormat.
> 4. Change TextInputFormat and SequenceFileInputFormat to extend FileInputFormat.
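
To illustrate why a shared constructor signature (item 1) helps a factory, here is a minimal sketch. It is not taken from the patch: SketchConf, SketchSplit, SketchReader and SketchReaderFactory are hypothetical stand-ins for Configuration, FileSplit, the record reader classes and the proposed factory, so that the example compiles without Hadoop on the classpath.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-ins for Configuration and FileSplit (not the Hadoop classes).
class SketchConf { }

class SketchSplit {
    final String path;
    SketchSplit(String path) { this.path = path; }
}

// Hypothetical common type for the readers the factory can build.
interface SketchReader {
    String describe();
}

// Both readers expose the same (SketchConf, SketchSplit) constructor.
class SketchLineRecordReader implements SketchReader {
    private final SketchSplit split;
    public SketchLineRecordReader(SketchConf job, SketchSplit split) { this.split = split; }
    public String describe() { return "line:" + split.path; }
}

class SketchSequenceFileRecordReader implements SketchReader {
    private final SketchSplit split;
    public SketchSequenceFileRecordReader(SketchConf job, SketchSplit split) { this.split = split; }
    public String describe() { return "seq:" + split.path; }
}

class SketchReaderFactory {
    // Because every reader has the same constructor signature, one reflective
    // lookup is enough to instantiate any of them.
    static SketchReader create(Class<? extends SketchReader> type,
                               SketchConf job, SketchSplit split) {
        try {
            Constructor<? extends SketchReader> ctor =
                type.getConstructor(SketchConf.class, SketchSplit.class);
            return ctor.newInstance(job, split);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "no usable (SketchConf, SketchSplit) constructor on " + type.getName(), e);
        }
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        SketchConf job = new SketchConf();
        SketchSplit split = new SketchSplit("part-0");
        System.out.println(
            SketchReaderFactory.create(SketchLineRecordReader.class, job, split).describe());
        System.out.println(
            SketchReaderFactory.create(SketchSequenceFileRecordReader.class, job, split).describe());
    }
}
```

A reader class without the shared constructor would be rejected by create() with an IllegalArgumentException, which is the failure the uniform signature is meant to avoid.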

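The protected readLine() hook from item 2 can be sketched as follows. This is an illustration under assumptions, not the patch itself: SketchLineReader is a hypothetical stand-in for LineRecordReader, its default line-splitting logic is deliberately simplified (it does not treat "\r\n" as a single terminator), and NewlineOnlyReader is an invented subclass that keeps '\r' as data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for LineRecordReader (not the Hadoop class).
class SketchLineReader {
    protected final InputStream in;
    protected final StringBuilder buffer = new StringBuilder();

    SketchLineReader(InputStream in) { this.in = in; }

    // Simplified default logic: '\n' or '\r' ends a line.
    // Returns the number of bytes consumed, 0 at end of stream.
    static long readLine(InputStream in, StringBuilder buffer) throws IOException {
        long consumed = 0;
        int b;
        while ((b = in.read()) != -1) {
            consumed++;
            if (b == '\n' || b == '\r') break;
            buffer.append((char) b);
        }
        return consumed;
    }

    // The overridable hook: by default it delegates to the static helper.
    protected long readLine() throws IOException {
        return SketchLineReader.readLine(in, buffer);
    }

    // next() depends only on the hook, so subclasses change line handling
    // without touching this method. Returns null at end of stream.
    String next() throws IOException {
        buffer.setLength(0);
        return readLine() == 0 ? null : buffer.toString();
    }
}

// Invented subclass: only '\n' ends a line, so '\r' stays in the data.
class NewlineOnlyReader extends SketchLineReader {
    NewlineOnlyReader(InputStream in) { super(in); }

    @Override
    protected long readLine() throws IOException {
        long consumed = 0;
        int b;
        while ((b = in.read()) != -1) {
            consumed++;
            if (b == '\n') break;
            buffer.append((char) b);
        }
        return consumed;
    }
}

class ReadLineHookDemo {
    static InputStream bytes(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.US_ASCII));
    }

    // Drains a reader into a list, wrapping IOException for convenience.
    static List<String> readAll(SketchLineReader reader) {
        List<String> lines = new ArrayList<>();
        try {
            for (String line = reader.next(); line != null; line = reader.next()) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        String data = "a\rb\nc\n";
        System.out.println(readAll(new SketchLineReader(bytes(data))).size());  // prints 3
        System.out.println(readAll(new NewlineOnlyReader(bytes(data))).size()); // prints 2
    }
}
```

With the same input, the default reader yields three records while the subclass yields two, because "a\rb" is kept as one record.
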
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
