hadoop-common-dev mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1204) Re-factor InputFormat/RecordReader related classes
Date Wed, 11 Apr 2007 07:47:32 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488015 ]

Hadoop QA commented on HADOOP-1204:


2 attempts failed to build and test the latest attachment http://issues.apache.org/jira/secure/attachment/12355268/patch-1204.txt
against trunk revision r527100.

Please note that this message is automatically generated and may represent a problem with
the automation system and not the patch.

Results are at http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/23/console

> Re-factor InputFormat/RecordReader related classes
> --------------------------------------------------
>                 Key: HADOOP-1204
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1204
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>         Assigned To: Runping Qi
>         Attachments: patch-1204.txt
> This Jira is the first small step to unify the code related to the inputformat/record readers for streaming
> with the Hadoop main framework.
> This Jira does a few things to clean up the related parts in the Hadoop main framework.
> 1. Add a constructor 
>        public LineRecordReader(Configuration job, FileSplit split)
> to LineRecordReader. This gives the constructors of both SequenceFileRecordReader and LineRecordReader
> the same signature, which makes it easier to have a factory class create various record readers when
> we bring the record reader classes for hadoop streaming into the main framework.
> 2. Implemented the next() method using the following newly added protected method of LineRecordReader
>      protected long readLine() throws IOException {
>          return LineRecordReader.readLine(in, buffer);
>      }
>     This allows the user to easily override the readLine logic to use a different line breaker (e.g. treat '\r' as part of the data, not as a line breaker).
> 3. Rename class InputFormatBase to FileInputFormat to better reflect the functionality of the class.
> To stay backward compatible, keep the InputFormatBase class, but make it a deprecated shallow class that simply inherits from FileInputFormat.
> 4. Change TextInputFormat and SequenceFileFormat to extend FileInputFormat.
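
The overridable readLine hook described in item 2 can be illustrated with a minimal, self-contained sketch. It does not use the Hadoop classes: SketchLineReader and NewlineOnlyReader are hypothetical stand-ins, the stream-based constructor is an assumption made to keep the example runnable without a FileSplit, and the default line-breaking rule shown here ('\n', '\r' or "\r\n") is assumed for illustration rather than taken from the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for LineRecordReader; not the Hadoop class.
class SketchLineReader {
    protected final InputStream in;
    protected final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    // Assumption: the stream must support mark/reset (ByteArrayInputStream does).
    SketchLineReader(InputStream in) {
        this.in = in;
    }

    // Assumed default logic: '\n', '\r' or "\r\n" ends a line.
    // Returns the number of bytes consumed, or 0 at end of input.
    static long readLine(InputStream in, ByteArrayOutputStream out) throws IOException {
        long consumed = 0;
        int b;
        while ((b = in.read()) != -1) {
            consumed++;
            if (b == '\n') {
                break;
            }
            if (b == '\r') {
                // Peek one byte so that "\r\n" counts as a single line breaker.
                in.mark(1);
                if (in.read() == '\n') {
                    consumed++;
                } else {
                    in.reset();
                }
                break;
            }
            out.write(b);
        }
        return consumed;
    }

    // Overridable hook with the same shape as the method quoted in item 2.
    protected long readLine() throws IOException {
        return SketchLineReader.readLine(in, buffer);
    }

    // Returns the next line, or null once the input is exhausted.
    String next() throws IOException {
        buffer.reset();
        if (readLine() == 0) {
            return null;
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}

// Subclass that keeps '\r' as data and breaks lines on '\n' only.
class NewlineOnlyReader extends SketchLineReader {
    NewlineOnlyReader(InputStream in) {
        super(in);
    }

    @Override
    protected long readLine() throws IOException {
        long consumed = 0;
        int b;
        while ((b = in.read()) != -1) {
            consumed++;
            if (b == '\n') {
                break;
            }
            buffer.write(b);
        }
        return consumed;
    }
}

class ReadLineSketchDemo {
    public static void main(String[] args) throws IOException {
        byte[] data = "a\rb\nc".getBytes(StandardCharsets.UTF_8);
        SketchLineReader reader = new NewlineOnlyReader(new ByteArrayInputStream(data));
        String line;
        while ((line = reader.next()) != null) {
            // Show '\r' visibly so the difference from the default is easy to see.
            System.out.println(line.replace("\r", "\\r"));
        }
    }
}
```

Because next() only calls the protected hook, a subclass changes the line-breaking rule without touching the iteration logic, which is the design point item 2 makes.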

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
