hadoop-common-issues mailing list archives

From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9168) The Naming and Inheritance for RecordReader, LineRecordReader, LineReader
Date Mon, 06 May 2013 07:16:17 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649556#comment-13649556 ]

Matt Foley commented on HADOOP-9168:

I would suggest we not engage in API renaming in Hadoop-1.x.
By all means proceed in Hadoop-2.x if desired by the community.

I'm open to other opinions on this, but that's mine.  
Consider it a "-0", not a "-1".
Removing 1.2.0 from the "fixVersion" list for now, but
if you object, please put it back and we'll continue the discussion.

> The Naming and Inheritance for RecordReader, LineRecordReader, LineReader 
> --------------------------------------------------------------------------
>                 Key: HADOOP-9168
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9168
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: util
>    Affects Versions: 0.21.0, 2.0.2-alpha, 0.23.5
>            Reporter: Gelesh
>            Priority: Minor
>              Labels: Hadoop, InputFormat
>             Fix For: site, hudson, 1.2.0, 0.23.2
>   Original Estimate: 96h
>  Remaining Estimate: 96h
> I feel LineReader is not the correct name, since it reads up to a given delimiter.
> How about Text Record Reader?
> That sounds correct, but LineReader is not a RecordReader by inheritance;
> by functionality, yes, it is a record reader.
> Now, looking at it from a different angle:
> in general,
> an InputFormat mostly has two responsibilities:
> 1) To read a split.
> 2) To generate key & value pairs based on what was read from the split.
> TextInputFormat
> has a RecordReader, which is extended by LineRecordReader,
> which in turn uses another class, LineReader.
> So we have:
> LineReader, which does the reading of the file.
> LineRecordReader, which generates the keys & values.
> I would suggest:
> RecordReader      to be renamed as     KeyValueGenerator,
> LineRecordReader  to be renamed as     TextInputKeyValueGenerator,
> LineReader        to be renamed as     delimitedTextReader,
> Generic attributes of LineReader (such as start, pos, end, buffer, bufferBytes, etc.)
> to be abstracted to a class called RecordReader,
> since they are all specific to reading the given input.
> The delimitedTextReader class could extend RecordReader.
> Now the names could make better sense. We must also look into compatibility.
> It might be unfit to deploy unless a new API is introduced.
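
For readers trying to picture the proposal in the description above, here is a rough Java sketch of the suggested split. It is illustrative only: none of these classes exist in Hadoop under these names, the signatures are assumptions, the delimiter is simplified to a single byte, and only `pos` from the listed attributes is modelled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical: the proposed "RecordReader", holding only generic reading state.
abstract class RecordReader {
    protected final InputStream in;
    protected long pos; // number of bytes consumed so far

    protected RecordReader(InputStream in) { this.in = in; }

    /** Returns the next raw record, or null once the input is exhausted. */
    abstract String readRecord() throws IOException;

    long getPos() { return pos; }
}

// Hypothetical: the proposed rename of LineReader (single-byte delimiter assumed).
class DelimitedTextReader extends RecordReader {
    private final int delimiter;

    DelimitedTextReader(InputStream in, byte delimiter) {
        super(in);
        this.delimiter = delimiter & 0xFF;
    }

    @Override
    String readRecord() throws IOException {
        ByteArrayOutputStream record = new ByteArrayOutputStream();
        boolean sawAny = false;
        int b;
        while ((b = in.read()) != -1) {
            pos++;
            sawAny = true;
            if (b == delimiter) break; // delimiter is consumed but not returned
            record.write(b);
        }
        return sawAny ? new String(record.toByteArray(), StandardCharsets.UTF_8) : null;
    }
}

// Hypothetical: the proposed rename of today's RecordReader.
interface KeyValueGenerator<K, V> {
    /** Returns the next key/value pair, or null when there are no more records. */
    Map.Entry<K, V> next() throws IOException;
}

// Hypothetical: the proposed rename of LineRecordReader (key = byte offset, value = text).
class TextInputKeyValueGenerator implements KeyValueGenerator<Long, String> {
    private final RecordReader reader;

    TextInputKeyValueGenerator(RecordReader reader) { this.reader = reader; }

    @Override
    public Map.Entry<Long, String> next() throws IOException {
        long offset = reader.getPos(); // offset at which this record starts
        String record = reader.readRecord();
        return record == null ? null : new SimpleEntry<>(offset, record);
    }
}

class RenamingSketch {
    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("a|bc|d".getBytes(StandardCharsets.UTF_8));
        KeyValueGenerator<Long, String> gen =
                new TextInputKeyValueGenerator(new DelimitedTextReader(in, (byte) '|'));
        for (Map.Entry<Long, String> kv = gen.next(); kv != null; kv = gen.next()) {
            System.out.println(kv.getKey() + " -> " + kv.getValue());
        }
    }
}
```

The sketch keeps the two responsibilities apart: the reader only knows about bytes and delimiters, while the generator only turns records into key/value pairs. It says nothing about how existing code that depends on the current RecordReader name would be migrated.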
