hadoop-common-dev mailing list archives

From Eric Baldeschwieler <eri...@yahoo-inc.com>
Subject Re: [jira] Commented: (HADOOP-433) Better access to the RecordReader
Date Wed, 09 Aug 2006 19:21:10 GMT
Why not provide a pointer to the real record reader?  Seems like a  
valid OO way to get access to all kinds of things.
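
A minimal sketch of what such a pointer might look like. All of the types here (SimpleReader, FileBackedReader, ProgressReader, getRealReader) are hypothetical stand-ins invented for illustration, not Hadoop's actual classes or API:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class UnwrapSketch {
    // Hypothetical stand-in for a record reader; NOT Hadoop's RecordReader.
    interface SimpleReader {
        String next(); // next record, or null when the input is exhausted
    }

    // Hypothetical "real" reader that knows which file it is reading.
    static class FileBackedReader implements SimpleReader {
        private final String fileName;
        private final Iterator<String> records;

        FileBackedReader(String fileName, List<String> records) {
            this.fileName = fileName;
            this.records = records.iterator();
        }

        String getFileName() { return fileName; }

        @Override
        public String next() { return records.hasNext() ? records.next() : null; }
    }

    // Hypothetical progress wrapper: counts records and exposes its delegate.
    static class ProgressReader implements SimpleReader {
        private final SimpleReader real;
        private int recordsRead = 0;

        ProgressReader(SimpleReader real) { this.real = real; }

        // The "pointer to the real record reader" suggested above.
        SimpleReader getRealReader() { return real; }

        int getRecordsRead() { return recordsRead; }

        @Override
        public String next() {
            String record = real.next();
            if (record != null) recordsRead++;
            return record;
        }
    }

    public static void main(String[] args) {
        ProgressReader wrapped = new ProgressReader(
            new FileBackedReader("part-00000", Arrays.asList("a", "b")));
        SimpleReader real = wrapped.getRealReader();
        // Code that needs file details can check the concrete type itself.
        if (real instanceof FileBackedReader) {
            System.out.println(((FileBackedReader) real).getFileName());
        }
    }
}
```

With an accessor like this, the caller would no longer need a thread local to reach the underlying reader, at the cost of a type check and cast.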

On Aug 8, 2006, at 3:48 PM, Owen O'Malley (JIRA) wrote:

>     [ http://issues.apache.org/jira/browse/HADOOP-433?page=comments#action_12426763 ]
>
> Owen O'Malley commented on HADOOP-433:
> --------------------------------------
>
> This is largely addressed by the extensions I put into the JobConf  
> task localization code. Look at MapTask.localizeConfiguration.
>
> In particular, each Mapper has available to it:
> map.input.file
> map.input.start
> map.input.length
>
> For application writers that don't want to read the Hadoop code,  
> I've put the list of attributes in:
> http://wiki.apache.org/lucene-hadoop/TaskExecutionEnvironment
>
> This will let you get an equivalent RecordReader even if it is not  
> the same object. Will that address your problem?
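
As a rough illustration of the attributes listed above, here is a sketch that reads the three properties from a plain Map. The Map is an assumed stand-in for the job configuration, and the MapInputInfo class, its fields, and the sample values are hypothetical; only the three property names come from the comment. In a real job the values would presumably come from the JobConf handed to the task, which is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

class MapInputInfo {
    // Property names as listed in the comment above.
    static final String FILE = "map.input.file";
    static final String START = "map.input.start";
    static final String LENGTH = "map.input.length";

    final String file;
    final long start;
    final long length;

    MapInputInfo(String file, long start, long length) {
        this.file = file;
        this.start = start;
        this.length = length;
    }

    // Fetch a property, failing loudly if the task did not set it.
    private static String require(Map<String, String> conf, String key) {
        String value = conf.get(key);
        if (value == null) {
            throw new IllegalStateException(key + " is not set");
        }
        return value;
    }

    // Build the split description from a configuration-like map
    // (assumed stand-in for the real job configuration).
    static MapInputInfo fromConf(Map<String, String> conf) {
        return new MapInputInfo(
            require(conf, FILE),
            Long.parseLong(require(conf, START)),
            Long.parseLong(require(conf, LENGTH)));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(FILE, "/data/logs/part-00000"); // sample value, made up
        conf.put(START, "0");
        conf.put(LENGTH, "1048576");

        MapInputInfo info = fromConf(conf);
        System.out.println(info.file + " [" + info.start + ", "
            + (info.start + info.length) + ")");
    }
}
```

With the file name, start offset, and length in hand, a Mapper could vary its processing by file or reopen the same byte range itself.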
>
>> Better access to the RecordReader
>> ---------------------------------
>>
>>                 Key: HADOOP-433
>>                 URL: http://issues.apache.org/jira/browse/HADOOP-433
>>             Project: Hadoop
>>          Issue Type: Improvement
>>          Components: mapred
>>    Affects Versions: 0.5.0
>>            Reporter: Benjamin Reed
>>            Priority: Minor
>>
>> The record reader has access to the FileSplit which can in turn  
>> have information that is useful to the Mapper. For example, Map  
>> processing may vary according to file name or attributes  
>> associated with a file. Unfortunately, even using a MapRunner you  
>> only have access to the progress wrapper of the RecordReader. To  
>> get access to the real record reader I had to use a thread local  
>> variable which I set in RecordReader.getNext(). It would be much  
>> nicer if you could get a reference to the real RecordReader from  
>> the RecordReader passed to MapRunner.
>
> -- 
> This message is automatically generated by JIRA.
> -
> If you think it was sent incorrectly contact one of the  
> administrators: http://issues.apache.org/jira/secure/Administrators.jspa
> -
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>

