hadoop-common-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4322) Input/Output Format for TFile
Date Fri, 09 Apr 2010 17:31:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855483#action_12855483 ]

Vinod K V commented on HADOOP-4322:

When this patch was tested internally, we found a problem: the job reports a negative "input
bytes" count when using this output format. I don't have any context on this myself; I am just
updating this issue with the analysis by Rahul.


ObjectFileRecordReader implements getPos(), but this method returns incorrect
values for the offset.

The code flow in the framework is as follows:
      beforePos = getPos();
      // call to the user's record reader next() method
      afterPos = getPos();

      // then, for the counter, we do the following:
      inputByteCounter.increment(afterPos - beforePos); // this is the counter in question

With ObjectFileRecordReader's getPos() method, afterPos < beforePos, which results in
a negative increment to the counter.
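
To make the failure mode concrete, here is a minimal, self-contained sketch of the counting
logic described above. It is not the framework code and not the patch: PositionedReader,
ScriptedReader and countInputBytes are hypothetical names invented for this illustration, and
the offsets are made-up values. It only shows that if getPos() returns decreasing values
across calls to next(), the summed increments come out negative.

```java
final class InputByteCounterSketch {

    /** Hypothetical stand-in for a record reader; NOT the Hadoop RecordReader interface. */
    interface PositionedReader {
        long getPos();
        boolean next();
    }

    /** Hypothetical reader whose getPos() replays a fixed script of offsets. */
    static final class ScriptedReader implements PositionedReader {
        private final long[] offsets;
        private int index = 0;

        ScriptedReader(long... offsets) {
            this.offsets = offsets;
        }

        @Override
        public long getPos() {
            return offsets[index];
        }

        @Override
        public boolean next() {
            // Advance to the next scripted offset; report end of input when none is left.
            if (index + 1 >= offsets.length) {
                return false;
            }
            index++;
            return true;
        }
    }

    /** Mirrors the described flow: sample getPos() before and after each next() call. */
    static long countInputBytes(PositionedReader reader) {
        long counter = 0;
        while (true) {
            long beforePos = reader.getPos();
            if (!reader.next()) {
                break;
            }
            long afterPos = reader.getPos();
            counter += afterPos - beforePos; // negative whenever afterPos < beforePos
        }
        return counter;
    }

    public static void main(String[] args) {
        // Offsets that only grow: the counter is positive.
        System.out.println(countInputBytes(new ScriptedReader(0, 100, 250)));
        // Offsets that shrink (the behaviour reported here): the counter goes negative.
        System.out.println(countInputBytes(new ScriptedReader(100, 60, 20)));
    }
}
```

In this sketch the first call yields 250 and the second yields -80, so the sign of the counter
depends entirely on whether getPos() is non-decreasing across next() calls.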

So this patch should not be committed as is; it needs another look first.

> Input/Output Format for TFile
> -----------------------------
>                 Key: HADOOP-4322
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4322
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Amir Youssefi
>            Assignee: Amir Youssefi
>         Attachments: ObjectFileInputOutputFormat_1.patch

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
