hadoop-hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-1348) Moving inputFileChanged() from ExecMapper to where it is needed
Date Wed, 02 Jun 2010 06:20:37 GMT

     [ https://issues.apache.org/jira/browse/HIVE-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1348:

    Attachment: HIVE-1348.4.patch

Since Yongqiang is tied up with other tasks, I'm uploading a new patch, HIVE-1348.4.patch, that
simplifies ExecMapperContext and the logic for checking whether the input file has changed.

It differs from the previous version in the following ways:

1) lastInputFile in ExecMapperContext is now modified only by resetRow(), which should
be called exactly once for each new row by the root of the operator tree, ExecMapper.map().
No other operator in the operator tree should change it.

2) Removed the variable inputFileChanged from ExecMapperContext and simplified the function
inputFileChanged() so that any operator in the operator tree can call it, and can call it
multiple times.

3) currentInputFile is updated only by inputFileChanged(). If that function is not
called, the variable does not need to be updated.
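
The three points above can be illustrated with a rough sketch. This is not the code
in HIVE-1348.4.patch: the class name ExecMapperContextSketch, the Map standing in for
JobConf, the key "map.input.file", the checkedThisRow flag and the lookups counter are
all assumptions made for illustration. Only lastInputFile, currentInputFile, resetRow()
and inputFileChanged() are names taken from the description.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for ExecMapperContext; NOT the code from the patch. */
class ExecMapperContextSketch {
    // Assumed configuration key; the real key used by Hive may differ.
    static final String INPUT_FILE_KEY = "map.input.file";

    private final Map<String, String> jobConf; // stand-in for JobConf
    private String lastInputFile;              // written only by resetRow()
    private String currentInputFile;           // written only by inputFileChanged()
    private boolean checkedThisRow;            // hypothetical per-row guard
    int lookups;                               // counts configuration lookups (demo only)

    ExecMapperContextSketch(Map<String, String> jobConf) {
        this.jobConf = jobConf;
    }

    /** Called once per new row by the root of the operator tree. */
    void resetRow() {
        lastInputFile = currentInputFile;
        checkedThisRow = false;
    }

    /** Safe to call from any operator, any number of times per row. */
    boolean inputFileChanged() {
        if (!checkedThisRow) {
            // The lookup is paid at most once per row, and only if someone asks.
            currentInputFile = jobConf.get(INPUT_FILE_KEY);
            lookups++;
            checkedThisRow = true;
        }
        return lastInputFile == null || !lastInputFile.equals(currentInputFile);
    }
}
```

In this sketch a row on which no operator calls inputFileChanged() performs no lookup at
all, and repeated calls within one row return the same answer because only resetRow()
moves lastInputFile forward.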

> Moving inputFileChanged() from ExecMapper to where it is needed
> ---------------------------------------------------------------
>                 Key: HIVE-1348
>                 URL: https://issues.apache.org/jira/browse/HIVE-1348
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Assignee: He Yongqiang
>         Attachments: hive-1348.1.patch, hive-1348.2.patch, hive-1348.3.patch, HIVE-1348.4.patch
> inputFileChanged() is only needed for bucketed sort merge map join. It should not be
> put in ExecMapper.map(), where all code paths will hit this function. The function is quite
> expensive, since a JobConf lookup is a hash table lookup.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
