hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-3706) getBoolVar in FileSinkOperator can be optimized
Date Thu, 15 Nov 2012 08:42:12 GMT

     [ https://issues.apache.org/jira/browse/HIVE-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3706:
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.10.0
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Committed. Thanks, Kevin.
                
> getBoolVar in FileSinkOperator can be optimized
> -----------------------------------------------
>
>                 Key: HIVE-3706
>                 URL: https://issues.apache.org/jira/browse/HIVE-3706
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.10.0
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>             Fix For: 0.10.0
>
>         Attachments: HIVE-3706.1.patch.txt
>
>
> There's a call to HiveConf.getBoolVar in FileSinkOperator's processOp method.  In benchmarks
> we found this call to be using ~2% of the CPU time on simple queries, e.g. INSERT OVERWRITE
> TABLE t1 SELECT * FROM t2;
> This boolean value, a flag to collect the RawDataSize stat, won't change during the processing
> of a query, so we can determine it once at initialization and store that value, saving that
> CPU time.
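
The optimization described in the issue can be illustrated with a minimal, self-contained sketch. This is not the committed patch and does not use Hive's classes: SketchConf, SketchSinkOperator, the configuration key, and the lookups counter are all hypothetical stand-ins, and the sketch assumes the processing method runs once per row. It only shows the general idea of reading a flag once at initialization instead of on every call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object (NOT Hive's HiveConf).
class SketchConf {
    private final Map<String, String> values = new HashMap<>();
    int lookups = 0; // counts getBool calls so the saving is observable

    void set(String key, String value) {
        values.put(key, value);
    }

    boolean getBool(String key, boolean defaultValue) {
        lookups++;
        String v = values.get(key);
        return v == null ? defaultValue : Boolean.parseBoolean(v);
    }
}

// Hypothetical operator (NOT Hive's FileSinkOperator).
class SketchSinkOperator {
    // Hypothetical key name, chosen only for this illustration.
    static final String COLLECT_RAW_DATA_SIZE = "sketch.stats.collect.rawdatasize";

    private boolean collectRawDataSize; // cached once, read on every row
    long rawDataSize = 0;

    // Resolve the flag a single time; it is assumed not to change mid-query.
    void initialize(SketchConf conf) {
        collectRawDataSize = conf.getBool(COLLECT_RAW_DATA_SIZE, true);
    }

    // Per-row path: a plain field read instead of a configuration lookup.
    void processRow(byte[] row) {
        if (collectRawDataSize) {
            rawDataSize += row.length;
        }
    }
}

class CachedFlagDemo {
    public static void main(String[] args) {
        SketchConf conf = new SketchConf();
        conf.set(SketchSinkOperator.COLLECT_RAW_DATA_SIZE, "true");

        SketchSinkOperator op = new SketchSinkOperator();
        op.initialize(conf);
        for (int i = 0; i < 3; i++) {
            op.processRow(new byte[4]);
        }
        System.out.println("config lookups: " + conf.lookups);
        System.out.println("raw data size: " + op.rawDataSize);
    }
}
```

The trade-off in this sketch is that a change to the flag after initialize has no effect on the operator, which matches the issue's premise that the value stays fixed for the duration of a query.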

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
