hive-dev mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4078) Remove the serialize-deserialize pair in CommonJoinResolver
Date Wed, 27 Feb 2013 02:15:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587902#comment-13587902 ]

Gopal V commented on HIVE-4078:
-------------------------------

Checked the plans generated by both builds (with and without the patch). Diff of the two outputs:

-Hive history file=/tmp/root/hive_job_log_root_201302262043_1955781116.txt
+Hive history file=/tmp/root/hive_job_log_root_201302262041_240390200.txt
 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in [jar:file:/root/ab/hive/build/dist/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
@@ -240,7 +240,7 @@
   Stage: Stage-5
     Map Reduce
       Alias -> Map Operator Tree:
-        hdfs://ip-10-115-29-66.ec2.internal:56565/tmp/hive-root/hive_2013-02-26_20-43-25_902_2504596014266294377/-mr-10005
+        hdfs://ip-10-115-29-66.ec2.internal:56565/tmp/hive-root/hive_2013-02-26_20-41-58_732_1810850078273935194/-mr-10005
             Reduce Output Operator
               key expressions:
                     expr: _col0
@@ -303,7 +303,7 @@
   Stage: Stage-6
     Map Reduce
       Alias -> Map Operator Tree:
-        hdfs://ip-10-115-29-66.ec2.internal:56565/tmp/hive-root/hive_2013-02-26_20-43-25_902_2504596014266294377/-mr-10006
+        hdfs://ip-10-115-29-66.ec2.internal:56565/tmp/hive-root/hive_2013-02-26_20-41-58_732_1810850078273935194/-mr-10006
             Reduce Output Operator
               key expressions:
                     expr: _col0
@@ -340,7 +340,7 @@
       limit: 100
 
 
-Time taken: 3.374 seconds, Fetched: 283 row(s)
+Time taken: 7.795 seconds, Fetched: 283 row(s)

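Seems good: in the hunks above, the two outputs differ only in the history file name, the scratch directory paths and the time taken.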
                
> Remove the serialize-deserialize pair in CommonJoinResolver
> -----------------------------------------------------------
>
>                 Key: HIVE-4078
>                 URL: https://issues.apache.org/jira/browse/HIVE-4078
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Gopal V
>            Assignee: Gopal V
>         Attachments: HIVE-4078.patch
>
>
> CommonJoinResolver tries to clone a MapredWork while attempting a conversion to a map-join:
> {code}
>   // deep copy a new mapred work from xml
>   InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8"));
>   MapredWork newWork = Utilities.deserializeMapRedWork(in, physicalContext.getConf());
> {code}
> which is a very heavy operation, both memory-wise and CPU-wise.
> Instead of cloning via XMLEncoder, it is faster to use BeanUtils.cloneBean(), which
> follows the same data paths (get/set bean methods).
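
To make the contrast in the description concrete, here is a minimal sketch using only the JDK. It is not the HIVE-4078 patch. Work is a hypothetical stand-in for a plan bean such as MapredWork, xmlClone() round-trips it through java.beans.XMLEncoder/XMLDecoder, and beanClone() is a hand-written approximation of the getter/setter copy that BeanUtils.cloneBean() performs (commons-beanutils itself is not used here, so the block stays self-contained).

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class CloneSketch {

    /** Hypothetical stand-in for a plan bean such as MapredWork. */
    public static class Work {
        private String name;
        private List<String> paths = new ArrayList<>();

        public Work() {}
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public List<String> getPaths() { return paths; }
        public void setPaths(List<String> paths) { this.paths = paths; }
    }

    /** Deep copy via an XML round trip: the whole bean is encoded, then parsed again. */
    public static Work xmlClone(Work src) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder enc = new XMLEncoder(out)) {
            enc.writeObject(src);
        }
        try (XMLDecoder dec = new XMLDecoder(new ByteArrayInputStream(out.toByteArray()))) {
            return (Work) dec.readObject();
        }
    }

    /**
     * Shallow copy through the bean's get/set methods, with no intermediate
     * XML document. Property values are copied by reference.
     */
    public static Work beanClone(Work src) {
        try {
            Work copy = new Work();
            BeanInfo info = Introspector.getBeanInfo(Work.class, Object.class);
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    pd.getWriteMethod().invoke(copy, pd.getReadMethod().invoke(src));
                }
            }
            return copy;
        } catch (ReflectiveOperationException | IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

One design point worth noting: the getter/setter copy is shallow, so in this sketch the copy shares its paths list with the original, whereas the XML round trip yields an independent list. Whether that sharing is acceptable depends on how the cloned plan is mutated afterwards.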

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
