hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/JoinOptimization" by LiyinTang
Date Tue, 30 Nov 2010 23:15:39 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/JoinOptimization" page has been changed by LiyinTang.


  In this case, the query processor will launch the original Common Join task as a Backup
Task, which is completely transparent to the user. The basic idea is shown in Fig 7.
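The fallback idea can be illustrated with a minimal sketch. This is not Hive's implementation: `LocalTaskFailed`, `run_with_backup`, and the two callables are hypothetical names, and the sketch simply assumes that the optimized path signals its failure by raising an exception.

```python
class LocalTaskFailed(Exception):
    """Hypothetical signal that the optimized (map join) path could not finish."""


def run_with_backup(map_join_task, common_join_task):
    """Try the optimized task first; fall back to the backup task on failure.

    Both arguments are zero-argument callables (an assumption of this sketch).
    The caller receives a result either way, so the fallback stays transparent.
    """
    try:
        return map_join_task(), "map_join"
    except LocalTaskFailed:
        # The original Common Join runs as the backup task.
        return common_join_task(), "common_join"
```

The second element of the returned tuple is only there so the sketch can show which path ran; a caller who ignores it sees no difference between the two paths.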
  == 2.4 Performance Evaluation ==
+ This section presents some performance comparison results. All of the benchmark queries
here can be converted into a Map Join.
+ '''Table 2: Comparison between the previous join and the new optimized join'''
+ ''' {{attachment:fig8.jpg}} '''
+ For the previous common join, the experiment measures only the average execution time of
the map reduce tasks, because the job finish time includes job scheduling overhead: a job
sometimes has to wait for a while before it starts running in the cluster. Likewise, for the
new optimized common join, the experiment adds the average local task execution time to
the average map reduce execution time. Both results therefore exclude the job scheduling
overhead.
+ The results show that when the new common join can be converted into a map join, it
achieves a 57% to 163% performance improvement.
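The measurement described above can be sketched as follows. The function names are hypothetical, and the improvement formula is an assumption: the page does not define it, but a figure above 100% suggests the saved time is expressed relative to the new time rather than the old one.

```python
from statistics import mean


def common_join_time(mapreduce_task_secs):
    """Previous common join: average map reduce task execution time only."""
    return mean(mapreduce_task_secs)


def optimized_join_time(local_task_secs, mapreduce_task_secs):
    """New optimized join: average local task time plus average map reduce time."""
    return mean(local_task_secs) + mean(mapreduce_task_secs)


def improvement_percent(old_secs, new_secs):
    """Assumed definition: time saved, relative to the new execution time."""
    return (old_secs - new_secs) / new_secs * 100.0
```

Neither timing function looks at job submission or finish timestamps, which is how scheduling overhead stays out of both numbers. Under the assumed formula, a query that drops from 200 seconds to 100 seconds counts as a 100% improvement.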
