pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "PigMultiQueryPerformanceSpecification" by RichardDing
Date Tue, 26 May 2009 23:50:44 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The following page has been changed by RichardDing:
http://wiki.apache.org/pig/PigMultiQueryPerformanceSpecification

------------------------------------------------------------------------------
   
  L12 is the only multi-query script in the suite, and we see a 70% performance improvement.
For the non-multi-query scripts, the performance is unchanged. 
   
+ We also ran several multi-query tests on a 40-node Hadoop cluster against a text file with
200 million rows (each row had 3 columns). All tests used 40 reducers. These tests compared
the performance of the trunk (released version) with the multiquery branch (latest version).
Again, there were significant performance gains when either all splittees had
combiners (Test Cases 1 and 2) or no splittee had a combiner (Test Cases 3 and 4). There were
also significant performance gains when some splittees were map-only (Test Case 5). The test
results (averaged over 3 runs) are listed below.
  
+ || Test Case ||Prior to Multiquery||Multiquery||Gain||Description||
+ || 1 ||311sec||172sec||44%||Merge 3 McR splittees into splitter and all splittees have combiners||
+ || 2 ||523sec||247sec||52%||Merge 5 McR splittees into splitter and all splittees have combiners||
+ || 3 ||549sec||264sec||51%||Merge 3 MR splittees into splitter and no splittee has combiner||
+ || 4 ||725sec||516sec||28%||Merge 5 MR splittees into splitter and no splittee has combiner||
+ || 5 ||203sec||114sec||43%||Merge 3 MR splittees into splitter and 2 splittees are map-only||
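
The Gain column appears to be the relative reduction in runtime, (before - after) / before, with the fraction dropped rather than rounded. That reading is an assumption, not something the page states, but it does reproduce every value in the table. A minimal sketch (the names RESULTS and gain_percent are made up for illustration):

```python
# Runtimes in seconds copied from the table: (test case, prior to multiquery, multiquery).
RESULTS = [
    (1, 311, 172),
    (2, 523, 247),
    (3, 549, 264),
    (4, 725, 516),
    (5, 203, 114),
]


def gain_percent(before_sec, after_sec):
    """Relative runtime reduction as a whole percent.

    Assumption: the table truncates instead of rounding, so integer
    floor division is used here.
    """
    return (before_sec - after_sec) * 100 // before_sec


for case, before, after in RESULTS:
    gain = gain_percent(before, after)
    print(f"Test case {case}: {before} sec -> {after} sec, gain {gain}%")
```

Rounding to the nearest percent instead would raise each figure by one point (for example, 45% for Test Case 1), so the truncated form is the one that matches the table.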
+ 
