hive-dev mailing list archives

From "Siying Dong (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-2146) Block Sampling should adjust number of reducers accordingly to make it useful
Date Tue, 03 May 2011 01:27:03 GMT
Block Sampling should adjust number of reducers accordingly to make it useful
-----------------------------------------------------------------------------

                 Key: HIVE-2146
                 URL: https://issues.apache.org/jira/browse/HIVE-2146
             Project: Hive
          Issue Type: Bug
            Reporter: Siying Dong


Currently, block sampling does not change the number of reducers, so queries like:
select c from tab tablesample(1 percent) group by c;
can launch a huge number of reducers even though the sampled input is small.
We need to reduce the number of reducers to make block sampling more useful.
Because the number of reducers is currently determined before the splits are computed, the fix
will probably not be entirely clean, but we can make a good estimate.
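
As a rough illustration of the kind of estimate meant here (a sketch only, not Hive's actual code), the reducer count could be derived from the input size scaled by the sampling percentage. The function name and the default values below are assumptions made for this example; the two tunables are modeled on the hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max settings.

```python
import math


def estimate_reducers(total_input_bytes, sample_percent,
                      bytes_per_reducer=1_000_000_000,  # assumed default
                      max_reducers=999):                # assumed default
    """Guess a reducer count for a block-sampled query (hypothetical helper).

    The unsampled input size is scaled by the sampling percentage before
    dividing by the per-reducer byte budget, so a 1 percent sample asks
    for roughly 1/100 of the reducers of the full scan.
    """
    if not 0 < sample_percent <= 100:
        raise ValueError("sample_percent must be in (0, 100]")
    # Expected bytes actually read after block sampling.
    sampled_bytes = total_input_bytes * sample_percent / 100.0
    wanted = math.ceil(sampled_bytes / bytes_per_reducer)
    # Always keep at least one reducer and respect the configured cap.
    return max(1, min(max_reducers, wanted))


full_scan = estimate_reducers(500 * 10**9, 100)  # no sampling
sampled = estimate_reducers(500 * 10**9, 1)      # tablesample(1 percent)
```

With these assumed settings, a 500 GB table would get 500 reducers unsampled but only 5 for the 1 percent sample. The estimate is only a guess because the real sampled size is not known until the splits are computed.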

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
