hive-dev mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-7527) Support order by and sort by on Spark
Date Sun, 27 Jul 2014 22:33:38 GMT
Xuefu Zhang created HIVE-7527:
---------------------------------

             Summary: Support order by and sort by on Spark
                 Key: HIVE-7527
                 URL: https://issues.apache.org/jira/browse/HIVE-7527
             Project: Hive
          Issue Type: Sub-task
          Components: Spark
            Reporter: Xuefu Zhang


Currently Hive depends completely on the sorting that MapReduce performs as part of shuffling
to achieve order by (global sort, one reducer) and sort by (local sort within each reducer).
Spark has a sortBy transformation in several variations that can be used to support Hive's
order by and sort by. However, we still need to evaluate whether Spark's sortBy can provide
the same functionality that Hive inherits from MapReduce's shuffle sort.
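
To make the two semantics concrete, here is a minimal sketch in plain Python. It does not
use Hive or Spark. The function names (hash_partition, sort_by, order_by) and the modulo
partitioning rule are assumptions made for illustration only; they are not taken from either
code base. Each inner list stands in for the output of one reducer.

```python
from typing import List


def hash_partition(rows: List[int], num_reducers: int) -> List[List[int]]:
    """Distribute rows across reducers (hypothetical rule: value modulo reducer count)."""
    parts: List[List[int]] = [[] for _ in range(num_reducers)]
    for row in rows:
        parts[row % num_reducers].append(row)
    return parts


def sort_by(rows: List[int], num_reducers: int) -> List[List[int]]:
    """Local sort: each reducer sorts only its own rows; no ordering across reducers."""
    return [sorted(part) for part in hash_partition(rows, num_reducers)]


def order_by(rows: List[int]) -> List[int]:
    """Global sort: all rows go to a single reducer, which sorts them."""
    return sorted(rows)


rows = [5, 3, 8, 1, 9, 2]
local = sort_by(rows, num_reducers=2)   # one sorted list per reducer
total = order_by(rows)                  # a single fully sorted list
```

With two reducers, sort_by yields two lists that are each sorted but that overlap in range,
while order_by yields one totally ordered list. This is the difference any Spark-based
implementation has to preserve.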

Hive on Spark should already be able to run a simple sort by or order by if the current
partitionBy is changed to sortBy. This is a way to verify the approach. A complete solution
will not be available until we have a complete SparkPlanGenerator.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
