hive-dev mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-7370) Initial ground work for Hive on Spark [Spark branch]
Date Wed, 09 Jul 2014 03:38:04 GMT

     [ https://issues.apache.org/jira/browse/HIVE-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-7370:
------------------------------

    Attachment: HIVE-7370.patch

> Initial ground work for Hive on Spark [Spark branch]
> ----------------------------------------------------
>
>                 Key: HIVE-7370
>                 URL: https://issues.apache.org/jira/browse/HIVE-7370
>             Project: Hive
>          Issue Type: Task
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-7370.patch
>
>
> Contribute proof-of-concept (PoC) code for Hive on Spark as the groundwork for subsequent tasks. Although the code contains hacks and is poorly organized, it will change, and more importantly it allows multiple people to work on different components concurrently.
> With this patch, simple queries such as "select col from tab where ..." and "select grp, avg(val) from tab where ... group by grp" can be executed on Spark.
> Contents of the patch:
> 1. code path for an additional execution engine.
> 2. essential classes such as SparkWork, SparkTask, SparkCompiler, HiveMapFunction, HiveReduceFunction, SparkClient, etc.
> 3. some code changes to existing classes.
> 4. build infrastructure.
> 5. utility classes.
> To try running Hive on Spark, for now you need to:
> 1. have a self-built Spark 1.0.0 with the patch attached.
> 2. invoke the Hive client with the environment variable MASTER, which points to the master URL of Spark.
> 3. set hive.execution.engine=spark
> 4. execute supported queries.
> NO PRECOMMIT TESTS. This is for spark branch only.
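
The trial steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names build_hive_invocation and run_on_spark are hypothetical, the master URL spark://localhost:7077 is a placeholder, and it assumes a patched `hive` client is on the PATH and that the query is passed with the Hive CLI's `-e` option. Only the MASTER variable and the hive.execution.engine=spark setting come from the description itself.

```python
import os
import subprocess


def build_hive_invocation(master_url, query, base_env=None):
    """Return (argv, env) for one Hive CLI run with the Spark engine selected.

    Hypothetical helper for illustration; it is not part of the patch.
    """
    # Copy the environment so the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # Step 2: the Hive client is invoked with MASTER pointing at the Spark master URL.
    env["MASTER"] = master_url
    # Steps 3 and 4: select the Spark engine, then run a supported query.
    statement = query.rstrip().rstrip(";") + ";"
    script = "set hive.execution.engine=spark; " + statement
    return ["hive", "-e", script], env


def run_on_spark(master_url, query):
    """Launch the Hive client; assumes a patched `hive` binary is on the PATH."""
    argv, env = build_hive_invocation(master_url, query)
    return subprocess.run(argv, env=env, check=True)


if __name__ == "__main__":
    # Placeholder master URL and query; only prints what would be executed.
    argv, env = build_hive_invocation("spark://localhost:7077", "select col from tab")
    print("MASTER=" + env["MASTER"])
    print(argv)
```

Separating command construction from execution lets the command line be inspected without a working Hive or Spark installation; run_on_spark is the only part that actually starts the client.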



--
This message was sent by Atlassian JIRA
(v6.2#6252)
