hive-issues mailing list archives

From "Ferdinand Xu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-14919) Improve the performance of Hive on Spark 2.0.0
Date Tue, 08 Nov 2016 02:48:58 GMT

     [ https://issues.apache.org/jira/browse/HIVE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-14919:
--------------------------------
    Description: 
In HIVE-14029, we updated the Spark dependency to 2.0.0. We used Intel BigBench[1] to
benchmark Spark 2.0 against Spark 1.6 on a 1 TB data set. We see a performance
improvement of about 5.4% in general and 45% in the best case. However, some queries do not
show significant performance improvements. This JIRA is the umbrella ticket for addressing those
performance issues.

[1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench

  was:
In HIVE-14029, we updated the Spark dependency to 2.0.0. We used Intel BigBench[1] to
benchmark Spark 2.0 against Spark 1.6 on a 10 GB data set. We see considerable
performance degradation for most of the BigBench queries. For detailed information, please
see the attached file. This JIRA is the umbrella ticket for addressing
those performance issues.

[1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench


> Improve the performance of Hive on Spark 2.0.0
> ----------------------------------------------
>
>                 Key: HIVE-14919
>                 URL: https://issues.apache.org/jira/browse/HIVE-14919
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ferdinand Xu
>            Assignee: Ferdinand Xu
>
> In HIVE-14029, we updated the Spark dependency to 2.0.0. We used Intel BigBench[1] to
benchmark Spark 2.0 against Spark 1.6 on a 1 TB data set. We see a performance
improvement of about 5.4% in general and 45% in the best case. However, some queries do not
show significant performance improvements. This JIRA is the umbrella ticket for addressing those
performance issues.
> [1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
