hive-issues mailing list archives

From "Ferdinand Xu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-14919) Improve the performance of Hive on Spark 2.0.0
Date Mon, 10 Oct 2016 05:56:20 GMT

     [ https://issues.apache.org/jira/browse/HIVE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-14919:
--------------------------------
    Attachment: benchmark.xlsx

> Improve the performance of Hive on Spark 2.0.0
> ----------------------------------------------
>
>                 Key: HIVE-14919
>                 URL: https://issues.apache.org/jira/browse/HIVE-14919
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ferdinand Xu
>            Assignee: Ferdinand Xu
>         Attachments: benchmark.xlsx
>
>
> In HIVE-14029, we updated the Spark dependency to 2.0.0. We ran Intel BigBench[1] on a
> 10 GB data set to compare against Spark 1.6 and saw performance degradation in all of the
> BigBench queries. For detailed information, please see the attached file. This JIRA is the
> umbrella ticket for addressing those performance issues.
> [1] https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
