hive-dev mailing list archives

From "Jie Li (Commented) (JIRA)" <>
Subject [jira] [Commented] (HIVE-600) Running TPC-H queries on Hive
Date Sun, 11 Dec 2011 18:30:40 GMT


Jie Li commented on HIVE-600:

Hi all, we ran the TPC-H benchmark on Pig as well and compared the results with Hive. Overall Hive is
very efficient, but we found that some of Hive's queries are suboptimal, especially in the order
of their joins; e.g. it's better to do small joins first. That's probably why some of Hive's queries
were either super slow or failed (e.g. Q9 failed in our comparison, and was extremely slow
in Hadapt's comparison).
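
To illustrate why join order matters, here is a minimal sketch of a toy cost model in Python. The table names, row counts, and selectivities are hypothetical and chosen only for illustration; they are not measured TPC-H figures and do not reflect how Hive plans joins.

```python
# Toy cost model: every number below is hypothetical, not measured TPC-H data.
ROWS = {"fact": 1_000_000, "dim": 10_000, "filtered": 10}

# Join selectivity expressed as a divisor: |L join R| = |L| * |R| // divisor.
DIVISOR = {
    frozenset(("fact", "dim")): 10_000,   # each fact row matches exactly one dim row
    frozenset(("dim", "filtered")): 100,  # dim references 100 keys, 10 remain in filtered
}

def join_sizes(order):
    """Return the result size after each join in a left-deep plan."""
    joined = [order[0]]
    size = ROWS[order[0]]
    sizes = []
    for table in order[1:]:
        size *= ROWS[table]
        # Apply every join predicate linking the new table to tables already joined.
        for other in joined:
            divisor = DIVISOR.get(frozenset((other, table)))
            if divisor is not None:
                size //= divisor
        joined.append(table)
        sizes.append(size)
    return sizes

large_first = join_sizes(["fact", "dim", "filtered"])
small_first = join_sizes(["dim", "filtered", "fact"])
print(large_first)  # [1000000, 100000]
print(small_first)  # [1000, 100000]
```

Both orders produce the same final result size, but under these assumed numbers the small-joins-first plan carries a 1,000-row intermediate result instead of a 1,000,000-row one.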

Our results are available at Hope they're
helpful to Hive as well.
> Running TPC-H queries on Hive
> -----------------------------
>                 Key: HIVE-600
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Yuntao Jia
>            Assignee: Yuntao Jia
>         Attachments: TPC-H_on_Hive_2009-08-11.pdf, TPC-H_on_Hive_2009-08-11.tar.gz, TPC-H_on_Hive_2009-08-14.tar.gz
> The goal is to run all TPC-H ( benchmark queries on Hive for
> two reasons. First, through those queries, we would like to find the new features that we
> need to put into Hive so that Hive supports common SQL queries. Second, we would like to measure
> the performance of Hive to find out what Hive is not good at. We can then improve Hive based
> on that information.
> For queries that are not currently supported in Hive, I will try to rewrite them as one or
> more Hive-supported queries.


