hadoop-hive-dev mailing list archives

From "Yuntao Jia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-600) Running TPC-H queries on Hive
Date Wed, 12 Aug 2009 02:28:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742185#action_12742185 ]

Yuntao Jia commented on HIVE-600:
---------------------------------

Regarding the first question: the number of reducers is set in Hive. Specifically,
hive-default.xml contains this property:

<property>
  <name>mapred.reduce.tasks</name>
  <value>-1</value>
  <description>The default number of reduce tasks per job.  Typically set
  to a prime close to the number of available hosts.  Ignored when
  mapred.job.tracker is "local". Hadoop set this to 1 by default, whereas
  hive uses -1 as its default value.
  By setting this property to -1, Hive will automatically figure out what
  should be the number of reducers.
  </description>
</property>
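
As a rough illustration of what "automatically figure out" can mean, here is a minimal
sketch of a size-based estimate. It is not taken from Hive's source: the function name and
both default values are assumptions made for this example, loosely modeled on the
hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max settings, and the real
logic may differ between Hive versions.

```python
def estimate_reducers(total_input_bytes,
                      configured=-1,
                      bytes_per_reducer=1_000_000_000,  # assumed value, for illustration only
                      max_reducers=999):                # assumed value, for illustration only
    """Hypothetical sketch: choose a reducer count for one job.

    A non-negative `configured` value is treated as an explicit setting and
    returned unchanged. A negative value (such as the -1 default quoted above)
    triggers an estimate based on the input size.
    """
    if configured >= 0:
        return configured
    # Ceiling division with integers only, so large byte counts stay exact.
    by_size = (total_input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # Always use at least one reducer and never exceed the cap.
    return max(1, min(max_reducers, by_size))


if __name__ == "__main__":
    # 2.5 GB of input with the assumed 1 GB-per-reducer setting.
    print(estimate_reducers(2_500_000_000))
    # An explicit setting bypasses the estimate.
    print(estimate_reducers(2_500_000_000, configured=7))
```

With -1 the count follows the input size of each job, which is why a single fixed number
does not appear in the configuration.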


As for the second question: the actual Hadoop configuration did use four paths. For security
reasons, however, we anonymized the configuration file and listed a single path instead.

Hope that answers your questions.


> Running TPC-H queries on Hive
> -----------------------------
>
>                 Key: HIVE-600
>                 URL: https://issues.apache.org/jira/browse/HIVE-600
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Yuntao Jia
>            Assignee: Yuntao Jia
>         Attachments: TPC-H_on_Hive_2009-08-11.pdf, TPC-H_on_Hive_2009-08-11.tar.gz
>
>
> The goal is to run all TPC-H (http://www.tpc.org/tpch/) benchmark queries on Hive for two
> reasons. First, those queries will help us find the new features Hive needs in order to
> support common SQL queries. Second, we would like to measure Hive's performance to find out
> what Hive is not good at. We can then improve Hive based on that information.
> For queries that Hive does not currently support, I will try to rewrite them as one or more
> Hive-supported queries.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

