hive-dev mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-9010) introduce Hive perf configuration utility
Date Wed, 03 Dec 2014 02:03:12 GMT

     [ https://issues.apache.org/jira/browse/HIVE-9010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-9010:
-----------------------------------
    Description: 
Recently, many major perf features have been added (or are being added) to Hive, such as vectorization,
CBO, Tez, Spark, etc.
These are off by default, so customers using the Apache distribution may not be aware of
them and may not take advantage of all the speed Hive can offer.

We can create a Hive perf configuration utility that sets 6-10 important, easy-to-set
settings. Admins or users can run it when deploying Hive or on an existing cluster.
Ideally all the no-brainer set-to-true settings would be there, with any caveats described;
some other settings may be included too, but we don't want to add any tuning options, because the whole
point is to keep it less confusing than editing the entire config file. Unless
we have automatic tuning at some point, users doing perf tuning can edit the config file
manually after reading the docs.
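
As a rough illustration of the utility described above, here is a minimal Python sketch. It is not part of this issue or of any patch: the function names `describe` and `apply_settings` and the `RECOMMENDED` list are hypothetical. The three property names are existing Hive settings, but which settings belong on the shortlist, their values, and the caveat wording are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative shortlist only: the property names are real Hive settings,
# but the selection, values, and caveat wording are assumptions for this sketch.
RECOMMENDED = [
    ("hive.vectorized.execution.enabled", "true",
     "only helps for file formats and operators that vectorization supports"),
    ("hive.cbo.enable", "true",
     "works best when table and column statistics are available"),
    ("hive.execution.engine", "tez",
     "requires Tez to be installed and configured on the cluster"),
]


def describe(settings=RECOMMENDED):
    """Return one human-readable line per setting, including its caveat."""
    return [f"{name}={value}  (caveat: {caveat})" for name, value, caveat in settings]


def apply_settings(site_xml, settings=RECOMMENDED):
    """Return hive-site.xml text with each setting added or overwritten."""
    root = ET.fromstring(site_xml)
    # Index existing <property> elements by their <name> text.
    existing = {p.findtext("name"): p for p in root.findall("property")}
    for name, value, _caveat in settings:
        prop = existing.get(name)
        if prop is None:
            # Setting not present yet: append a new <property> element.
            prop = ET.SubElement(root, "property")
            ET.SubElement(prop, "name").text = name
        value_el = prop.find("value")
        if value_el is None:
            value_el = ET.SubElement(prop, "value")
        value_el.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    before = (
        "<configuration>"
        "<property><name>hive.cbo.enable</name><value>false</value></property>"
        "</configuration>"
    )
    print("\n".join(describe()))
    print(apply_settings(before))
```

The sketch deliberately exposes no tuning knobs: it only flips a fixed list of settings and prints their caveats, which matches the intent of keeping the tool simpler than hand-editing the config file.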

Then we can mention it prominently in the docs and release notes. This should go a long way
towards making sure users can utilize Hive to its full potential, without us enabling large/perf
features by default, at least until they are stable (e.g. CBO can be enabled by default, so
this tool may note that).

Experimental feature settings (true/false or simple) can also be added in a separate section.

  was:
Recently, many major perf features have been added (or are being added) to Hive, such as vectorization,
CBO, Tez, Spark, etc.
These are off by default, so customers using the Apache distribution may not be aware of
them and may not take advantage of all the speed Hive can offer. That may also apply to other
config settings.

We can create a Hive perf configuration utility that sets 6-10 important, easy-to-set
settings. Admins or users can run it when deploying Hive or on an existing cluster.
Ideally all the no-brainer set-to-true settings would be there, with any caveats described;
some other settings may be included too, but we don't want to add any tuning options, because the whole
point is to keep it less confusing than editing the entire config file. Unless
we have automatic tuning at some point, users doing perf tuning can edit the config file
manually after reading the docs.

Then we can mention it prominently in the docs and release notes. This should go a long way
towards making sure users can utilize Hive to its full potential, without us enabling large/perf
features by default, at least until they are stable (e.g. CBO can be enabled by default, so
this tool may note that).

Experimental feature settings (true/false or simple) can also be added in a separate section.


> introduce Hive perf configuration utility
> -----------------------------------------
>
>                 Key: HIVE-9010
>                 URL: https://issues.apache.org/jira/browse/HIVE-9010
>             Project: Hive
>          Issue Type: Improvement
>          Components: Configuration
>            Reporter: Sergey Shelukhin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
