hive-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Disable Hive autogather optimization
Date Fri, 29 Apr 2016 16:16:19 GMT
Hi,

Is this what is detailed in the following Jira?
<https://issues.apache.org/jira/browse/HIVE-11160>

Description

Hive will collect table stats when set hive.stats.autogather=true during
the INSERT OVERWRITE command. And then the users need to collect the column
stats themselves using "Analyze" command. In this patch, the column stats
will also be collected automatically. More specifically, INSERT OVERWRITE
will automatically create new column stats. INSERT INTO will automatically
merge new column stats with existing ones.
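
For reference, the manual step the description mentions would look roughly like the sketch below. The table name sales is hypothetical; the first statement gathers basic table stats and the second gathers column stats.

```sql
-- Hypothetical table name, for illustration only.
-- Basic table-level stats (row count, file count, sizes):
ANALYZE TABLE sales COMPUTE STATISTICS;

-- Column-level stats, which the description says users otherwise
-- have to collect themselves:
ANALYZE TABLE sales COMPUTE STATISTICS FOR COLUMNS;
```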


OK, so the issue you are having is that when an INSERT OVERWRITE operation
runs against an existing table, column stats collection kicks in and adds to
the processing time?


It sounds like a general feature that can be disabled as part of the table
structure.
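
As a rough sketch of what could be tried at session level (an illustration under stated assumptions, not a confirmed fix), the settings below turn off stats gathering before the insert and then inspect the table. The table names sales and staging_sales are hypothetical, and hive.stats.column.autogather is assumed to be available only in releases that include the HIVE-11160 patch.

```sql
-- Turn off automatic gathering of basic table stats for this session.
SET hive.stats.autogather=false;

-- Column stats autogather is the setting associated with HIVE-11160.
-- Assumption: this property exists only in releases that include that patch;
-- an older release may reject it as unknown, so omit this line there.
SET hive.stats.column.autogather=false;

-- Hypothetical tables, for illustration only.
INSERT OVERWRITE TABLE sales SELECT * FROM staging_sales;

-- Inspect the table parameters to see whether stats were still written.
DESCRIBE FORMATTED sales;
```

If parameters such as numRows still change after the insert, the setting is probably not taking effect for that table.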





Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 29 April 2016 at 00:12, Udit Mehta <umehta@groupon.com> wrote:

> Any insights on this?
>
> On Tue, Apr 26, 2016 at 7:32 PM, Udit Mehta <umehta@groupon.com> wrote:
>
>> Update: Realized this works if we create a fresh table with this config
>> already disabled but does not work if there is already a table created when
>> this config was enabled. We now need to figure out how to disable this
>> config for a table created when this config was true.
>>
>> On Tue, Apr 26, 2016 at 6:16 PM, Udit Mehta <umehta@groupon.com> wrote:
>>
>>> Hive version we are using is 1.2.1.
>>>
>>> On Tue, Apr 26, 2016 at 6:01 PM, Udit Mehta <umehta@groupon.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We need to disable the Hive autogather stats optimization by disabling
>>>> "hive.stats.autogather", but for some reason the config change doesn't
>>>> seem to go through. We modified this config in hive-site.xml and
>>>> restarted the Hive metastore. We also made this change explicitly in the
>>>> job, but it doesn't seem to help.
>>>>
>>>> set hive.stats.autogather=false;
>>>>
>>>> Does anyone know the right way to disable this config, since we don't
>>>> want to compute stats in our jobs?
>>>>
>>>> Thanks,
>>>> Udit
>>>>
>>>
>>>
>>
>
