hive-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Optimize Hive Query
Date Thu, 23 Jun 2016 16:47:21 GMT
Could you also run

desc formatted tuning_dd_key

and send the output, please?

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 23 June 2016 at 17:41, @Sanjiv Singh <sanjiv.is.on@gmail.com> wrote:

> Hi Gopal,
>
> I am using Tez as execution engine.
>
> DAG :
>
> Explain
>
> Plan not optimized by CBO.
>
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
>
> Stage-0
>    Fetch Operator
>       limit:-1
>       Stage-1
>          Reducer 2
>          File Output Operator [FS_55596]
>             compressed:false
>             Statistics:Num rows: 6357592675 Data size: 54076899328 Basic stats: COMPLETE Column stats: NONE
>             table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
>             Select Operator [SEL_55594]
>                outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>                Statistics:Num rows: 6357592675 Data size: 54076899328 Basic stats: COMPLETE Column stats: NONE
>                PTF Operator [PTF_55593]
>                   Function definitions:[{"Input definition":{"type:":"WINDOWING"}},{"partition by:":"_col0, _col1","name:":"windowingtablefunction","order by:":"_col2"}]
>                   Statistics:Num rows: 6357592675 Data size: 54076899328 Basic stats: COMPLETE Column stats: NONE
>                   Select Operator [SEL_55592]
>                   |  outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
>                   |  Statistics:Num rows: 6357592675 Data size: 54076899328 Basic stats: COMPLETE Column stats: NONE
>                   |<-Map 1 [SIMPLE_EDGE] vectorized
>                      Reduce Output Operator [RS_55597]
>                         key expressions:m_d_key (type: smallint), sb_gu_key (type: bigint), t_ev_st_dt (type: date)
>                         Map-reduce partition columns:m_d_key (type: smallint), sb_gu_key (type: bigint)
>                         sort order:+++
>                         Statistics:Num rows: 6357592675 Data size: 54076899328 Basic stats: COMPLETE Column stats: NONE
>                         value expressions:ad_zn_key (type: int), c_dt (type: date), e_p_dt (type: date), sq_nbr (type: int)
>                         TableScan [TS_55590]
>                            ACID table:true
>                            alias:tuning_dd_key
>                            Statistics:Num rows: 6357592675 Data size: 54076899328 Basic stats: COMPLETE Column stats: NONE
>
> Regards
> Sanjiv Singh
> Mob :  +091 9990-447-339
>
> On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <gopalv@apache.org>
> wrote:
>
>>
>> > Long running query :
>>
>> Are you running this on MapReduce or Tez?
>>
>> Please post the output of explain - if you are seeing > 1 shuffle edge in
>> your query while having only one window for OVER(), that might be the
>> reason.
>>
>> OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)
>>
>>
>> The multiple PTF operators should have been collapsed by the reduce
>> sink-deduplication.
>>
>> Cheers,
>> Gopal
>>
>>
>>
>
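
The shuffle-edge check Gopal describes can also be done mechanically on saved explain text. The following is only a rough sketch in Python, not part of Hive or Tez: the function name shuffle_edges is hypothetical, and it assumes that each "Vertex dependency" line has the form "Reducer N <- Map M (SIMPLE_EDGE)" and that SIMPLE_EDGE marks a shuffle edge, as in the plan quoted above.

```python
import re

# One vertex-dependency line, e.g. "Reducer 2 <- Map 1 (SIMPLE_EDGE)".
DEP_LINE = re.compile(r"^((?:Map|Reducer) \d+)\s*<-\s*(.+)$")
# One upstream vertex with its edge type, e.g. "Map 1 (SIMPLE_EDGE)".
SOURCE = re.compile(r"((?:Map|Reducer) \d+)\s*\((\w+)\)")


def shuffle_edges(explain_text):
    """Return (source, target) pairs for every SIMPLE_EDGE dependency line.

    Assumption: SIMPLE_EDGE denotes a shuffle edge; other edge types
    (for example BROADCAST_EDGE) are ignored.
    """
    edges = []
    for raw in explain_text.splitlines():
        # Drop mail quoting (">") and beeline table borders ("|").
        line = raw.strip("|> \t")
        dep = DEP_LINE.match(line)
        if not dep:
            continue
        target, sources = dep.groups()
        for source, edge_type in SOURCE.findall(sources):
            if edge_type == "SIMPLE_EDGE":
                edges.append((source, target))
    return edges


if __name__ == "__main__":
    plan = "> | Vertex dependency in root stage\n> | Reducer 2 <- Map 1 (SIMPLE_EDGE)\n"
    found = shuffle_edges(plan)
    print(len(found), "shuffle edge(s):", found)
```

For the plan in this thread the sketch would report a single edge, Map 1 to Reducer 2, which matches the one window in the OVER() clause.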
