hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "@Sanjiv Singh" <sanjiv.is...@gmail.com>
Subject Re: Optimize Hive Query
Date Thu, 23 Jun 2016 16:41:48 GMT
Hi Gopal,

I am using Tez as execution engine.

DAG :

+--------------------------------------------------------+--+
|
                                Explain
                                                                      |
+---------------------------------------------------------+--+
| Plan not optimized by CBO.
      |
|
                         |
| Vertex dependency in root stage
                          |
| Reducer 2 <- Map 1 (SIMPLE_EDGE)
                                                        |
|
                               |
| Stage-0
                            |
|    Fetch Operator
                                                            |
|       limit:-1
                                |
|       Stage-1
                              |
|          Reducer 2
                                        |
|          File Output Operator [FS_55596]
                                                      |
|             compressed:false
                                                           |
|             Statistics:Num rows: 6357592675 Data size: 54076899328 Basic
stats: COMPLETE Column stats: NONE          |
|
table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","input
format:":"org.apache.hadoop.mapred.TextInputFormat","output
format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}  |
|             Select Operator [SEL_55594]
              |
|
 outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
                                   |
|                Statistics:Num rows: 6357592675 Data size: 54076899328
Basic stats: COMPLETE Column stats: NONE                       |
|                PTF Operator [PTF_55593]
                                 |
|                   Function definitions:[{"Input
definition":{"type:":"WINDOWING"}},{"partition by:":"_col0,
_col1","name:":"windowingtablefunction","order by:":"_col2"}]      |
|                   Statistics:Num rows: 6357592675 Data size: 54076899328
Basic stats: COMPLETE Column stats: NONE           |
|                   Select Operator [SEL_55592]
                                        |
|                   |
 outputColumnNames:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
                                                                  |
|                   |  Statistics:Num rows: 6357592675 Data size:
54076899328 Basic stats: COMPLETE Column stats: NONE                    |
|                   |<-Map 1 [SIMPLE_EDGE] vectorized
                               |
|                      Reduce Output Operator [RS_55597]
                             |
|                         key expressions:m_d_key (type: smallint),
sb_gu_key (type: bigint), t_ev_st_dt (type: date)         |
|                         Map-reduce partition columns:m_d_key (type:
smallint), sb_gu_key (type: bigint)            |
|                         sort order:+++
     |
|                         Statistics:Num rows: 6357592675 Data size:
54076899328 Basic stats: COMPLETE Column stats: NONE
 |
|                         value expressions:ad_zn_key (type: int), c_dt
(type: date), e_p_dt (type: date), sq_nbr (type: int)           |
|                         TableScan [TS_55590]
                                                          |
|                            ACID table:true
                                         |
|                            alias:tuning_dd_key
                      |
|                            Statistics:Num rows: 6357592675 Data size:
54076899328 Basic stats: COMPLETE Column stats: NONE                      |
|

              |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+

Regards
Sanjiv Singh
Mob :  +091 9990-447-339

On Thu, Jun 23, 2016 at 2:45 AM, Gopal Vijayaraghavan <gopalv@apache.org>
wrote:

>
> > Long running query :
>
> Are you running this on MapReduce or Tez?
>
> Please post the output of explain - if you are seeing > 1 shuffle edge in
> your query while having only one window for OVER(), that might be the
> reason.
>
> OVER ( PARTITION BY  m_d_key , sb_gu_key  ORDER BY  t_ev_st_dt)
>
>
> The multiple PTF operators should have been collapsed by the reduce
> sink-deduplication.
>
> Cheers,
> Gopal
>
>
>

Mime
View raw message