hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Markovitz, Dudu" <dmarkov...@paypal.com>
Subject RE: Optimized Hive query
Date Tue, 14 Jun 2016 06:52:13 GMT
Hi

You don’t need to do anything, the optimizer does it for you.
You can see that you get identical execution plans for the nested query and the flatten one.

Dudu


> create multiset table t (i int);

> explain select * from t;
+-------------------------------------------------------------------------------------------+--+
|                                          Explain                                       
  |
+-------------------------------------------------------------------------------------------+--+
| STAGE DEPENDENCIES:                                                                    
  |
|   Stage-0 is a root stage                                                              
  |
|                                                                                        
  |
| STAGE PLANS:                                                                           
  |
|   Stage: Stage-0                                                                       
  |
|     Fetch Operator                                                                     
  |
|       limit: -1                                                                        
  |
|       Processor Tree:                                                                  
  |
|         TableScan                                                                      
  |
|           alias: t                                                                     
  |
|           Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE 
  |
|           Select Operator                                                              
  |
|             expressions: i (type: int)                                                 
  |
|             outputColumnNames: _col0                                                   
  |
|             Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
 |
|             ListSink                                                                   
  |
|                                                                                        
  |
+-------------------------------------------------------------------------------------------+--+

> explain select * from (select * from (select * from (select * from (select * from t)
t) t) t) t;
+-------------------------------------------------------------------------------------------+--+
|                                          Explain                                       
  |
+-------------------------------------------------------------------------------------------+--+
| STAGE DEPENDENCIES:                                                                    
  |
|   Stage-0 is a root stage                                                              
  |
|                                                                                        
  |
| STAGE PLANS:                                                                           
  |
|   Stage: Stage-0                                                                       
  |
|     Fetch Operator                                                                     
  |
|       limit: -1                                                                        
  |
|       Processor Tree:                                                                  
  |
|         TableScan                                                                      
  |
|           alias: t                                                                     
  |
|           Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE 
  |
|           Select Operator                                                              
  |
|             expressions: i (type: int)                                                 
  |
|             outputColumnNames: _col0                                                   
  |
|             Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
 |
|             ListSink                                                                   
  |
|                                                                                        
  |
+-------------------------------------------------------------------------------------------+--+

From: Aviral Agarwal [mailto:aviral12028@gmail.com]
Sent: Monday, June 13, 2016 7:55 PM
To: user@hive.apache.org
Subject: Optimized Hive query

Hi,
I would like to know if there is a way to convert nested hive sub-queries into optimized queries.

For example :
INSERT INTO TABLE a.b SELECT * FROM ( SELECT c FROM d)

into

INSERT INTO TABLE a.b SELECT c FROM D

This is a simple example but the solution should apply is there were deeper nesting levels
present.

Thanks,
Aviral Agarwal

Mime
View raw message