hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Prasanth Jayachandran <pjayachand...@hortonworks.com>
Subject Re: How to measure the execution time of query on Hive on Tez
Date Thu, 12 Oct 2017 22:00:45 GMT
Hi

There are couple of tables that you are seeing, the top one is Query Execution Summary and
next one is Task Execution Summary.

At the end of your query execution, you should see something like "Time taken: xx.xx seconds”
in hive CLI or beeline. This represents the overall time the query took from start (start
of semantic analysis) to finish (end of fetching results).

This overall time is broken down in the Query Execution Summary table. When you submit the
query, query goes through multiple phases before returning the results. 
The time for each of these phases is defined in Query Execution Summary tables. 

If you want the query runtime then you should look at the last line in hive CLI or beeline
which represents the end-to-end time.
If you are just interested in how long the query actually executed in the cluster then you
should look at Run DAG time.

Run DAG is NOT sum of duration of all vertices as vertices can be scheduled in parallel. 
Run DAG is the time that Hive computes when it knows that DAG has been accepted and started
by Application Master until the completion of DAG (could be failed, errored or succeeded).

Hope this helps.

Thanks
Prasanth

> On Oct 12, 2017, at 2:46 PM, Zhang, Liyun <liyun.zhang@intel.com> wrote:
> 
> Hi all:
>   Maybe in last mail the attached picture is not shown.
> I re-described my question here.  I saw following statistics about the runtime when running
query.
> 
> The Run DAG is 318s.  But it is not the sum of DURATION of all VERTICES((59549+4069+3055+3055+1004+1006+132736+34248+11077+1003+439+140896+35260+8070)/1000=435s
> Not the sum of CPU_TIME.   There are several indicator "RUN DAG","DURATION","CPU_TIME",
 which indicator I should use when measure the performance? Sometimes I found there is significant
improvement in sum of (CPU_TIME) while there is no significant improvement in "RUN DAG". 
Is this normal?  Appreciate to get some feedback from you!
> 
> 
> 
> 2017-10-12T16:29:39,262  INFO [main] SessionState: ----------------------------------------------------------------------------------------------
> 2017-10-12T16:29:39,262  INFO [main] SessionState: OPERATION                        
   DURATION
> 2017-10-12T16:29:39,262  INFO [main] SessionState: ----------------------------------------------------------------------------------------------
> 2017-10-12T16:29:39,263  INFO [main] SessionState: Compile Query                    
      3.72s
> 2017-10-12T16:29:39,263  INFO [main] SessionState: Prepare Plan                     
      0.60s
> 2017-10-12T16:29:39,263  INFO [main] SessionState: Submit Plan                      
      0.61s
> 2017-10-12T16:29:39,263  INFO [main] SessionState: Start DAG                        
      0.52s
> 2017-10-12T16:29:39,263  INFO [main] SessionState: Run DAG                          
    318.54s
> 2017-10-12T16:29:39,263  INFO [main] SessionState: ----------------------------------------------------------------------------------------------
> 2017-10-12T16:29:39,263  INFO [main] SessionState:
> 2017-10-12T16:29:39,289  INFO [cea2258c-aa47-46a1-af5b-39860a6edbb3 main] counters.Limits:
Counter limits initialized with parameters:  GROUP_NAME_MAX=256, MAX_GROUPS=500, COUNTER_NAME_MAX=64,
MAX_COUNTERS=2000
> 2017-10-12T16:29:39,294  INFO [main] SessionState: Task Execution Summary
> 2017-10-12T16:29:39,294  INFO [main] SessionState: ----------------------------------------------------------------------------------------------
> 2017-10-12T16:29:39,294  INFO [main] SessionState:   VERTICES      DURATION(ms)   CPU_TIME(ms)
   GC_TIME(ms)   INPUT_RECORDS   OUTPUT_RECORDS
> 2017-10-12T16:29:39,294  INFO [main] SessionState: ----------------------------------------------------------------------------------------------
> 2017-10-12T16:29:39,298  INFO [main] SessionState:      Map 1          59549.00     
1,355,520         28,565     550,076,554    1,602,119,842
> 2017-10-12T16:29:39,300  INFO [main] SessionState:     Map 12           4069.00     
   15,670            522          73,049              732
> 2017-10-12T16:29:39,300  INFO [main] SessionState:     Map 13           3055.00     
   14,030            567             212              212
> 2017-10-12T16:29:39,301  INFO [main] SessionState:     Map 14           3055.00     
   13,820            606             212              212
> 2017-10-12T16:29:39,303  INFO [main] SessionState: Reducer 10           1004.00     
   13,450            265               4                4
> 2017-10-12T16:29:39,305  INFO [main] SessionState: Reducer 11           1006.00     
    4,290             71             216              212
> 2017-10-12T16:29:39,307  INFO [main] SessionState:  Reducer 2         132736.00     
2,362,160         83,029     537,120,745      107,740,258
> 2017-10-12T16:29:39,308  INFO [main] SessionState:  Reducer 3          34248.00     
  643,350         20,661     107,740,470              203
> 2017-10-12T16:29:39,310  INFO [main] SessionState:  Reducer 4          11077.00     
   77,020          1,496             203               31
> 2017-10-12T16:29:39,311  INFO [main] SessionState:  Reducer 5           1003.00     
   40,030            824              10               10
> 2017-10-12T16:29:39,312  INFO [main] SessionState:  Reducer 6            439.00     
      590              0              10                0
> 2017-10-12T16:29:39,314  INFO [main] SessionState:  Reducer 7         140896.00     
1,925,760         52,784     537,120,745      107,740,258
> 2017-10-12T16:29:39,316  INFO [main] SessionState:  Reducer 8          35260.00     
  590,200         22,331     107,740,470               76
> 2017-10-12T16:29:39,318  INFO [main] SessionState:  Reducer 9           8070.00     
   24,630            249              76                4
> 2017-10-12T16:29:39,318  INFO [main] SessionState: -------------------------------------------------------------------
> 
> From: Zhang, Liyun [mailto:liyun.zhang@intel.com]
> Sent: Thursday, October 12, 2017 4:40 PM
> To: dev@hive.apache.org
> Subject: How to measure the execution time of query on Hive on Tez
> 
> Hi  all:
>  Anyone knows how to view the detail execution time of every map/reduce task in hive
on tez?
> I screenshot the result:
> Run DAG is  324.s . But this is not the sum of DURATION time of every tasks( 665=163+22+1+1+2+3+143+31+4+0+254+29+8+2+1+1).
 So which parameter DURATION(ms) or CPU_TIME(ms) should be used?
> [cid:image001.png@01D34378.B0FDB5B0]
> 
> Appreciate to get some feedback from you!
> 
> 
> Best Regards
> Kelly Zhang/Zhang,Liyun
> 

Mime
View raw message