kudu-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Boris Tyukin <bo...@boristyukin.com>
Subject first and second run 2x query time difference
Date Wed, 13 Dec 2017 16:07:10 GMT
Hi guys,

I am doing some benchmarks with Kudu and Impala/Parquet and hope to share
it soon but there is one thing that bugs me. This is perhaps Impala
question but since I am using Kudu with Impala I am going to try and ask
anyway.

One of my queries takes 120 seconds to run the very first time. It joins
one large 5B row table with a bunch of smaller tables and then stores
result in Impala/parquet (not Kudu).

Now if I run it second and third time, it only takes 60 seconds. Can
someone explain why? Is there any settings to decrease this gap?

I've compared query profiles in CM and the only thing that was very
different is scan against Kudu table (the large one):

***************************
first time:
***************************
KUDU_SCAN_NODE (id=0) (47.68s)
<https://lkmaorabd103.multihosp.net:7183/cmf/impala/queryDetails?queryId=5143f7165be82819%3Ae00a103500000000&serviceName=impala#>



   - BytesRead: *0 B*
   - InactiveTotalTime: *0ns*
   - KuduRemoteScanTokens: *0*
   - NumScannerThreadsStarted: *20*
   - PeakMemoryUsage: *35.8 MiB*
   - RowsRead: *693,502,241*
   - RowsReturned: *693,502,241*
   - RowsReturnedRate: *14643448 per second*
   - ScanRangesComplete: *20*
   - ScannerThreadsInvoluntaryContextSwitches: *1,341*
   - ScannerThreadsTotalWallClockTime: *36.2m*
      - MaterializeTupleTime(*): *47.57s*
      - ScannerThreadsSysTime: *31.42s*
      - ScannerThreadsUserTime: *1.7m*
   - ScannerThreadsVoluntaryContextSwitches: *96,855*
   - TotalKuduScanRoundTrips: *52,308*
   - TotalReadThroughput: *0 B/s*
   - TotalTime: *47.68s*


***************************
second time:
***************************
KUDU_SCAN_NODE (id=0) (4.28s)
<https://lkmaorabd103.multihosp.net:7183/cmf/impala/queryDetails?queryId=53497a308f860837%3A243772e000000000&serviceName=impala#>



   - BytesRead: *0 B*
   - InactiveTotalTime: *0ns*
   - KuduRemoteScanTokens: *0*
   - NumScannerThreadsStarted: *20*
   - PeakMemoryUsage: *37.9 MiB*
   - RowsRead: *693,502,241*
   - RowsReturned: *693,502,241*
   - RowsReturnedRate: *173481534 per second*
   - ScanRangesComplete: *20*
   - ScannerThreadsInvoluntaryContextSwitches: *1,451*
   - ScannerThreadsTotalWallClockTime: *19.5m*
      - MaterializeTupleTime(*): *4.20s*
      - ScannerThreadsSysTime: *38.22s*
      - ScannerThreadsUserTime: *1.7m*
   - ScannerThreadsVoluntaryContextSwitches: *480,870*
   - TotalKuduScanRoundTrips: *52,142*
   - TotalReadThroughput: *0 B/s*
   - TotalTime: *4.28s*

Mime
View raw message