kudu-user mailing list archives

From Boris Tyukin <bo...@boristyukin.com>
Subject Re: first and second run 2x query time difference
Date Sat, 16 Dec 2017 19:45:23 GMT
well, our admin has had a fun two days - it was the first time we restarted Kudu
on our DEV cluster and it did not go well. He is still troubleshooting what
happened, but after the Kudu restart, ZooKeeper and HDFS went down after 3-4
minutes. If we disable Kudu, all is well. No errors in the Kudu logs... I will
have more details next week, so I am not asking for help yet as I do not know
all the details. What is obvious, though, is that it has something to do with
Kudu :)

On Thu, Dec 14, 2017 at 9:40 AM, Boris Tyukin <boris@boristyukin.com> wrote:

> thanks for your suggestions, J-D, I am sure you are right more often than
> that! :))
>
> I will report back with our results. So far I am really impressed with
> Kudu - we have been benchmarking ingest and egress throughput and our
> typical query runtimes. The biggest pain so far is the lack of support for
> decimals.
>
> On Wed, Dec 13, 2017 at 5:07 PM, Jean-Daniel Cryans <jdcryans@apache.org>
> wrote:
>
>> On Wed, Dec 13, 2017 at 11:30 AM, Boris Tyukin <boris@boristyukin.com>
>> wrote:
>>
>>> thanks J-D! we are going to try that and see how it impacts the
>>> runtime.
>>>
>>> is there any way to load this metadata upfront? a lot of our queries are
>>> ad hoc in nature, but they will be hitting the same tables with different
>>> predicates and join patterns.
>>>
>>
>> You could use Impala to compute all the stats of all the tables after
>> each Kudu restart. Actually, do try that, restart Kudu then compute stats
>> and see how fast it scans.
>>
>>
>>>
>>> I am curious why this metadata does not survive restarts though. We are
>>> going to run our benchmarks again and this time restart Kudu and Impala.
>>>
>>
>> It's in the tserver memory, it can't survive a restart.
>>
>>
>>>
>>> I just ran another query for the first time. It hits 2 large tables that
>>> had already been scanned by the previous query, and this time I do not see
>>> any difference in query time between the first and second run - I guess
>>> this confirms your statement about "first time ever scanning the table
>>> since a Kudu restart" and collecting metadata.
>>>
>>
>> Maybe, I've been known to be right once or twice a year :)
>>
>>
>>>
>>>
>>> On Wed, Dec 13, 2017 at 11:18 AM, Jean-Daniel Cryans <
>>> jdcryans@apache.org> wrote:
>>>
>>>> Hi Boris,
>>>>
>>>> Given that we don't have much data we can use here, I'll have to
>>>> extrapolate. As an aside though, this is yet another example where we need
>>>> more Kudu-side metrics in the query profile.
>>>>
>>>> So, Kudu lazily loads a bunch of metadata and that can really affect
>>>> scan times. If this was your first time ever scanning the table since a
>>>> Kudu restart, it's very possible that that's where that time was spent.
>>>> There's also the page cache in the OS that might now be populated. You
>>>> could do something like "sync; echo 3 > /proc/sys/vm/drop_caches" on all
>>>> the machines and run the query 2 times again, without restarting Kudu, to
>>>> understand the effect of the page cache itself. There's currently no way
>>>> to purge the cached metadata in Kudu though.
>>>>
>>>> Hope this helps a bit,
>>>>
>>>> J-D
>>>>
>>>> On Wed, Dec 13, 2017 at 8:07 AM, Boris Tyukin <boris@boristyukin.com>
>>>> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> I am doing some benchmarks with Kudu and Impala/Parquet and hope to
>>>>> share them soon but there is one thing that bugs me. This is perhaps an
>>>>> Impala question but since I am using Kudu with Impala I am going to try
>>>>> and ask anyway.
>>>>>
>>>>> One of my queries takes 120 seconds to run the very first time. It
>>>>> joins one large 5B row table with a bunch of smaller tables and then
>>>>> stores result in Impala/parquet (not Kudu).
>>>>>
>>>>> Now if I run it a second and third time, it only takes 60 seconds. Can
>>>>> someone explain why? Are there any settings to decrease this gap?
>>>>>
>>>>> I've compared query profiles in CM and the only thing that was very
>>>>> different is the scan against the Kudu table (the large one):
>>>>>
>>>>> ***************************
>>>>> first time:
>>>>> ***************************
>>>>> KUDU_SCAN_NODE (id=0) (47.68s)
>>>>> <https://lkmaorabd103.multihosp.net:7183/cmf/impala/queryDetails?queryId=5143f7165be82819%3Ae00a103500000000&serviceName=impala#>
>>>>>
>>>>>
>>>>>
>>>>>    - BytesRead: *0 B*
>>>>>    - InactiveTotalTime: *0ns*
>>>>>    - KuduRemoteScanTokens: *0*
>>>>>    - NumScannerThreadsStarted: *20*
>>>>>    - PeakMemoryUsage: *35.8 MiB*
>>>>>    - RowsRead: *693,502,241*
>>>>>    - RowsReturned: *693,502,241*
>>>>>    - RowsReturnedRate: *14643448 per second*
>>>>>    - ScanRangesComplete: *20*
>>>>>    - ScannerThreadsInvoluntaryContextSwitches: *1,341*
>>>>>    - ScannerThreadsTotalWallClockTime: *36.2m*
>>>>>       - MaterializeTupleTime(*): *47.57s*
>>>>>       - ScannerThreadsSysTime: *31.42s*
>>>>>       - ScannerThreadsUserTime: *1.7m*
>>>>>    - ScannerThreadsVoluntaryContextSwitches: *96,855*
>>>>>    - TotalKuduScanRoundTrips: *52,308*
>>>>>    - TotalReadThroughput: *0 B/s*
>>>>>    - TotalTime: *47.68s*
>>>>>
>>>>>
>>>>> ***************************
>>>>> second time:
>>>>> ***************************
>>>>> KUDU_SCAN_NODE (id=0) (4.28s)
>>>>> <https://lkmaorabd103.multihosp.net:7183/cmf/impala/queryDetails?queryId=53497a308f860837%3A243772e000000000&serviceName=impala#>
>>>>>
>>>>>
>>>>>
>>>>>    - BytesRead: *0 B*
>>>>>    - InactiveTotalTime: *0ns*
>>>>>    - KuduRemoteScanTokens: *0*
>>>>>    - NumScannerThreadsStarted: *20*
>>>>>    - PeakMemoryUsage: *37.9 MiB*
>>>>>    - RowsRead: *693,502,241*
>>>>>    - RowsReturned: *693,502,241*
>>>>>    - RowsReturnedRate: *173481534 per second*
>>>>>    - ScanRangesComplete: *20*
>>>>>    - ScannerThreadsInvoluntaryContextSwitches: *1,451*
>>>>>    - ScannerThreadsTotalWallClockTime: *19.5m*
>>>>>       - MaterializeTupleTime(*): *4.20s*
>>>>>       - ScannerThreadsSysTime: *38.22s*
>>>>>       - ScannerThreadsUserTime: *1.7m*
>>>>>    - ScannerThreadsVoluntaryContextSwitches: *480,870*
>>>>>    - TotalKuduScanRoundTrips: *52,142*
>>>>>    - TotalReadThroughput: *0 B/s*
>>>>>    - TotalTime: *4.28s*
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
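
J-D's suggestion in the thread above, to run compute stats on every table after each Kudu restart so that the first real query does not pay for the lazy metadata load, could be scripted roughly as follows. This is only a sketch: the table names are hypothetical placeholders, and the helper just generates Impala COMPUTE STATS statements as text rather than connecting to a cluster.

```python
def warmup_statements(tables):
    """Build one Impala COMPUTE STATS statement per table name."""
    return [f"COMPUTE STATS {name};" for name in tables]


if __name__ == "__main__":
    # Hypothetical table names (assumption); replace with the real
    # Kudu-backed tables that the benchmark queries hit.
    tables = ["benchmark_db.big_fact", "benchmark_db.small_dim"]
    print("\n".join(warmup_statements(tables)))
```

The printed statements could be saved to a file and run with impala-shell -f after a restart, then the benchmark query timed twice to see whether the gap between the first and second run shrinks.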
