accumulo-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Adam Fuchs <afu...@apache.org>
Subject Re: Why may "tablet read ahead" take long time? (was: Profile a (batch) scan)
Date Tue, 15 Jan 2019 18:01:31 GMT
Hi Maxim,

What you're seeing is an artifact of the threading model that Accumulo
uses. When you launch a query, Accumulo tablet servers will coordinate RPCs
via Thrift in one thread pool (which grows unbounded) and queue up scans
(rfile lookups, decryption/decompression, iterators, etc.) in another
threadpool known as the readahead pool (which has a fixed number of
threads). You're seeing everything that happens in that readahead thread in
one big chunk. You may need to look a bit deeper into profiling/sampling
tablet server CPU to get insights into how to improve your query
performance. If you want to speed up queries in general you might try (in
no particular order):
1. Increase parallelism by bumping up the readahead threads
(tserver.readahead.concurrent.max). This will still be bounded by the
number of parallel scans clients are driving.
2. Increase parallelism driven by clients by querying more, smaller ranges,
or by splitting tablets.
3. Increase scan batch sizes if the readahead thread or thrift coordination
overhead is high.
4. Optimize custom iterators if that is a CPU bottleneck.
5. Increase cache sizes or otherwise modify queries to improve cache hit
rates.
6. Change compression settings if that is a CPU bottleneck. Try snappy
instead of gz.

Cheers,
Adam

On Tue, Jan 15, 2019, 10:45 AM Maxim Kolchin <kolchinmax@gmail.com wrote:

> Hi all,
>
> I try to trace some scans with Zipkin and see that quite often the trace
> called "tablet read ahead" takes 10x or 100x more time than the other
> similar traces.
>
> Why it may happen? What could be done to reduce the time? I found a
> similar discussion on the list, but it doesn't have an answer. I'd be great
> to have a how-to article listing some steps which could be done.
>
> Attaching a screenshot of one of the traces having this issue.
>
> Maxim Kolchin
>
> E-mail: kolchinmax@gmail.com
> Tel.: +7 (911) 199-55-73
> Homepage: http://kolchinmax.ru
>
> Below you can find a good example of what I'm struggling to understand
>> right now. It's a trace for a simple scan over some columns with a
>> BatchScanner using 75 threads. The scan takes 877 milliseconds and the main
>> contributor is the entry "tablet read ahead 1", which starts at 248 ms.
>> These are the questions that I cannot answer with this trace:
>>
>>    1. why this heavy operation starts after 248ms? By summing up the delay
>>    before this operation you get a number which is not even close to 248ms.
>>    2. what does "tablet read ahead 1" means? In general, how to map the
>>    entries of a trace to their meaning? Is there a guide about this?
>>    3. why "tablet read ahead 1" takes 600ms? It's clearly not the sum of
>>    the entries under this one but that's the important part.
>>    4. I may be naive but...how much data have been read by this scan? How
>>    many entries? That's very important to understand what's going on.
>>
>> Thanks for the help,
>>
>> Mario
>>
>> 877+ 0 Dice@h01 counts
>> 2+ 7 tserver@h12 startScan
>> 6+ 10 tserver@h15 startScan
>> 5+ 11 tserver@h15 metadata tablets read ahead 4
>> 843+ 34 Dice@h01 batch scanner 74- 1
>> 620+ 230 tserver@h09 startMultiScan
>> 600+ 248 tserver@h09 tablet read ahead 1
>> 22+ 299 tserver@h09 newDFSInputStream
>> 22+ 299 tserver@h09 getBlockLocations
>> 2+ 310 tserver@h09 ClientNamenodeProtocol#getBlockLocations
>> 1+ 321 tserver@h09 getFileInfo
>> 1+ 321 tserver@h09 ClientNamenodeProtocol#getFileInfo
>> 2+ 322 tserver@h09 DFSInputStream#byteArrayRead
>> 1+ 324 tserver@h09 DFSInputStream#byteArrayRead
>> 2+ 831 tserver@h09 DFSInputStream#byteArrayRead
>> 2+ 834 tserver@h09 DFSInputStream#byteArrayRead
>> 1+ 835 tserver@h09 BlockReaderLocal#fillBuffer(1091850413)
>> 1+ 874 tserver@h09 closeMultiScan
>> --
>> Mario Pastorelli | TERALYTICS
>>
>> *software engineer*
>>
>> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
>> phone: +41794381682
>> email: mario.pastorelli@teralytics.chwww.teralytics.net
>>
>> Company registration number: CH-020.3.037.709-7 | Trade register Canton
>> Zurich
>> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann
>> de Vries
>>
>> This e-mail message contains confidential information which is for the sole
>> attention and use of the intended recipient. Please notify us at once if
>> you think that it may not be intended for you and delete it immediately.
>>
>>

Mime
View raw message