kudu-user mailing list archives

From Boris Tyukin <bo...@boristyukin.com>
Subject Re: Changing number of Kudu worker threads
Date Thu, 14 Feb 2019 13:39:15 GMT
Hi J-D and thanks for your comments.

On a very high level, we subscribe to 750 Kafka topics and wrote a custom
app using the Kudu Java API to insert into, update, or delete from 250 Kudu
tables (messages from 3 topics are merged into a single Kudu table). Our
custom code can spawn any number of threads, and we experimented with 10,
20, and 50.

When we ran our first test on the development 6-node cluster (high-density
2x22-core machines) with 5 tables/15 topics and 10 concurrent threads, all
was good and promising.
Once we added all tables/topics, our process became very slow and
throughput dropped by a factor of 20-30. We increased the number of threads
in our custom code to 50, and this is when we noticed that Kudu was using
only 10 threads while the other threads were waiting in the queue.

Our target Kudu tables were empty when we started, and the cluster was
pretty much idle.

Here are the results of our benchmarks that you might find interesting.

“threads,queue” in the header refers to --rpc_num_service_threads and
--rpc_service_queue_length.

We started with the defaults, and tests 5 and 6 turned out to be the best:
quite a dramatic difference from test 1.

I should also mention that we are running this on our DEV 6-node cluster,
but it is pretty beefy (2x24 CPU cores, 256 GB of RAM, 12 disks, etc.), and
the cluster was not doing anything else but writing into Kudu.

It is also interesting that test 7 did not give any further improvements.
Our speculation is that we simply hit the limits of our 6-node Kudu
cluster, since it can only handle so many tablets at once and we use a
replication factor of 3.

In another test, we created a simple app that runs selects on these tables
while the first app keeps writing into them. Throughput dropped quite a bit
with the defaults as well, but bumping the RPC threads helped.

If you have any other thoughts/observations, I would like to hear them!

I think things like that should be documented somewhere in the Kudu docs,
along with a few important parameters that organizations new to Kudu must
tweak. I could write a blog post about it, but I am no Kudu dev, so I do
not want to misrepresent anything.

For example, we learned the hard way to tweak these two parameters right
away, as insert performance was terrible out of the box:

[Screenshot of a Slack message from Will Berkeley: "@wangxg you shouldn't
need to tune too many parameters despite the large number of available
ones. Tune --memory_limit_hard_bytes to control the total amount of memory
Kudu will use. Tune --maintenance_manager_num_threads to about 1/3 of the
number of disks you are using for Kudu (assuming you are using the latest
version, 1.5)."]
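To make that concrete, here is a sketch of what such settings look like in
a tablet server gflagfile. The values below are illustrative, sized to our
hardware (256 GB of RAM, 12 data disks), not recommendations:

```shell
# Illustrative values for our hardware only (256 GB RAM, 12 data disks).
# Hard cap on the total memory the tablet server will use (here ~128 GB):
--memory_limit_hard_bytes=137438953472
# Roughly 1/3 of the number of data disks used by Kudu (12 / 3 = 4):
--maintenance_manager_num_threads=4
```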


test   | app threads | master threads,queue | tserver threads,queue | total operations | duration | avg ops/min | avg ms per flowfile | avg ops per flowfile
test 1 | 10          | 10,50                | 10,50                 |  5,906,849       | 60 min   |  98,447     |  404.389279         |  66.8861423
test 2 | 30          | 10,50                | 10,50                 |  3,107,938       | 32 min   |  97,123     | 1611.71499          | 102.416727
test 3 | 10          | 30,100               | 10,50                 | 13,617,274       | 60 min   | 226,954     |  448.954472         | 196.878148
test 4 | 30          | 30,100               | 10,50                 |  5,794,268       | 60 min   |  96,571     | 2342.55094          | 114.822107
test 5 | 10          | 30,100               | 30,100                | 16,813,710       | 60 min   | 280,228     |  391.113644         | 183.887024
test 6 | 30          | 30,100               | 30,100                | 15,903,303       | 60 min   | 265,055     | 2300.38629          | 341.316543
test 7 | 30          | 50,200               | 50,200                | 12,549,114       | 60 min   | 209,151     | 2364.45707          | 276.851262

(Note: the rate column works out to operations per minute, i.e. total
operations divided by the duration in minutes.)
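In case it helps anyone reproduce the test 5/6 setup, here is a sketch of
the gflags we changed, assuming they are passed via each daemon's
gflagfile; the values are just the ones from those tests, not general
recommendations:

```shell
# Sketch of the test 5 / test 6 settings; tune to your own workload.
# On each tablet server:
--rpc_num_service_threads=30
--rpc_service_queue_length=100

# On each master:
--rpc_num_service_threads=30
--rpc_service_queue_length=100
```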

On Wed, Feb 13, 2019 at 11:39 AM Jean-Daniel Cryans <jdcryans@apache.org>
wrote:

> Some comments on the original problem: "we need to process 1000s of
> operations per second and noticed that our Kudu 1.5 cluster was only using
> 10 threads while our application spins up 50 clients/threads"
>
> I wouldn't directly infer that 20 threads won't be enough to match your
> needs. The time it takes to service a request can vary greatly, a single
> thread could process 500 operations that take 2ms to run, or 2 that take
> 500ms to run, and you have 20 of those. The queue is there to make sure
> that the threads are kept busy instead of bouncing the clients back the
> moment all the threads are occupied. Your 50 threads can't constantly pound
> all the tservers, there's time spent on the network and whatever processing
> needs to happen client-side before they go back to Kudu.
>
> TBH there's not a whole lot of science around how we set those two
> defaults (# of threads and queue size), but it's very workload-dependent.
> Ideally the tservers would just right-size the pools based on the kind of
> requests that are coming in and the amount of memory it can use. I guess
> CPU also comes in the picture but again it depends on the workload, Kudu
> stores data so it tends to be IO-bound more than CPU-bound.
>
> But the memory concern is very real. To be put in the queue the requests
> must be read from the network, so it doesn't take that many 2MB batches of
> inserts to occupy a lot of memory. Scans, on the other hand, become a
> memory concern in the threads because that's where they materialize data in
> memory and, depending on the number of columns scanned and the kind of data
> that's read, it could be a lot. That's why the defaults aren't arbitrarily
> high, they're more on the safe side.
>
> Have you actually encountered performance issues that you could trace back
> to this?
>
> Thanks,
>
> J-D
>
> On Wed, Feb 13, 2019 at 3:49 AM Boris <boriskey@gmail.com> wrote:
>
>> But if we bump threads count to 50, and queue default is 50, we should
>> probably bump queue to 100 or something like that, right?
>>
>> On Wed, Feb 13, 2019, 00:54 Hao Hao <hao.hao@cloudera.com> wrote:
>>
>>> I don't see other flags that are relevant here, maybe others can chime
>>> in.
>>>
>>> For --rpc_service_queue_length: it configures the size of the RPC
>>> request queues. The queue helps buffer requests when a bunch of them
>>> arrive at once while the service threads are still busy processing
>>> previously arrived requests. But I don't see how it would help with
>>> handling more concurrent requests.
>>>
>>> Best,
>>> Hao
>>>
>>> On Tue, Feb 12, 2019 at 6:45 PM Boris <boriskey@gmail.com> wrote:
>>>
>>>> Thanks Hao, appreciate your response.
>>>>
>>>> Do we also need to bump other RPC thread-related parameters, like the queue, etc.?
>>>>
>>>> On Tue, Feb 12, 2019, 21:09 Hao Hao <hao.hao@cloudera.com> wrote:
>>>>
>>>>> Hi Boris,
>>>>>
>>>>> Sorry for the delay. --rpc_num_service_threads sets the number of
>>>>> threads in the RPC service thread pool (the default is 20 for the
>>>>> tablet server, 10 for the master). It should help with processing
>>>>> concurrent incoming RPC requests, but increasing it beyond the number
>>>>> of available CPU cores on the machines may not bring much value.
>>>>>
>>>>> You don't need to set the same value for masters and tablet servers.
>>>>> Most of the time, tablet servers will have more RPCs, since that is
>>>>> where the scans and writes take place.
>>>>>
>>>>> Best,
>>>>> Hao
>>>>>
>>>>> On Tue, Feb 12, 2019 at 5:29 PM Boris Tyukin <boris@boristyukin.com>
>>>>> wrote:
>>>>>
>>>>>> Can someone point us to documentation or explain what these
>>>>>> parameters really mean, or how they should be set on a production cluster?
>>>>>> I will greatly appreciate it!
>>>>>>
>>>>>> Boris
>>>>>>
>>>>>> On Fri, Feb 8, 2019 at 3:40 PM Boris Tyukin <boris@boristyukin.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> we need to process 1000s of operations per second and noticed that
>>>>>>> our Kudu 1.5 cluster was only using 10 threads while our
>>>>>>> application spins up 50 clients/threads. We observed in the web UI
>>>>>>> that only 10 threads are working and the other 40 are waiting in
>>>>>>> the queue.
>>>>>>>
>>>>>>> We found the rpc_num_service_threads parameter in the configuration
>>>>>>> guide, but it is still not clear to me what exactly we need to
>>>>>>> adjust to allow Kudu to handle more concurrent operations.
>>>>>>>
>>>>>>> Do we bump this parameter below, or do we need to consider other
>>>>>>> RPC-related parameters?
>>>>>>>
>>>>>>> Also, do we need to use the same numbers for masters and tablet
>>>>>>> servers?
>>>>>>>
>>>>>>> Are there any good numbers to target based on CPU core count?
>>>>>>>
>>>>>>> --rpc_num_service_threads
>>>>>>> <https://kudu.apache.org/docs/configuration_reference.html#kudu-master_rpc_num_service_threads>
>>>>>>>
>>>>>>> Number of RPC worker threads to run
>>>>>>> Type: int32
>>>>>>> Default: 10
>>>>>>> Tags: advanced
>>>>>>
