openwhisk-dev mailing list archives

From Markus Thömmes <markusthoem...@apache.org>
Subject Re: the behavior of batcher
Date Fri, 09 Nov 2018 08:10:40 GMT
Hi Jiang,

this is a fair question. The trade-off between groupedWithin and batch is
that groupedWithin **always** adds some latency to the database commands,
whereas batch only adds that latency under backpressure, as you've noticed.
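
To make the trade-off concrete, here is a rough sketch in Python. It is not the OpenWhisk batcher and not the Akka Streams operators themselves, only a simplified model of each strategy. The function names, the single-worker assumption, and the rule that a time window opens with its first document are assumptions made for illustration.

```python
def grouped_within_delays(arrivals_ms, max_size, window_ms):
    """Wait per document when a group closes at max_size documents or
    window_ms after its first document, whichever comes first."""
    delays, group = [], []
    for t in arrivals_ms:
        if group and t >= group[0] + window_ms:
            # The window expired before this arrival: flush the open group.
            close = group[0] + window_ms
            delays += [close - a for a in group]
            group = []
        group.append(t)
        if len(group) == max_size:
            # The group is full: flush it right away.
            delays += [t - a for a in group]
            group = []
    if group:
        close = group[0] + window_ms
        delays += [close - a for a in group]
    return delays


def batch_delays(arrivals_ms, service_ms):
    """Wait per document with one worker that takes everything pending as
    soon as it is free, so documents only queue up while it is busy."""
    delays, pending, free_at = [], [], 0
    for t in arrivals_ms:
        if pending and free_at <= t:
            # The worker became free at free_at and took the whole backlog.
            delays += [free_at - a for a in pending]
            free_at += service_ms
            pending = []
        if not pending and free_at <= t:
            # Idle worker: the document is written immediately.
            delays.append(0)
            free_at = t + service_ms
        else:
            pending.append(t)
    if pending:
        delays += [free_at - a for a in pending]
    return delays


if __name__ == "__main__":
    sparse = [0, 100, 200]   # low load, arrival times in ms
    burst = [0, 1, 2, 3]     # documents arriving faster than one write
    print(grouped_within_delays(sparse, max_size=50, window_ms=10))
    print(batch_delays(sparse, service_ms=10))
    print(grouped_within_delays(burst, max_size=50, window_ms=10))
    print(batch_delays(burst, service_ms=10))
```

In this model, sparse traffic pays the full window with the time-based grouping but nothing with the backpressure-based batching, while a burst ends up grouped in both cases.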

Cheers,
Markus

On Fri, Nov 9, 2018 at 04:39, 蒋鹏程 <jiang.pengcheng@navercorp.com> wrote:

> Hello all,
>
> I just noticed that the db batcher doesn't behave as I expected. It
> seems to have a fixed number of workers, and each worker eagerly pulls
> data from the stream, so at first the batcher tries to insert 1 document
> per request. Once all workers are busy, subsequent documents are put into
> batches, and the next free worker processes the batched documents, and so on.
>
> I think this behavior does not take full advantage of the `bulk` API of
> the database backend.
>
> When there are 10,000 req/s and the default number of workers is 64, let's
> assume every worker takes exactly 10ms to complete its job. Then there
> will theoretically be around 3250 req/s against the database:
>
> Let's take 10ms as a time unit:
> 1. In the first 10ms, there are 100 documents in the stream, and 64 workers
> get 64 of them; 36 documents are left.
> 2. In the next 10ms, there are 100 + 36 documents in the stream, and 1
> worker can process all of them.
> 3. The next 10ms * 2 are just like phase 1 and phase 2.
>
> So the req/s against the database backend will be (64+1)*50=3250.
>
> Is it better to use `groupedWithin` instead of `batch` here?
>
> Best Regards
> Jiang PengCheng
>
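
The estimate in the quoted message can be reproduced with a short Python sketch. It only replays the stated assumptions (10,000 documents per second, 64 workers, 10ms per write, alternating phases) and does not measure or simulate the real batcher. The function and parameter names are made up for illustration.

```python
def estimated_requests_per_second(docs_per_second=10_000, workers=64, unit_ms=10):
    """Count database requests per second under the alternating-phase model."""
    docs_per_unit = docs_per_second * unit_ms // 1000  # 100 documents per 10ms
    units_per_second = 1000 // unit_ms                 # 100 time units per second
    requests = 0
    backlog = 0
    for unit in range(units_per_second):
        available = backlog + docs_per_unit
        if unit % 2 == 0:
            # Phase 1: every free worker eagerly takes a single document.
            singles = min(workers, available)
            requests += singles
            backlog = available - singles
        else:
            # Phase 2: one worker sends everything pending as one bulk request.
            requests += 1
            backlog = 0
    return requests


if __name__ == "__main__":
    print(estimated_requests_per_second())  # (64 + 1) * 50 = 3250
```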
