openwhisk-dev mailing list archives

From 蒋鹏程<>
Subject the behavior of batcher
Date Fri, 09 Nov 2018 03:39:10 GMT
Hello all,
I just noticed that the db batcher doesn't behave as I expected. It seems to have a fixed
number of workers, and each worker pulls data from the stream eagerly. So at first the batcher
tries to insert 1 document per request; once all workers are busy, subsequent documents
are put into a batch, the next free worker processes the batched documents, and so on.
I think this behavior does not take full advantage of the `bulk` API of the database backend.
Suppose there are 10,000 req/s, the default number of workers is 64, and every worker
takes exactly 10 ms to complete its job. Then, theoretically, there will be around 3250 req/s
against the database:
Let's take 10 ms as a time unit:
1. In the first 10 ms, 100 documents arrive in the stream; 64 workers take 64 of them, and
36 documents are left.
2. In the next 10 ms, there are 100 + 36 documents in the stream, and 1 worker can process
all of them as a single batch.
3. The next 2 * 10 ms are just like phase 1 and phase 2.
Each pair of phases takes 20 ms, so there are 50 pairs per second, and the req/s against
the database backend will be (64 + 1) * 50 = 3250.
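
The arithmetic above can be reproduced with a small simulation. This is only a sketch of the simplified two-phase model described in this message, not the actual batcher code; the function name and its parameters are made up for illustration.

```python
def eager_batcher_requests_per_second(arrival_rate=10_000, workers=64, job_ms=10):
    """Count database requests in one second under the simplified model:
    phase 1: idle workers each take a single document (one request each),
    phase 2: one worker sends everything that has queued up as one bulk request.
    """
    docs_per_unit = arrival_rate * job_ms // 1000   # documents arriving per time unit
    units_per_second = 1000 // job_ms               # number of time units in one second
    requests = 0
    backlog = 0                                     # documents left over from the previous unit
    for _ in range(units_per_second):
        pending = backlog + docs_per_unit
        if backlog == 0:
            # Phase 1: workers pull eagerly, one document per request.
            taken = min(workers, pending)
            requests += taken
            backlog = pending - taken
        else:
            # Phase 2: the whole queue goes out as a single bulk request.
            requests += 1
            backlog = 0
    return requests


print(eager_batcher_requests_per_second())
```

With the numbers used above (10,000 req/s, 64 workers, 10 ms per job) the model alternates between 64 requests and 1 request per time unit.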
Would it be better to use `groupedWithin` instead of `batch` here?
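
For comparison, here is a rough steady-state estimate of what time-or-size grouping would give. It is not Akka Streams code: `max_batch` and `window_ms` are hypothetical stand-ins for the size and duration arguments of `groupedWithin`, documents are assumed to arrive at a constant rate, and worker concurrency is ignored.

```python
import math


def grouped_within_requests_per_second(arrival_rate, max_batch, window_ms):
    """Estimate bulk requests per second when a group is emitted as soon as
    either max_batch documents have been collected or window_ms has elapsed."""
    docs_per_window = arrival_rate * window_ms / 1000
    # A group holds at most max_batch documents and is never empty.
    batch_size = max(1, min(max_batch, docs_per_window))
    return math.ceil(arrival_rate / batch_size)


# 10,000 req/s, assumed batch limit of 500 documents, assumed 10 ms window
print(grouped_within_requests_per_second(10_000, 500, 10))
```

Under these assumptions every 10 ms window produces one bulk request of 100 documents, so the database would see on the order of 100 req/s instead of 3250.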
Best Regards
Jiang PengCheng