bookkeeper-user mailing list archives

From Aniruddha Laud <trojan.of.t...@gmail.com>
Subject Re: Low write bandwidth
Date Wed, 10 Jun 2015 16:04:43 GMT
On Wed, Jun 10, 2015 at 8:38 AM, Maciej Smoleński <jezdnia@gmail.com> wrote:

> I ran ping -s 65000 and the results are below.
> The latency is always <1.5 ms.
> Does this mean that transporting a single entry will use two packets, and
> that the latency will be 2.5 ms (1.5 ms for the 65K packet plus 1 ms for
> the 35K packet => 2.5 ms per 100K entry)?
>
No. ping gives you the round-trip time, but you can expect the write latency
to be somewhat higher than that number. (The best way to find out would be
to monitor the actual write latency to the server; BookKeeper clients expose
these metrics.)

> Is it possible to improve this? Is it possible to increase the packet
> size, so that a single entry fits in a single packet?
>
You can't increase the IP packet size beyond 64K. You'd have to reduce the
size of each entry to avoid fragmentation.
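For a rough sense of the fragmentation involved, here's a back-of-envelope sketch. It assumes a standard 1500-byte Ethernet MTU and a 20-byte IPv4 header, and models the IP-level fragmentation the ping test exercises; over TCP the entry is instead split into MSS-sized segments, with similar per-packet overhead:

```python
import math

MTU = 1500               # assumed standard Ethernet MTU, bytes
IP_HEADER = 20           # IPv4 header without options, bytes
ENTRY_SIZE = 100 * 1024  # one 100K entry

# Each IP fragment carries at most MTU - IP_HEADER payload bytes, and
# fragment payloads must be multiples of 8 bytes (except the last fragment).
per_fragment = (MTU - IP_HEADER) // 8 * 8       # 1480 bytes of payload
fragments = math.ceil(ENTRY_SIZE / per_fragment)

print(per_fragment)  # 1480
print(fragments)     # 70 fragments per 100K entry
```

So each 100K entry turns into roughly 70 on-the-wire packets; the per-packet header overhead is small, but losing any one fragment costs the whole datagram.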

>
>
>
> ping/from_client_to_server1
> PING SN0101 (169.254.1.31) 65000(65028) bytes of data.
> 65008 bytes from SN0101 (169.254.1.31): icmp_seq=1 ttl=64 time=1.39 ms
> 65008 bytes from SN0101 (169.254.1.31): icmp_seq=2 ttl=64 time=1.29 ms
> 65008 bytes from SN0101 (169.254.1.31): icmp_seq=3 ttl=64 time=1.29 ms
> 65008 bytes from SN0101 (169.254.1.31): icmp_seq=4 ttl=64 time=1.31 ms
> 65008 bytes from SN0101 (169.254.1.31): icmp_seq=5 ttl=64 time=1.32 ms
>
> ping/from_client_to_server2
> PING SN0102 (169.254.1.32) 65000(65028) bytes of data.
> 65008 bytes from SN0102 (169.254.1.32): icmp_seq=1 ttl=64 time=1.26 ms
> 65008 bytes from SN0102 (169.254.1.32): icmp_seq=2 ttl=64 time=1.31 ms
> 65008 bytes from SN0102 (169.254.1.32): icmp_seq=3 ttl=64 time=1.12 ms
> 65008 bytes from SN0102 (169.254.1.32): icmp_seq=4 ttl=64 time=1.27 ms
> 65008 bytes from SN0102 (169.254.1.32): icmp_seq=5 ttl=64 time=1.37 ms
>
> ping/from_client_to_server3
> PING SN0103 (169.254.1.33) 65000(65028) bytes of data.
> 65008 bytes from SN0103 (169.254.1.33): icmp_seq=1 ttl=64 time=1.25 ms
> 65008 bytes from SN0103 (169.254.1.33): icmp_seq=2 ttl=64 time=1.38 ms
> 65008 bytes from SN0103 (169.254.1.33): icmp_seq=3 ttl=64 time=1.25 ms
> 65008 bytes from SN0103 (169.254.1.33): icmp_seq=4 ttl=64 time=1.33 ms
> 65008 bytes from SN0103 (169.254.1.33): icmp_seq=5 ttl=64 time=1.32 ms
>
> ping/from_server1_to_client
> PING AN0101 (169.254.1.11) 65000(65028) bytes of data.
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=1 ttl=64 time=1.01 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=2 ttl=64 time=1.38 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=3 ttl=64 time=1.35 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=4 ttl=64 time=1.35 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=5 ttl=64 time=1.32 ms
>
> ping/from_server2_to_client
> PING AN0101 (169.254.1.11) 65000(65028) bytes of data.
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=1 ttl=64 time=0.887 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=2 ttl=64 time=1.31 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=3 ttl=64 time=1.32 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=4 ttl=64 time=0.998 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=5 ttl=64 time=1.22 ms
>
> ping/from_server3_to_client
> PING AN0101 (169.254.1.11) 65000(65028) bytes of data.
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=1 ttl=64 time=1.08 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=2 ttl=64 time=1.40 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=3 ttl=64 time=1.07 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=4 ttl=64 time=1.26 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=5 ttl=64 time=1.26 ms
> 65008 bytes from AN0101 (169.254.1.11): icmp_seq=6 ttl=64 time=1.26 ms
>
> On Wed, Jun 10, 2015 at 4:45 PM, Aniruddha Laud <trojan.of.troy@gmail.com>
> wrote:
>
>>
>>
>> On Wed, Jun 10, 2015 at 7:00 AM, Maciej Smoleński <jezdnia@gmail.com>
>> wrote:
>>
>>> Thank you for your comment.
>>>
>>> Unfortunately, these options will not help in my case.
>>> In my case the BookKeeper client receives the next request only after
>>> the previous request is confirmed.
>>> It is also expected that there will be only a single stream of such
>>> requests.
>>>
>>> I would like to understand how to achieve performance equal to the
>>> network bandwidth.
>>>
>>
>> To saturate the bandwidth, you will have to have more than one
>> outstanding request. 250 requests/second gives you 4 ms per request. With
>> each entry 100K in size, that's not unreasonable. My suggestion would be
>> to monitor the write latency from the client to the server.
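To put numbers on that: by Little's law, the concurrency needed to saturate a link is roughly bandwidth × latency / entry size. A quick sketch, using the ~400 MB/s link speed and ~4 ms per write reported in this thread:

```python
link_bandwidth = 400 * 1024 * 1024  # ~400 MB/s, measured between client and server
latency_s = 0.004                   # ~4 ms per write, implied by 250 requests/s
entry_size = 100 * 1024             # 100K entries

# Little's law: bytes that must be in flight = bandwidth * latency.
inflight_bytes = link_bandwidth * latency_s
outstanding = inflight_bytes / entry_size

print(round(outstanding))  # ~16 outstanding writes
```

With a single synchronous writer there is exactly one outstanding entry, which is why the observed throughput sits near 1/16 of the link capacity.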
>>
>> ping -s 65000 should give you a baseline for what to expect with
>> latencies.
>>
>> With 100K entries, you are going to see fragmentation at the IP layer
>> (and each fragment still has to fit within the Ethernet MTU). That wasn't
>> the case with a 1K payload.
>>
>> How many hops does a packet need to traverse from one machine to another?
>> The more hops, the higher the latency.
>>
>>
>>>
>>>
>>> On Wed, Jun 10, 2015 at 2:27 PM, Flavio Junqueira <fpjunqueira@yahoo.com
>>> > wrote:
>>>
>>>> BK currently isn't wired to stream bytes to a ledger, so writing large
>>>> entries synchronously, as you're doing, is unlikely to get the best
>>>> performance out of it. A couple of things you could try to get higher
>>>> performance are writing asynchronously and having multiple clients
>>>> write.
>>>>
>>>> -Flavio
>>>>
>>>>
>>>>
>>>>
>>>>   On Wednesday, June 10, 2015 12:08 PM, Maciej Smoleński <
>>>> jezdnia@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> Hello,
>>>>
>>>> I'm testing BK performance when appending 100K-sized entries
>>>> synchronously from 1 thread (using one ledger).
>>>> The performance I get is 250 entries/s.
>>>>
>>>> What performance should I expect?
>>>>
>>>> My setup:
>>>>
>>>> Ledger:
>>>> Ensemble size: 3
>>>> Quorum size: 2
>>>>
>>>> 1 client machine and 3 server machines.
>>>>
>>>> Network:
>>>> Each machine uses bonding: 4 x 1000 Mbps NICs
>>>> manually tested bandwidth between client and server: 400 MB/s
>>>>
>>>> Disk:
>>>> I tested two configurations:
>>>> dedicated disks with ext3 (different for zookeeper, journal, data,
>>>> index, log)
>>>> dedicated ramfs partitions (different for zookeeper, journal, data,
>>>> index, log)
>>>>
>>>> In both configurations the performance is the same: 250 entries/s
>>>> (25 MB/s).
>>>> This is confirmed by the measured network bandwidth:
>>>> - 50 MB/s on the client
>>>> - 17 MB/s on each server
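Those bandwidth figures are consistent with the replication settings. A back-of-envelope sanity check, assuming the client sends each entry to every bookie in the write quorum and entries are striped evenly across the ensemble:

```python
entries_per_s = 250
entry_size = 100 * 1024  # 100K entries
write_quorum = 2
ensemble = 3

app_throughput = entries_per_s * entry_size               # ledger data rate
client_out = app_throughput * write_quorum                # client sends each entry twice
per_server_in = app_throughput * write_quorum / ensemble  # each bookie sees 2/3 of entries

print(app_throughput / 2**20)   # ~24.4 MB/s application data
print(client_out / 2**20)       # ~48.8 MB/s (observed: 50 MB/s on the client)
print(per_server_in / 2**20)    # ~16.3 MB/s (observed: 17 MB/s per server)
```

So the network measurements match the expected 2x client-side amplification from a write quorum of 2, and the bottleneck is per-write latency rather than bandwidth.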
>>>>
>>>> I ran java with a profiler enabled on the BK client and BK server but
>>>> didn't find anything unexpected (though I don't know the BookKeeper
>>>> internals).
>>>>
>>>> I tested it with two BookKeeper versions:
>>>> - 4.3.0
>>>> - 4.2.2
>>>> The results were the same with both BookKeeper versions.
>>>>
>>>> What should be changed or checked to get better performance?
>>>>
>>>> Kind regards,
>>>> Maciej
>>>>
