mina-users mailing list archives

From Antonio Rodriges <antonio....@gmail.com>
Subject Re: Too slow network IO
Date Fri, 10 Aug 2012 15:46:30 GMT
>> After more analysis of the statistics we identified the bottleneck but have
>> not yet found the solution.
>>
>> You are right that when the client waits for the response, the throughput
>> will be low. However, we measure all stages of the query processing,
>> and the transfer stage takes long. We synchronize the clocks between
>> the server, gate (see below) and the client within +/-5 ms. 10 clients
>> can generate and receive responses for about 7 queries per minute.
>
>
> This is very low.
>
>> Both client and server use Mina.
>>
>> Client:
>> while ( time not finished )
>> {
>>    q = generateQueryString            /// it is several bytes
>>    send (q)
>>    wait for response ()
>> }  // that's all
>>
>> There is a gate (also Mina based) which simply retransfers queries to
>> servers and results back to clients:
>>
>> Gate (has 32 nio acceptors)
>> Unpack query
>> Parse query
>> Choose server
>> Transfer request → server
>>
>> Server
>> Unpack request
>> Extract data (implies disk IO)
>> Create response
>> Pack response
>> Transfer resp → Gate
>>
>> Again Gate:
>> Unpack response from server
>> RePack response for client
>> Transfer  Gate → Client
>
>
> so this is :
>
> Client --4 MB--> Gate --4 MB--> Server --4 MB--> Disk
The direction is the opposite:
the client sends a small query (several bytes) and receives a 4 MB response.

> and back (assuming that the response is just an Ack).
>
> Transferring 4 MB on a 1 Gb/s network costs 1000/40 = 25 ms, so it will take 50
> ms to transfer a message from a client to the server. You will saturate the
> network with 20 messages sent per second.
>
> Writing 4 MB to disk will take roughly 50 to 100 ms too (faster if
> you use an SSD).
>
> The global roundtrip will take around 25 ms + 25 ms + 50 ms = around 100 ms. You
> should measure this atomic roundtrip. That also means you will be able to
> process 10 messages per second on a single client, but no more than 20
> messages per second before saturating the network.
Currently the 4 MB transfer from gate to client takes 7 s on average when
running 10 clients, while this drops to about 230 ms when
using a single client.
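
To put these numbers next to the link capacity, here is a minimal sketch of the arithmetic. It assumes "4 MB" means 4 MiB and that the link is a plain 1 Gb/s with no protocol overhead; the class and method names are made up for illustration.

```java
class TransferMath {
    /** Ideal time in ms to push `bytes` over a link of `bitsPerSecond`, ignoring protocol overhead. */
    static double wireMillis(long bytes, double bitsPerSecond) {
        return bytes * 8.0 / bitsPerSecond * 1000.0;
    }

    /** Throughput in Mbit/s actually achieved when `bytes` took `millis` to arrive. */
    static double effectiveMbps(long bytes, double millis) {
        return bytes * 8.0 / (millis / 1000.0) / 1_000_000.0;
    }

    public static void main(String[] args) {
        long payload = 4L * 1024 * 1024;  // assumption: "4 MB" means 4 MiB
        double link = 1_000_000_000.0;    // assumption: 1 Gb/s, no overhead

        System.out.printf("ideal wire time:      %.1f ms%n", wireMillis(payload, link));
        // 230 ms is the measured gate -> client time with a single client.
        System.out.printf("1 client:             %.1f Mbit/s%n", effectiveMbps(payload, 230));
        // 6627 ms is the measured gate -> client time per response with 10 clients.
        System.out.printf("10 clients aggregate: %.1f Mbit/s%n", 10 * effectiveMbps(payload, 6627));
    }
}
```

Under these assumptions the ideal wire time is about 34 ms (somewhat above the 25 ms estimate quoted above), a single client reaches about 146 Mbit/s, and ten clients together reach only about 51 Mbit/s. Both figures are far below the link rate, which suggests the network itself is not the limit.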

>>
>> The overall performance must be dominated by disk IO which is
>> currently up to 200 ms. However, as I mentioned before, we measure all
>> stages.
>
> 200 ms to write data on disk is very slow... That will limit even further
> the number of messages a client can process every second: 25 ms + 25 ms + 200 ms =
> around 300 ms to process a message, so this is 3 messages per second per
> client.
That is true, but we also measure the time of each stage, and that
revealed the transfer bottleneck.

>>
>> The median for each stage is given to the right of it. The statistics
>> are for 10 clients and 4 MB query results. They are able to receive 7
>> queries per minute.
>
> So you mean that with 10 clients sending queries, each client is capable of
> processing 0,7 message per *minute* = 0,01 per second, max ?
Each client receives about 7 per minute (each response takes about 7 s on average),
so 10 clients can receive about 70 responses per minute in total.
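
As a sanity check on these rates, here is a small sketch of the closed-loop arithmetic: a client that sends the next query only after the previous response arrives cannot complete more than 60 s / response time per minute. The class and method names are hypothetical, and the client's own time between queries is assumed to be zero, so the result is an upper bound.

```java
class ClosedLoopThroughput {
    /** Upper bound on responses per minute for one client that waits for each response. */
    static double perClientPerMinute(double responseMillis) {
        return 60_000.0 / responseMillis;
    }

    /** Upper bound for `clients` independent clients with the same response time. */
    static double totalPerMinute(int clients, double responseMillis) {
        return clients * perClientPerMinute(responseMillis);
    }

    public static void main(String[] args) {
        double responseMillis = 7237;  // measured average response time with 10 clients
        System.out.printf("per client: %.1f per minute%n", perClientPerMinute(responseMillis));
        System.out.printf("10 clients: %.1f per minute%n", totalPerMinute(10, responseMillis));
    }
}
```

With a 7237 ms response time the bound is about 8.3 responses per minute per client, or about 83 for ten clients, which is in the same range as the measured 7 per minute each.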
>
>>
>> The query response reaches the client in 7237 ms on average, of which
>> Gate -> client takes 6627 ms while the response server -> gate takes only
>> 220 ms on average. Why does the gate not keep pace with the overall
>> load, when it simply retransfers the message?
>
>
> client --> gate ..>
> client <-- gate <..
>
> takes 7.237 seconds,
>
> and
>
> ..>gate --> server
> <..gate <-- server
>
> takes only 220 ms? (which is in line with the back-of-the-envelope math from
> the beginning of my response)
>

 client --> gate ..>    a couple of ms (send query)
 client <-- gate <..    7.237 seconds (transfer 4 MB)

and

 ..>gate --> server     a couple of ms (retransfer query)
 <..gate <-- server     220 ms

>> Maybe too much IO for a
>> single machine, or maybe Mina can be tuned?
>
> I'm just wondering what the Gate is doing... You may want to add a Logger
> filter in the Gate, on both sides (if it's using MINA too) to see where you
> are losing time.

The gate must receive the query, find the server with the data and
issue the query to that server. The server reads the data and
transfers it to the client through the gate.
Currently we test the system with a single server. A direct
client -> server connection for data transfer would possibly be faster, but at
the current stage we are trying to get a reasonably working prototype
with the scheme where data transits through the gate. Gate, client and
server use ObjectSerialization codecs to serialize objects with the
data.
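
Since every hop serializes and deserializes the whole object, it may be worth timing that step in isolation. Below is a rough sketch using plain `java.io` serialization. It does not use Mina's codec classes and only assumes the codec does comparable work; the `Response` class and its `byte[]` payload are hypothetical stand-ins for the real result objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationCost {
    /** Hypothetical response type: the real objects are not shown in this thread. */
    static class Response implements Serializable {
        private static final long serialVersionUID = 1L;
        final byte[] data;
        Response(byte[] data) { this.data = data; }
    }

    /** Serializes an object to a byte array with standard Java serialization. */
    static byte[] serialize(Serializable obj) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Reads back an object produced by serialize(). */
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Response r = new Response(new byte[4 * 1024 * 1024]);  // assumption: 4 MiB payload
        long t0 = System.nanoTime();
        byte[] wire = serialize(r);
        long t1 = System.nanoTime();
        Response back = (Response) deserialize(wire);
        long t2 = System.nanoTime();
        System.out.printf("serialized size: %d bytes%n", wire.length);
        System.out.printf("serialize: %.1f ms, deserialize: %.1f ms (%d bytes restored)%n",
                (t1 - t0) / 1e6, (t2 - t1) / 1e6, back.data.length);
    }
}
```

A single run like this includes JIT warm-up, so the timings are only indicative. If they turn out to be small compared with the 6627 ms gate -> client stage, serialization alone would not explain the delay.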

> Also if you have routers between the client and the gate, I think you should
> check them.

The gate, the client and the server are all on the same switch.
