Ibrahim,

I'm not completely sure I'm correct about where the ZK server starts and stops its latency counter. Can somebody confirm or correct? If I am right, suppose you have 3 requests: A, B, and C. Here's how the sequence of events might occur:

  Request A arrives at the ZK server.
    Start the latency counter for A.
    Start fsync for A.
  Request B arrives at the ZK server.
    Start the latency counter for B.
  Request C arrives at the ZK server.
    Start the latency counter for C.
  Fsync for A completes.
    Stop the latency counter for A and compute latency.
    Start fsync for B.
  Fsync for B completes.
    Stop the latency counter for B and compute latency.
    Start fsync for C.
  Fsync for C completes.
    Stop the latency counter for C and compute latency.

So you can see that the latency for A is the same as it would be in sync mode, but the latency for B includes part of the fsync time for A, and the latency for C includes all of the fsync time for B and part of the fsync time for A.

Regarding your latency calculations, how do they compare with the ZK stat values when running in sync mode? I think they should be just a little bit bigger than the stat values, since you are measuring on the client and network transmission time is going to add a little.

Henry May
IBM InfoSphere Streams Performance
hjmay@us.ibm.com
720-342-8873 Tie: 963-8873

From: Ibrahim
To: zookeeper-user@hadoop.apache.org
Date: 10/23/2014 04:37 PM
Subject: RE: Latency in asynchronous mode

Thank you Henry,

OK, this makes sense. So in sync mode the latency is measured for just one operation at a time (per fsync), because the transaction log system logs (fsyncs) one request at a time. In async mode, by contrast, the latency covers several operations per fsync, meaning that it is not possible to measure the latency per operation, because the transaction log batches multiple requests into one fsync.

Can you correct me if the above is not true?

*What does ZK latency measure?
Does it measure the delta between the arrival of the request and the completion of the request?

I have two different ways to measure the latency, as follows:

1- I use the four-letter-word command stat, which gives me output like:

  Five clients: Latency min/avg/max: 235/366/515
  Ten clients:  Latency min/avg/max: 252/368/505

2- I use my own code, which is:

  submitTimeWrite = (double)System.nanoTime();
  _client.create().inBackground(new Double(time)).forPath(_path + "/" + _count, data);
  endTimeWrite = (double)System.nanoTime();
  latencyInfos.add(""+((endTimeWrite - submitTimeWrite)/1000000));

However, the result using the above code is completely different from the stat command result. Here is a sample of the results generated by the above code:

  0.004395
  0.004297
  0.004256
  0.004308
  0.004353
  0.004293
  0.004309
  0.004421
  0.004325

This raises a question: which is the right and logical result that I should take into account? It seems that the above result measures latency in async mode request by request, whereas the result from the stat command measures batches of multiple requests.

Thanks a lot,
Ibrahim

From: Henry May [via zookeeper-user] [mailto:ml-node+s578899n7580451h5@n2.nabble.com]
Sent: Thursday, October 23, 2014 04:44 PM
To: Ibrahim El-sanosi (PGR)
Subject: RE: Latency in asynchronous mode

I'm a ZK newbie, but I have a hypothesis to test. At least on my synchronous standalone ZK server, I've been able to correlate ZK latency with disk write response time. I have a Perl script that harvests the ZK stats at regular intervals. At the beginning of each interval it does srst, so I truly get the max latency for that interval. Whenever I find an interval with a long max latency, I can invariably find an egregiously long disk IO operation in a block IO trace.

What does ZK latency measure? Does it measure the delta between the arrival of the request and the completion of the request?
The block IO traces are showing a single thread in the ZK process doing all of the writes synchronously. This thread spends about 90% of its time waiting for disk IO. If the ZK latency timer starts ticking as soon as the request arrives from the client, then that incoming request has to wait for all outstanding requests ahead of it to complete, and that time is accumulated in latency. Effectively, latency includes disk queueing time as well as service time.

Does that make any sense? (Maybe this is exactly what Alexander is saying, but I had composed this note by the time I saw his post and had to chime in.)

Henry May
IBM InfoSphere Streams Performance
[hidden email]
720-342-8873 Tie: 963-8873

From: Alexander Shraer <[hidden email]>
To: [hidden email]
Date: 10/23/2014 10:12 AM
Subject: RE: Latency in asynchronous mode

I still stay with my previous explanation :) In async mode each client invokes many ops concurrently, resulting in a longer queue at the leader.

On Oct 23, 2014 3:32 PM, "Ibrahim El-sanosi (PGR)" <[hidden email]> wrote:

> Thank you Alexander for the reply,
>
> In fact, I use more than one client (one, two, three, four, ..., ten)
> in both modes (synchronous and asynchronous). I found that the latency in
> asynchronous mode is much higher than the latency in synchronous mode. I am
> really wondering why I am getting such a big difference.
>
> In synchronous mode, the latency varies between min/avg/max=5/20/50 and
> min/avg/max=11/50/120, but it never reaches min/avg/max: 1/371/627 as in
> asynchronous mode.
>
> Any thought?
>
> Thank you
>
> -----Original Message-----
> From: Alexander Shraer [mailto:[hidden email]]
> Sent: Thursday, October 23, 2014 02:14 PM
> To: [hidden email]
> Subject: Re: Latency in asynchronous mode
>
> Maybe due to queueing at the leader in asynchronous mode - if in your
> experiment you have one client in sync mode, the leader has just one op in
> the queue at a time.
>
> On Oct 23, 2014 1:57 PM, "Ibrahim" <[hidden email]> wrote:
>
> > Hi folks,
> >
> > I am testing ZooKeeper latency in asynchronous mode. I am sending update
> > (write) requests to a ZooKeeper cluster that consists of 5 physical
> > ZooKeeper servers.
> >
> > When I run the stat command, I get high latency like:
> > Latency min/avg/max: 7/339/392
> > Latency min/avg/max: 1/371/627
> > Latency min/avg/max: 1/371/627
> > Latency min/avg/max: 1/364/674
> > I guess such high latency corresponds to fsync (batched requests), but I
> > would be grateful if someone could help me and explain this behaviour.
> >
> > However, testing ZooKeeper in synchronous mode gives me
> > reasonable results like:
> > Latency min/avg/max: 6/24/55
> > Latency min/avg/max: 7/22/61
> > Latency min/avg/max: 7/30/65
> >
> > Note that the latency is measured in milliseconds.
> >
> > I look forward to hearing from you.
> >
> > Ibrahim
> >
> > --
> > View this message in context:
> > http://zookeeper-user.578899.n2.nabble.com/Latency-in-asynchronous-mode-tp7580446.html
> > Sent from the zookeeper-user mailing list archive at Nabble.com.
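
A note on the queueing explanation given in this thread: the A/B/C sequence Henry describes can be reproduced with a very small model. The sketch below is not ZooKeeper code. It assumes a single writer thread that fsyncs requests strictly in arrival order, a fixed fsync cost, and a latency timer that starts when the request arrives, which is exactly the hypothesis being discussed and not a confirmed description of the server. The class name, method name, and all timing values are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not ZooKeeper code: one thread fsyncs requests in
// arrival order, and each request's latency timer starts at arrival.
class FsyncQueueSketch {

    // arrivalsMs must be sorted ascending; fsyncMs is an assumed fixed cost per fsync.
    static List<Long> latencies(long[] arrivalsMs, long fsyncMs) {
        List<Long> result = new ArrayList<>();
        long diskFreeAt = 0; // time at which the single writer thread becomes idle
        for (long arrival : arrivalsMs) {
            long start = Math.max(arrival, diskFreeAt); // wait behind earlier requests
            long done = start + fsyncMs;
            result.add(done - arrival); // queueing time plus service time
            diskFreeAt = done;
        }
        return result;
    }

    public static void main(String[] args) {
        // Async-style load: A, B, C arrive 2 ms apart; each fsync is assumed to take 10 ms.
        System.out.println(latencies(new long[] {0, 2, 4}, 10)); // [10, 18, 26]
        // Sync-style load: the next request arrives only after the previous one completed.
        System.out.println(latencies(new long[] {0, 10, 20}, 10)); // [10, 10, 10]
    }
}
```

Under these assumptions, A sees only its own fsync, B also waits for the rest of A's fsync, and C waits for the rest of A's plus all of B's. With requests spaced so that the queue never builds up, every request sees only the service time.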
--
View this message in context: http://zookeeper-user.578899.n2.nabble.com/Latency-in-asynchronous-mode-tp7580446p7580457.html
Sent from the zookeeper-user mailing list archive at Nabble.com.
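
A note on the client-side measurement quoted in this thread: the snippet reads the clock immediately after an inBackground() call returns, so the values of roughly 0.004 ms most likely reflect only the time needed to hand the request off, not the time until the server answered. The sketch below illustrates that difference using only the JDK. It involves no ZooKeeper or Curator calls; the "server" is a hypothetical scheduled task, and the 50 ms delay is an assumed value. With Curator, the analogous change would presumably be to record the end time inside a BackgroundCallback passed to inBackground(), but that is not shown or tested here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an asynchronous write; no ZooKeeper or Curator involved.
class AsyncTimingSketch {

    // Returns {ms until the submit call returned, ms until the completion callback ran}.
    static double[] measure(long serverDelayMs) {
        ScheduledExecutorService server = Executors.newSingleThreadScheduledExecutor();
        try {
            CompletableFuture<Double> completedAfterMs = new CompletableFuture<>();
            long submit = System.nanoTime();
            // "Submit" the request: the call returns at once, the work finishes later.
            server.schedule(() -> {
                // Completion callback: this is where the real latency becomes known.
                completedAfterMs.complete((System.nanoTime() - submit) / 1e6);
            }, serverDelayMs, TimeUnit.MILLISECONDS);
            // Reading the clock here only times the hand-off, as in the quoted snippet.
            double submitReturnedAfterMs = (System.nanoTime() - submit) / 1e6;
            return new double[] {submitReturnedAfterMs, completedAfterMs.join()};
        } finally {
            server.shutdown();
        }
    }

    public static void main(String[] args) {
        double[] r = measure(50); // 50 ms is an assumed server-side delay
        System.out.printf("submit returned after %.3f ms, completed after %.3f ms%n", r[0], r[1]);
    }
}
```

The first number stays small no matter how slow the simulated server is, while the second tracks the delay. Only the second is comparable to the per-request latency that the stat command reports, plus network time.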