zookeeper-user mailing list archives

From "Ibrahim El-sanosi (PGR)" <i.s.el-san...@newcastle.ac.uk>
Subject RE: Latency in asynchronous mode
Date Thu, 30 Oct 2014 20:59:03 GMT
Hi Michael, nice to see you again.


Looking at the following:
Latency min/avg/max: 7/339/392
Latency min/avg/max: 6/24/55

Let's focus on the max,

392 vs 55.

What does the 392 represent?

I still haven't got the answer. But I think that, because I send a large number of write
requests one after another in async mode, ZooKeeper groups every 1000 requests and fsyncs
them to disk once before completing them. The latency of each request is therefore affected
by the fsync of its group commit, which is why we see such a large latency (392).
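
One way to picture the group-commit hypothesis above is a toy model in which a batch is flushed with a single fsync once its last request has arrived, so every request in the batch waits for that one flush. This is only a sketch: the function name, the fixed arrival gap, and the fixed fsync cost are assumptions made for illustration, and the real flush logic in ZooKeeper may differ.

```python
def group_commit_latencies(n_requests, arrival_gap_ms, batch_size, fsync_ms):
    """Toy model: per-request latency (ms) when one fsync covers a whole batch.

    Assumptions (not taken from ZooKeeper): requests arrive every
    `arrival_gap_ms`, a batch is flushed only once it is full (or the input
    ends), and each flush costs a fixed `fsync_ms`.
    """
    latencies = []
    disk_free_at = 0.0
    for start in range(0, n_requests, batch_size):
        end = min(start + batch_size, n_requests)
        arrivals = [i * arrival_gap_ms for i in range(start, end)]
        # The flush starts when the last request of the batch has arrived
        # and the previous flush has finished.
        flush_start = max(arrivals[-1], disk_free_at)
        disk_free_at = flush_start + fsync_ms
        # Every request in the batch completes when the flush completes.
        latencies.extend(disk_free_at - a for a in arrivals)
    return latencies


if __name__ == "__main__":
    lat = group_commit_latencies(n_requests=3000, arrival_gap_ms=0.1,
                                 batch_size=1000, fsync_ms=20.0)
    print("min/avg/max: %.1f/%.1f/%.1f" % (min(lat), sum(lat) / len(lat), max(lat)))
```

In this model the first request of each batch always sees the largest latency, because it waits for the rest of the batch to arrive and then for the flush.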

I hope this makes sense to you!!!!


Ibrahim




From: Michael Segel [mailto:msegel@segel.com]
Sent: Thursday, October 30, 2014 09:57 AM
To: user@zookeeper.apache.org
Cc: zookeeper-user@hadoop.apache.org
Subject: Re: Latency in asynchronous mode

Sorry for the delay, work and travel...

The numbers you posted:

So, when I run the stat command I get high latency like:
Latency min/avg/max: 7/339/392
Latency min/avg/max: 1/371/627
Latency min/avg/max: 1/371/627
Latency min/avg/max: 1/364/674
I guess such high latency corresponds to fsync (batched requests), but I
would like someone to help me and explain this behaviour.

However, testing ZooKeeper in synchronous mode gives me reasonable
results like:
Latency min/avg/max: 6/24/55
Latency min/avg/max: 7/22/61
Latency min/avg/max: 7/30/65



Looking at the following:
Latency min/avg/max: 7/339/392
Latency min/avg/max: 6/24/55

Let's focus on the max,

392 vs 55.

What does the 392 represent?

The interesting thing is the min values of the async latency numbers. 1 ms?
But that's a different issue.

So let's start there.


-Mike

On Oct 25, 2014, at 6:44 PM, Ibrahim <i.s.el-sanosi@newcastle.ac.uk> wrote:


Michael,

Ok,
"In part, the question is what are you actually seeing when you look at the numbers."
Which numbers do you mean?

Thank you

From: Michael Segel [via zookeeper-user] [mailto:ml-node+s578899n7580471h25@n2.nabble.com]
Sent: Sunday, October 26, 2014 12:40 AM
To: Ibrahim El-sanosi (PGR)
Subject: Re: Latency in asynchronous mode

Hi,

I went back to the first email in this thread.

Which is why I asked if you understood the difference between synchronous and asynchronous
communication.
You may understand it, but at the same time not understand it.

In part, the question is what are you actually seeing when you look at the numbers.


On Oct 25, 2014, at 8:16 PM, Ibrahim El-sanosi (PGR) <[hidden email]> wrote:


Hi Michael,

Thank you for your response.

No, I do understand the difference between synchronous and asynchronous communication. The
question you are looking at is not my primary question; could you please check the main
question that I posted? The question you answered is my reply to one of the users. It would
also help to follow other people's replies to my question to become more familiar with it.

Thank you

Ibrahim

-----Original Message-----
From: Michael Segel [mailto:[hidden email]]
Sent: Saturday, October 25, 2014 08:06 PM
To: [hidden email]
Subject: Re: Latency in asynchronous mode

Hi,

I am afraid I don’t understand your question.

Do you not understand the difference between synchronous and asynchronous communication?

Look: Synchronous… I’m not going to do anything until I hear from you or I time out and
resend my request.
Think of having a phone conversation. You say something and then wait for a response.

Asynchronous… I’m going to send a bit of information and then go on and do something else
and not wait for a response.
Think of writing a post-it note and leaving it on the fridge for your wife to find. Or leaving
a voice mail message that you’re heading out to the pub for a quick drink and you will be
late to dinner. ;-)

Ok… I realize I’m stating the obvious… but that really should explain what you are seeing.
 The message is sent and then ZK goes on doing something else… and the response is somewhere
in the queue to be processed at a later time.  What’s wrong with that?

Your own results show that the more activity ZK is doing, the longer the delay in receiving
the ACK from the response.
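
A small queueing sketch can make the sync versus async difference concrete. It assumes a single server that handles one request at a time with a fixed service time, and a client that keeps a fixed number of requests in flight (1 for synchronous, many for asynchronous). The function name and all parameters are hypothetical and are not taken from the benchmark discussed in this thread.

```python
from collections import deque


def closed_loop_latencies(n_requests, outstanding, service_ms):
    """Latency of each request when the client keeps `outstanding` in flight.

    Assumption: one FIFO server with a fixed `service_ms` per request; the
    client sends a new request as soon as a reply frees a slot.
    """
    queue = deque()   # send times of requests waiting at the server
    latencies = []
    now = 0.0
    sent = 0
    # The client fills its window of in-flight requests at time 0.
    while sent < min(outstanding, n_requests):
        queue.append(now)
        sent += 1
    while queue:
        send_time = queue.popleft()
        now += service_ms              # the server finishes one request
        latencies.append(now - send_time)
        if sent < n_requests:          # the reply frees a slot for a new send
            queue.append(now)
            sent += 1
    return latencies


if __name__ == "__main__":
    for window in (1, 100):
        lat = closed_loop_latencies(n_requests=1000, outstanding=window, service_ms=2.0)
        print(window, min(lat), sum(lat) / len(lat), max(lat))
```

With one request in flight, every request sees only the service time. With many in flight, each request also waits behind everything queued ahead of it, so the measured latency approaches `outstanding * service_ms` even though the server is no slower.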

-Mike

On Oct 23, 2014, at 7:21 PM, Ibrahim El-sanosi (PGR) <[hidden email]> wrote:


Hi Rakesh,

First of all, the ZooKeeper ensemble consists of five ZooKeeper servers. I also have another
10 client machines used to send write requests to ZooKeeper. The benchmark code creates 5
threads (equal to the number of ZooKeeper servers), and each thread is associated with one
ZooKeeper server. So, in this case, each ZooKeeper server receives a set of write requests.
The benchmark code runs for 30 seconds.

Async tests:

* Number of clients
In fact, I have different tests, each with a different number of clients. For example, the
following shows the latency for different numbers of clients:
Five clients: Latency min/avg/max: 235/366/515
Ten clients:  Latency min/avg/max: 252/368/505

* Number of threads
As explained above, each client creates 5 threads and each thread connects to one ZooKeeper
server. For instance, in a test using 5 client machines, each ZooKeeper server receives five
threads.

* data size storing in each znode
The data stored in each znode is 100 bytes.

Also, it would be good to monitor :

1) JVM stats(one way is through JMX) like heap, gc activities. This is to see if latency spike
corresponds to gc activity or not.

If by JVM stats you mean the four-letter-word stat command, then the latency results shown
above were generated using this command. If you mean something else, then I will have to
read about it and tell you later on.

2) Since you are doubting fsync, I think $ iostat would be helpful to see disk statistics.
For example, $ iostat -d -x 2 10 and collects the disk latency.

Yes, the batch size that I use in the SyncRequestProcessor class is 1000 requests. I think
this is a preferable size. Also, I will try to use iostat.

3) CPU usage through top or sar unix commands. I didn't use sar , but I could see it gives
more details like percent of CPU running idle with a process waiting for block I/O etc.

Yes, I will use the top command to gather the resource utilization. However, I don’t
think top or sar will answer my question, because I think there is a difference between
asynchronous and synchronous mode in how the latency is measured.

Thank you for your attention

I look forward to hearing from you


Ibrahim

-----Original Message-----
From: Rakesh Radhakrishnan [mailto:[hidden email]]
Sent: Thursday, October 23, 2014 03:58 PM
To: [hidden email]
Subject: Re: Latency in asynchronous mode

Hi Ibrahim,

In async tests, could you give the details like:

* number of clients
* number of threads
* data size storing in each znode

Also, it would be good to monitor :

1) JVM stats(one way is through JMX) like heap, gc activities. This is to see if latency spike
corresponds to gc activity or not.

2) Since you are doubting fsync, I think $ iostat would be helpful to see disk statistics.
For example, $ iostat -d -x 2 10 and collects the disk latency.

3) CPU usage through top or sar unix commands. I didn't use sar , but I could see it gives
more details like percent of CPU running idle with a process waiting for block I/O etc.


-Rakesh


On Thu, Oct 23, 2014 at 6:44 PM, Alexander Shraer <[hidden email]> wrote:


Maybe due to queueing at the leader in asynchronous mode - if in your
experiment you have one client in sync mode, the leader has just one
op in the queue at a time.

On Oct 23, 2014 1:57 PM, "Ibrahim" <[hidden email]> wrote:


Hi folks,

I am testing ZooKeeper latency in asynchronous mode. I am sending
update (write) requests to a ZooKeeper cluster that consists of 5
physical ZooKeeper servers.

So, when I run the stat command I get high latency like:
Latency min/avg/max: 7/339/392
Latency min/avg/max: 1/371/627
Latency min/avg/max: 1/371/627
Latency min/avg/max: 1/364/674
I guess such high latency corresponds to fsync (batched requests), but I
would like someone to help me and explain this behaviour.

However, testing ZooKeeper in synchronous mode gives me reasonable
results like:
Latency min/avg/max: 6/24/55
Latency min/avg/max: 7/22/61
Latency min/avg/max: 7/30/65

Note that the latency is measured in milliseconds.
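
For collecting these numbers repeatedly, the latency line can be pulled out of the stat output programmatically. The sketch below assumes the integer `Latency min/avg/max: a/b/c` format quoted in this thread. The helper names are hypothetical, and the host and the port default of 2181 are assumptions about the server's client port.

```python
import re
import socket

# Matches the line format quoted in this thread, e.g. "Latency min/avg/max: 7/339/392".
LATENCY_RE = re.compile(r"Latency min/avg/max:\s*(\d+)/(\d+)/(\d+)")


def parse_latency(stat_output):
    """Extract min/avg/max latency (ms) from the text returned by `stat`."""
    match = LATENCY_RE.search(stat_output)
    if match is None:
        raise ValueError("no latency line found in stat output")
    lo, avg, hi = (int(group) for group in match.groups())
    return {"min": lo, "avg": avg, "max": hi}


def fetch_stat(host, port=2181, timeout=5.0):
    """Send the four-letter word `stat` and return the reply as text.

    Assumption: the server accepts `stat` on its client port and closes
    the connection after replying.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stat")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(parse_latency("Latency min/avg/max: 7/339/392"))
```

Calling `parse_latency(fetch_stat("zk-host"))` at intervals during a run (with `zk-host` standing in for a real server name) would give a time series rather than a single snapshot.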

I look forward to hearing from you.

Ibrahim







--
View this message in context: http://zookeeper-user.578899.n2.nabble.com/Latency-in-asynchronous-mode-tp7580446.html

Sent from the zookeeper-user mailing list archive at Nabble.com.








--
View this message in context: http://zookeeper-user.578899.n2.nabble.com/Latency-in-asynchronous-mode-tp7580446p7580472.html
Sent from the zookeeper-user mailing list archive at Nabble.com.
