chukwa-dev mailing list archives

From "Ari Rabkin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CHUKWA-391) tuning timeouts and post size
Date Sat, 19 Sep 2009 01:28:16 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757594#action_12757594 ]

Ari Rabkin commented on CHUKWA-391:
-----------------------------------

In particular:

If agents write too fast, a collector might not be able to respond to each post before
the agents time out. When they time out, they'll retransmit, wastefully.  Ideally, backpressure
from the collector would throttle the agent send rate.

Let n be the number of agents per collector and w be the collector write rate.  Imagine that
all n agents post data at the same time.  For backpressure to work, the maximum post size
needs to be small enough that the collector can respond to each post before any of them times
out. So max post size should be less than w * timeout / n to be useful here.
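
As a rough illustration of the bound above, here is a minimal Python sketch. It is not Chukwa
code: the function name is hypothetical, and the 10-second timeout and 100 agents are assumed
values chosen only for the example. The 20 MB/sec write rate is the typical figure quoted in
this comment.

```python
MB = 1024 * 1024


def max_useful_post_size(write_rate, timeout, agents):
    """Upper bound w * timeout / n on the post size, in bytes.

    write_rate: collector write rate w, in bytes/sec
    timeout:    agent HTTP post timeout, in seconds
    agents:     number of agents n posting to one collector
    """
    if agents <= 0:
        raise ValueError("agents must be positive")
    return write_rate * timeout / agents


# Assumed values: 10 s timeout, 100 agents per collector.
bound = max_useful_post_size(20 * MB, 10, 100)
print(f"max post size should stay below {bound / MB:.1f} MB")
```

Under these assumed numbers the bound comes out at exactly 2 MB, so a 2 MB post would leave no
headroom once fan-in reaches 100 agents.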

Another way to think of that is that we don't do admission control for the write queue at the
collector, but we do at the agent.  So the agent buffer should fill up and block before the
collector's queue does.

Currently the default max post size is 2 MB, and a typical collector writes at 20 MB/sec, so
draining one full post from each of n agents takes n/10 seconds. So backpressure
doesn't really work at high fan-in.  I think max post size should be much smaller; maybe only
a few hundred KB.  This may require some modification to the queue classes to make sure jumbo
chunks work correctly.
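
To make the numbers concrete, the same inequality can be rearranged as n < w * timeout / post
size. The sketch below compares the fan-in limit for the current 2 MB default with a smaller
post size. The helper name is hypothetical, and both the 10-second timeout and the 300 KB post
size are assumptions picked for illustration, not Chukwa defaults.

```python
KB = 1024
MB = 1024 * KB


def fan_in_limit(write_rate, timeout, post_size):
    """Fan-in n must stay below w * timeout / post_size for backpressure to work."""
    if post_size <= 0:
        raise ValueError("post_size must be positive")
    return write_rate * timeout / post_size


WRITE_RATE = 20 * MB   # typical collector write rate quoted above
ASSUMED_TIMEOUT = 10   # seconds; assumed, not taken from Chukwa's configuration

for label, size in (("2 MB", 2 * MB), ("300 KB", 300 * KB)):
    limit = fan_in_limit(WRITE_RATE, ASSUMED_TIMEOUT, size)
    print(f"post size {label}: fan-in must stay below {limit:.0f} agents")
```

With the assumed timeout, shrinking posts from 2 MB to a few hundred KB raises the tolerable
fan-in from about 100 agents to several hundred.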

Thoughts and comments?

> tuning timeouts and post size
> -----------------------------
>
>                 Key: CHUKWA-391
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-391
>             Project: Hadoop Chukwa
>          Issue Type: Improvement
>          Components: data collection
>            Reporter: Ari Rabkin
>
> The maximum post size, HTTP post timeout, and collector fanin are all related.  We should
> at least document this, and ideally autotune.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

