cloudstack-dev mailing list archives

From "Koushik Das" <koushik....@citrix.com>
Subject Review Request 15080: CLOUDSTACK-4855: Throttle based on the # of outstanding requests to the directly managed HV host (direct agents)
Date Wed, 30 Oct 2013 10:51:22 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15080/
-----------------------------------------------------------

Review request for cloudstack, Alex Huang, Chiradeep Vittal, and Darren Shepherd.


Bugs: CLOUDSTACK-4855
    https://issues.apache.org/jira/browse/CLOUDSTACK-4855


Repository: cloudstack-git


Description
-------

CloudStack sends requests to directly managed HV hosts (direct agents) using the direct agent
thread pool. The size of the pool is set by the global config direct.agent.pool.size, which
defaults to 500.

Currently there is no restriction on the number of threads a direct agent can use from this
shared thread pool to send requests to the host. This is fine as long as the host responds
to requests in a reasonable amount of time. But if there is a considerable delay in getting
a response, the thread remains blocked for that long. As more commands are sent to the slow
host, more threads get blocked. This can eventually lead to a situation where requests to
healthy hosts cannot be processed because there are not enough free threads.

The goal here is to localize the impact of a few bad hosts, so that the entire
management server is not affected.

One way to do this is to throttle based on the number of outstanding requests on a per-host
basis. The limit on outstanding requests to a host is a percentage of the direct agent pool
size, configurable via direct.agent.thread.cap. This ensures that an impacted host is bound
by an upper cap on the number of threads it can use to process requests, and cannot consume
the entire pool.
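
A minimal sketch of the per-host cap described above, not the code in the patch. The class
name PerHostThrottle and its methods are hypothetical, and it assumes direct.agent.thread.cap
is expressed as a fraction between 0 and 1 of direct.agent.pool.size; the description only
says it is a percentage of the pool size.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: one instance per direct agent (host).
final class PerHostThrottle {
    private final int maxOutstanding;
    private final AtomicInteger outstanding = new AtomicInteger(0);

    // poolSize corresponds to direct.agent.pool.size; capFraction is an
    // ASSUMED representation of direct.agent.thread.cap (0 < fraction <= 1).
    PerHostThrottle(int poolSize, double capFraction) {
        // Keep at least one slot so a host can always make progress.
        this.maxOutstanding = Math.max(1, (int) (poolSize * capFraction));
    }

    int getMaxOutstanding() {
        return maxOutstanding;
    }

    // Claims a slot if the host is below its cap; returns false otherwise.
    boolean tryAcquire() {
        while (true) {
            int current = outstanding.get();
            if (current >= maxOutstanding) {
                return false;
            }
            if (outstanding.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // Frees a slot once the request to the host has completed.
    void release() {
        outstanding.decrementAndGet();
    }
}
```

With a cap like this, a slow host can block at most its own share of pool threads, and the
remaining threads stay available for healthy hosts.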


Note: The reason for checking the outstanding request count in the Task.run() method is to
take into account cron jobs that get scheduled at agent startup.
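
To illustrate the note above, here is a rough sketch of a task that checks the outstanding
request count inside run() rather than at submission time, so that work scheduled earlier
(for example at agent startup) is counted as well. ThrottledTask and its fields are
hypothetical names, not the Task class in DirectAgentAttache, and what the real patch does
with a skipped command is not stated in this description.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical task wrapper; the counter is shared by all tasks of one host.
final class ThrottledTask implements Runnable {
    private final AtomicInteger outstanding;
    private final int maxOutstanding;
    private final Runnable work;
    private volatile boolean skipped = false;

    ThrottledTask(AtomicInteger outstanding, int maxOutstanding, Runnable work) {
        this.outstanding = outstanding;
        this.maxOutstanding = maxOutstanding;
        this.work = work;
    }

    @Override
    public void run() {
        // The check happens on the executor thread, at execution time.
        if (outstanding.incrementAndGet() > maxOutstanding) {
            // Over the per-host cap: undo the increment and bail out
            // without sending anything to the host.
            outstanding.decrementAndGet();
            skipped = true;
            return;
        }
        try {
            work.run();
        } finally {
            // Always free the slot, even if the request throws.
            outstanding.decrementAndGet();
        }
    }

    boolean wasSkipped() {
        return skipped;
    }
}
```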


Diffs
-----

  engine/orchestration/src/com/cloud/agent/manager/AgentAttache.java ff35255 
  engine/orchestration/src/com/cloud/agent/manager/AgentManagerImpl.java 3e684cc 
  engine/orchestration/src/com/cloud/agent/manager/DirectAgentAttache.java 7d3f765 

Diff: https://reviews.apache.org/r/15080/diff/


Testing
-------

Verified by setting the per-agent upper cap to 1 and checking that requests are still
scheduled but the executor thread simply bails out.


Thanks,

Koushik Das

