cloudstack-issues mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-10136) Fix thread growth/leak issue
Date Fri, 10 Nov 2017 16:35:01 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247746#comment-16247746
] 

ASF subversion and git services commented on CLOUDSTACK-10136:
--------------------------------------------------------------

Commit 3ee8d83621c23f976413fdce6d9245197497d504 in cloudstack's branch refs/heads/master from
[~rohit.yadav@shapeblue.com]
[ https://gitbox.apache.org/repos/asf?p=cloudstack.git;h=3ee8d83 ]

CLOUDSTACK-10136: Fix RemoteHostEndPoint thread growth

This fixes the following:
- Unchecked thread growth in RemoteHostEndPoint
- Potential NPE while finding EP for a storage/scope

Unbounded thread growth can be reproduced, with the following findings:
- Every unreachable template produces 6 new threads (in a single
ScheduledExecutorService instance), spaced 10 seconds apart.
- Every reachable template URL that does not serve the template produces
1 new thread (and one ScheduledExecutorService instance); it errors out
quickly without causing more thread growth.
- Every valid URL produces up to 10 threads, as the same ep (endpoint
instance) is reused to query upload/download (async callback)
progress.

Every RemoteHostEndPoint instance creates its own
ScheduledExecutorService instance, which is why the jstack dump shows
several threads that share the prefix RemoteHostEndPoint-{1..10}
(the pool size is defined as 10, so the suffixes run from 1 to 10).
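
The growth pattern described above can be illustrated with a minimal,
standalone sketch. It is not the CloudStack code: LeakyEndPoint and
idleWorkersAfter are hypothetical names, and the only assumption carried
over from the commit message is a per-instance scheduled pool of size 10.
A ScheduledThreadPoolExecutor starts a new core thread for each scheduled
task until the core size is reached, and core threads do not time out by
default, so the idle threads remain after the work is done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PerInstancePoolSketch {

    /** Hypothetical stand-in for an endpoint that owns its own scheduled pool. */
    static final class LeakyEndPoint {
        // Same kind of pool that Executors.newScheduledThreadPool(10) creates.
        final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(10);

        void schedule(Runnable task) {
            executor.schedule(task, 10, TimeUnit.MILLISECONDS);
        }
    }

    /**
     * Creates `endpoints` instances, schedules `tasksEach` short tasks on each,
     * waits for the tasks to finish, and returns how many worker threads are
     * still alive (idle) afterwards.
     */
    static int idleWorkersAfter(int endpoints, int tasksEach) {
        List<LeakyEndPoint> eps = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(endpoints * tasksEach);
        for (int i = 0; i < endpoints; i++) {
            LeakyEndPoint ep = new LeakyEndPoint();
            eps.add(ep);
            for (int j = 0; j < tasksEach; j++) {
                ep.schedule(done::countDown);
            }
        }
        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int alive = 0;
        for (LeakyEndPoint ep : eps) {
            alive += ep.executor.getPoolSize(); // idle core threads are still counted
            ep.executor.shutdownNow();          // cleanup so this demo can exit
        }
        return alive;
    }

    public static void main(String[] args) {
        // 3 endpoints x 10 tasks: each pool has grown to its core size of 10.
        System.out.println(idleWorkersAfter(3, 10)); // prints 30
    }
}
```

In this sketch the thread count scales with the number of endpoint
instances rather than with the amount of concurrent work, which matches
the shape of the growth seen in the jstack dump.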

This fixes the discovered thread leak, with the following notes:
- Instead of a per-instance ScheduledExecutorService, a cached thread
pool is now used. It is declared `static` so that it is reused by all
future RemoteHostEndPoint instances.
- It was not clear why we would want to wait once Answers have been
returned from the remote EP, so a scheduled/delayed Runnable is not
needed for processing answers. ScheduledExecutorService was therefore
not really required and has been replaced with ExecutorService.
- Another benefit of a cached pool is that it terminates threads that
have been idle for 60 seconds, and reuses idle threads for future
Runnable submissions.
- Caveat: the executor service is still unbounded. However, this method
is used for short jobs that check upload/download progress, which fits
a cached pool.
- Refactored CmdRunner to not use/reference objects from the parent class.
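
The approach in the notes above can be sketched roughly as follows. This
is an illustration under stated assumptions, not the committed change:
EndPoint, process, and peakThreads are hypothetical names. The pool is
built with the same parameters that Executors.newCachedThreadPool() uses
(no core threads, 60-second idle timeout, SynchronousQueue), written out
explicitly so the pool size can be inspected. The daemon thread factory
is a demo convenience only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class SharedCachedPoolSketch {

    /** Hypothetical stand-in for an endpoint that shares one static pool. */
    static final class EndPoint {
        // One pool for all instances; idle workers retire after 60 seconds.
        // Daemon threads are an assumption made so this demo exits promptly.
        static final ThreadPoolExecutor EXECUTOR = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>(),
                r -> {
                    Thread t = new Thread(r, "EndPointSketch-worker");
                    t.setDaemon(true);
                    return t;
                });

        void process(Runnable answerHandler) {
            EXECUTOR.execute(answerHandler); // runs as soon as a thread is free, no delay
        }
    }

    /**
     * Lets `endpoints` distinct instances each run one task at the same time
     * and returns the largest number of threads the shared pool ever held.
     */
    static int peakThreads(int endpoints) {
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(endpoints);
        for (int i = 0; i < endpoints; i++) {
            new EndPoint().process(() -> {
                try {
                    release.await(); // hold the thread so all tasks overlap
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.countDown();
            });
        }
        release.countDown();
        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return EndPoint.EXECUTOR.getLargestPoolSize();
    }

    public static void main(String[] args) {
        // 4 endpoints with one overlapping task each: one thread per task.
        System.out.println(peakThreads(4)); // prints 4
    }
}
```

Here the thread count follows the number of tasks running at the same
time, not the number of endpoint instances, and idle threads are
eventually reclaimed. The sketch also makes the stated caveat visible:
nothing caps the pool if many long tasks overlap.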

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>


> Fix thread growth/leak issue
> ----------------------------
>
>                 Key: CLOUDSTACK-10136
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-10136
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>    Affects Versions: 4.5.2, 4.6.2, 4.7.1, 4.10.0.0, 4.9.2.0, 4.8.1.1, 4.9.3.0
>            Reporter: Rohit Yadav
>            Assignee: Rohit Yadav
>             Fix For: 4.11.0.0
>
>
> On a long-running management server with a large number of templates etc., a large number of
waiting threads are seen whose names start with the 'RemoteHostEndPoint-' prefix. These async
threads are mostly responsible for checking template/volume upload/download progress/state.
They kick in every time a template is being checked, downloaded, set up, etc.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
