cloudstack-issues mailing list archives

From "Koushik Das (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CLOUDSTACK-4944) Command sequence logic in agent code may lead to errors in clustered MS setup
Date Thu, 26 Dec 2013 09:59:52 GMT

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koushik Das updated CLOUDSTACK-4944:
------------------------------------

    Fix Version/s:     (was: 4.3.0)
                   Future

> Command sequence logic in agent code may lead to errors in clustered MS setup
> -----------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-4944
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4944
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Management Server
>    Affects Versions: pre-4.0.0, 4.1.0, 4.2.0
>            Reporter: Koushik Das
>            Assignee: Koushik Das
>             Fix For: Future
>
>
> I was looking at the command sequencing logic in the agent code (AgentAttache.java).
> Each agent maintains a sequence that is initialised as follows:
>    private static final Random s_rand = new Random(System.currentTimeMillis());
>    _nextSequence = ((long) s_rand.nextInt(Short.MAX_VALUE)) << 48;
> The sequence is incremented by 1 for every command that the agent processes. Commands
> that must be executed in sequence are queued in order of this sequence number:
>    protected synchronized void addRequest(Request req) {
>        int index = findRequest(req);
>        assert (index < 0) : "How can we get index again? " + index + ":" + req.toString();
>        _requests.add(-index - 1, req);
>    }
> The above works fine in a single-MS setup. In a clustered MS setup things change
> slightly.
> A command can originate at any MS and, based on ownership of the agent, is forwarded
> to the owning MS, which then handles it. Command sequences are local to the individual
> agents in each MS, so the agent on the originating MS tags the request with its own
> sequence. The request is forwarded to the owning MS and, if the 'executeInSequence' flag
> is set, is added to the list according to its sequence number. Here lies the problem:
> commands are inserted not in the order in which they arrive but by sequence number, and
> the sequence of a forwarded command differs from the local sequence. If the starting
> sequence of forwarded commands is much lower than that of locally generated commands,
> local commands can be starved by a steady arrival of forwarded commands. The reverse can
> also happen. In addition, if the starting sequences for an agent in the local and peer MS
> are not spread far apart, the ranges may overlap and a new request will override the old
> one.
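
The ordering effect described above can be reproduced with a small stand-alone model. This is
only a sketch: SequenceOrderedList is a hypothetical class, not CloudStack code, it stores bare
sequence numbers instead of Request objects, and it rejects duplicate sequences with an
exception, which is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of a sequence-ordered request list; not CloudStack code. */
final class SequenceOrderedList {
    private final List<Long> sequences = new ArrayList<>();

    /** Inserts by sequence number, using the same add(-index - 1, ...) idiom. */
    synchronized void add(long sequence) {
        Long key = sequence;
        int index = Collections.binarySearch(sequences, key);
        if (index >= 0) {
            // Assumption for this sketch: a duplicate sequence is rejected.
            throw new IllegalStateException("Duplicate sequence: " + sequence);
        }
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        sequences.add(-index - 1, key);
    }

    /** Returns a copy of the current queue order. */
    synchronized List<Long> snapshot() {
        return new ArrayList<>(sequences);
    }

    public static void main(String[] args) {
        SequenceOrderedList list = new SequenceOrderedList();
        list.add(5000L); // generated locally on the owning MS
        list.add(5001L); // generated locally on the owning MS
        list.add(100L);  // forwarded from a peer MS, arrives last
        System.out.println(list.snapshot()); // prints [100, 5000, 5001]
    }
}
```

The forwarded request arrives last but ends up at the head of the list, ahead of both local
requests, because position depends only on the sequence number.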
> I am not sure whether anyone has encountered issues due to this. The correct approach
> appears to be a FIFO queue rather than the sequence-based insert shown above.
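
A minimal sketch of the FIFO alternative, assuming requests only need to be appended and taken
in arrival order. FifoRequestQueue and its methods are hypothetical names used for
illustration; they are not part of AgentAttache, and the sketch again stores bare sequence
numbers rather than Request objects.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical FIFO request queue; names are illustrative, not CloudStack API. */
final class FifoRequestQueue {
    private final Queue<Long> pending = new ArrayDeque<>();

    /** Appends at the tail, regardless of which MS assigned the sequence. */
    synchronized void add(long sequence) {
        pending.add(sequence);
    }

    /** Returns the oldest queued sequence, or null if the queue is empty. */
    synchronized Long next() {
        return pending.poll();
    }

    public static void main(String[] args) {
        FifoRequestQueue queue = new FifoRequestQueue();
        queue.add(5000L); // local
        queue.add(5001L); // local
        queue.add(100L);  // forwarded, arrives last
        System.out.println(queue.next()); // prints 5000
        System.out.println(queue.next()); // prints 5001
        System.out.println(queue.next()); // prints 100
    }
}
```

With arrival order as the only criterion, the relative values of local and forwarded sequences
no longer affect which request runs next.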



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)