cloudstack-dev mailing list archives

From Mike Tutkowski <>
Subject Re: [QUESTION] Attached/Detached Volume State
Date Tue, 27 Jan 2015 00:14:57 GMT
This actually appears to be a more general-purpose issue than I originally
thought.

It seems we often perform checks on the state of a given object before we
submit a job to the job queue. This is fine in itself, since it lets us
fail early.

However, some of these properties we check can change between the time we
first check them and when the job is pulled from the job queue and executed.

For example, checking if the volume is a root disk before submitting the
command to the job queue is pretty safe. This property (as far as I know)
is immutable.

However, checking if the volume is attached to a VM before submitting the
command to the job queue is trickier (because that property is mutable).
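To make the distinction concrete, here is a minimal, hypothetical sketch
(illustrative names, not actual CloudStack code) of a pre-check on a mutable
property that has to be repeated when the job actually executes:

```java
// Hypothetical sketch: a pre-check at the API layer plus a re-check in the
// job engine. The immutable property (rootDisk) is safe to check once; the
// mutable one (attachedVmId) can change while the job sits in the queue.
class Volume {
    final boolean rootDisk;      // immutable: safe to check once, up front
    volatile Long attachedVmId;  // mutable: can change between queueing and execution
    Volume(boolean rootDisk) { this.rootDisk = rootDisk; }
}

class AttachJob {
    // API-layer pre-check, run before the job is queued: fail early.
    static void preCheck(Volume v) {
        if (v.rootDisk) throw new IllegalArgumentException("cannot attach a root disk");
        if (v.attachedVmId != null) throw new IllegalStateException("volume already attached");
    }

    // Job-engine execution: the mutable check must be repeated here, because
    // another job may have attached the volume after the pre-check passed.
    static void execute(Volume v, long vmId) {
        synchronized (v) {
            if (v.attachedVmId != null) throw new IllegalStateException("volume already attached");
            v.attachedVmId = vmId; // ...then send the attach command to the hypervisor
        }
    }
}
```

The root-disk check only ever needs to run once; the attached-to-a-VM check
has to run again inside the job engine, under whatever serialization it
provides.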

Maybe this isn't the best example, but let's say you accidentally submit
two requests to attach a single volume to a single VM. In this case, the
pre-checks all pass (including the fact that this volume is not currently
attached to a VM) and the commands are submitted to the job queue.

The job engine executes the first attach command and it works. Then it
comes to the second attach command and it gets much farther along in the
process than it should: it sends the command all the way to the hypervisor
and receives feedback that the VDI is already in use by the specified VM.

In the case of the second attach command, it would be nice if we re-checked
the mutable properties before doing any real work.

I think at one point we attempted to do this by having the entry point
that's given to the job queue be the same method that's called from the API
layer when the command first comes in. Since the job queue later ran the
same method the API layer had already run, all of our checks would be
re-checked. We had some code in this method to detect whether it was being
called from the job engine (so that it wouldn't simply submit another
command to the job engine).

I'm not sure why we moved away from that approach. Does anyone know?
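If memory serves, that pattern looked roughly like the following sketch
(hypothetical names and a simplified queue, not the actual CloudStack code):

```java
// Hypothetical sketch: one entry point serves both the API layer and the
// job engine. A flag (here an explicit parameter) tells the method whether
// it is already running inside the job engine, so it re-runs all checks
// instead of queueing yet another job.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class AttachEntryPoint {
    final BlockingQueue<Runnable> jobQueue = new LinkedBlockingQueue<>();
    int attachCount = 0; // visible for the sketch; stands in for real side effects

    void attachVolume(long volumeId, long vmId, boolean calledFromJobEngine) {
        runChecks(volumeId, vmId); // runs at submit time AND again at execute time
        if (!calledFromJobEngine) {
            // API path: queue this very method to run later inside the job engine.
            jobQueue.add(() -> attachVolume(volumeId, vmId, true));
            return;
        }
        // Job-engine path: the checks just re-ran above; do the real work.
        doAttach(volumeId, vmId);
    }

    private void runChecks(long volumeId, long vmId) { /* mutable-state checks */ }

    private void doAttach(long volumeId, long vmId) { attachCount++; }
}
```

The appeal of the pattern is that the checks cannot drift out of sync: the
job engine runs the exact same method, so every check the API layer made is
made again at execution time.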


On Sat, Jan 24, 2015 at 12:38 AM, Mike Tutkowski <> wrote:

> I just returned to this problem a little while ago today.
> The original reason I asked this question was because I noticed an issue
> when attaching multiple volumes to a VM at the same time.
> The attach logic is properly funneled through the VM job queue, but it
> still fails (for the second attach command, which is executed right after
> kicking off the attach command for the first volume).
> As it turns out, the device ID - if not explicitly passed in - is acquired
> by logic that runs BEFORE the attach command is submitted to the job queue.
> What this means is you can have two attach commands running at the same
> time and both can get the same "next" device ID (the logic is not
> serialized until submitted to the job queue and the device ID that's
> actually used by the hypervisor is not recorded in the DB until the
> hypervisor returns that the volume was successfully attached). As such, the
> second command that's sent to the VM to attach a volume has a device ID
> that's already in use (it just became in use a moment earlier).
> I rectified this situation by moving the call to get the "next" device ID
> to a location that's invoked from the job queue. This way the two commands
> will get unique device IDs.
> Although this race condition has most likely been in the code for a while,
> it's not likely to manifest itself with "non-managed" storage. The reason is
> that non-managed-storage logic doesn't have to issue a command over the
> network to have the SAN put the volume to attach in the correct ACL
> ("grantAccess" logic). Since managed storage has this work to do, it ends
> up being just a little slower to attach a volume to a VM than for
> non-managed storage. When you take the extra latency away, I've never been
> able to get this race condition to surface.
> Either way, it was a race condition and now it's fixed.
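The fix described in the quoted message, acquiring the "next" device ID only
from within the serialized job execution, can be sketched roughly like this
(hypothetical names, not the actual CloudStack code):

```java
// Hypothetical sketch: if the "next" device ID is picked before the job is
// queued, two concurrent attach requests can pick the same ID. Allocating it
// from inside the per-VM job queue, where execution is serialized, makes the
// IDs unique.
import java.util.Set;
import java.util.TreeSet;

class DeviceIdAllocator {
    final Set<Integer> usedDeviceIds = new TreeSet<>();

    // Called only from within the serialized job execution for a given VM,
    // so no two attach jobs for the same VM can run this concurrently.
    synchronized int allocateNextDeviceId() {
        int id = 1; // device 0 is conventionally the root disk
        while (usedDeviceIds.contains(id)) {
            id++;
        }
        usedDeviceIds.add(id);
        return id;
    }
}
```

The key point is where the call happens, not how the ID is computed: as long
as allocation runs inside the serialized job, two simultaneous attach
requests can no longer observe the same "next" ID.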
> On Tue, Jan 13, 2015 at 6:51 PM, Nitin Mehta <>
> wrote:
>> +Min.
>> Unfortunately, I don’t think the framework is enhanced for all the
>> different kinds of resources, but it should be the way to go.
>> IMHO, serialization through states was/is just a hacky way of getting
>> around the situation and should be discontinued.
>> Ideally, state of a resource should reflect only its lifecycle not the
>> operations such as snapshotting, migrating etc.
>> Thanks,
>> -Nitin
>> On 13/01/15 4:32 PM, "Mike Tutkowski" <>
>> wrote:
>> >It appears that the job queue is used for some commands while for others
>> >it is not.
>> >
>> >Is the intent of the job queue to serialize only operations that are sent
>> >to VMs?
>> >
>> >On Tue, Jan 13, 2015 at 3:14 PM, Mike Tutkowski <
>> >> wrote:
>> >
>> >> This is 4.6.
>> >>
>> >> It seems like our state-transitioning logic is intended (as one might
>> >> expect) to protect the object in question from transitions that are
>> >> invalid given its current state.
>> >>
>> >> I do not see, say, the attach and detach operations being serialized.
>> >> It seems they are running simultaneously.
>> >>
>> >> On Tue, Jan 13, 2015 at 2:09 PM, Nitin Mehta <>
>> >> wrote:
>> >>
>> >>> States shouldn't be used to serialize operations on a volume. They
>> >>> should be used to denote the lifecycle of the volume instead.
>> >>> I think the async job manager does take care of the serialization. In
>> >>> which version do you see this issue happening?
>> >>>
>> >>> Thanks,
>> >>> -Nitin
>> >>>
>> >>> On 13/01/15 12:28 PM, "Mike Tutkowski" <>
>> >>> wrote:
>> >>>
>> >>> >Hi,
>> >>> >
>> >>> >Does anyone know why we don't currently have a state and applicable
>> >>> >transitions in Volume.State for attaching and detaching volumes?
>> >>> >
>> >>> >It seems like you'd want to, say, transition to Attaching only when
>> >>> >you're in the Ready state (or maybe some other states as well).
>> >>> >
>> >>> >I think right now you can confuse the system by sending an attach
>> >>> >command and then a detach command before the attach command finishes
>> >>> >(it's a race condition...I don't think it always causes trouble).
>> >>> >
>> >>> >Thoughts?
>> >>> >
>> >>> >Thanks,
>> >>> >Mike
>> >>> >
>> >>> >--
>> >>> >*Mike Tutkowski*
>> >>> >*Senior CloudStack Developer, SolidFire Inc.*
>> >>> >e:
>> >>> >o: 303.746.7302
>> >>> >Advancing the way the world uses the cloud
>> >>> ><>* *
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >
>> >
>> >

*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
o: 303.746.7302
Advancing the way the world uses the cloud
