deltacloud-dev mailing list archives

From Chris Lalancette
Subject Re: Length of instance names in Deltacloud
Date Fri, 03 Jun 2011 13:46:31 GMT
On 06/02/11 - 10:00:55PM, Tomas Von Veschler wrote:
> Thanks for the detailed explanation; it really helps me understand
> the internals better. Some comments below:
> On 06/01/2011 10:44 PM, Chris Lalancette wrote:
> >On 06/01/11 - 09:37:20PM, Tomas Von Veschler wrote:
> >>Hi Chris,
> >>
> >>One point that doesn't convince me too much is to use the condor id
> >>as instance name. If for example using rhev, a sys admin will prefer
> >>to have the VMs named like the user named them in Aeolus, not really a
> >>bunch of numbers/hash.
> >
> >Actually, I agree with you to a large extent.  However, I have not been able
> >to convince upstream condor (where I have to get condor patches accepted) about
> >this.  The reasoning is that the condor name is used as a unique
> >handle to the instance.  Consider:
> >
> >1)  Aeolus submits the job to condor.
> >2)  Condor generates the name (something like Condor_<hostname>_uniquejobid).
> >3)  Condor writes the name to the internal database; note that it does *not*
> >have an ID yet, because we haven't submitted the job to deltacloudd yet.
> >4)  Condor submits the job to deltacloudd.
> >5)  Before it can get the status back from deltacloudd, condor crashes.
> >6)  On restart, condor can find the job again by looking up the unique name
> >that it generated.  It can then start monitoring the job again.
> I understand the problem now. A question, after it retrieves the job
> (=vm info?) in (6), does it switch to using VM IDs from then on, or
> does it keep searching by name?

Yeah, it actually uses the ID once it knows it.  The name is only used if the
crash situation happens.
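
A minimal sketch of the recovery steps quoted above, assuming an in-memory
dict in place of condor's internal database and a stand-in cloud object.
FakeCloud, submit, and recover are hypothetical names for illustration only;
they are not condor or deltacloudd APIs.

```python
import uuid


class FakeCloud:
    """Hypothetical stand-in for the cloud backend (not the deltacloudd API)."""

    def __init__(self):
        self.instances = {}  # instance id -> instance name
        self._next_id = 1

    def create(self, name):
        instance_id = "inst-%d" % self._next_id
        self._next_id += 1
        self.instances[instance_id] = name
        return instance_id

    def find_by_name(self, name):
        for instance_id, instance_name in self.instances.items():
            if instance_name == name:
                return instance_id
        return None


def submit(job_db, cloud, job_key, crash_before_ack=False):
    # Steps 2 and 3: generate a unique name and persist it before the
    # instance exists, so there is no ID to store yet.
    name = str(uuid.uuid4())
    job_db[job_key] = {"name": name, "id": None}
    # Step 4: hand the job to the cloud.
    instance_id = cloud.create(name)
    # Step 5: simulate a crash before the ID is written back.
    if crash_before_ack:
        return
    job_db[job_key]["id"] = instance_id


def recover(job_db, cloud, job_key):
    # Step 6: the name is only used when the ID was never recorded;
    # once the ID is known, it is used from then on.
    record = job_db[job_key]
    if record["id"] is None:
        record["id"] = cloud.find_by_name(record["name"])
    return record["id"]
```

Calling submit(db, cloud, "job1", crash_before_ack=True) and then
recover(db, cloud, "job1") finds the instance by its generated name and
stores the ID for later lookups.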

> >It's not possible to do 6) if you use the user-generated name, because it is
> >not necessarily unique enough.
> But what about the triplet: provider_id # realm_id # vm_name ? For
> rhev that'd be unique. I mean Condor would have a way to reach the
> job again by decomposing the triplet.

So what I've done here is to actually change condor to generate UUIDs.  Those
end up being 36 bytes long, which is short enough for RHEV-M and most other
clouds.  For the clouds where this is still too long, we will just truncate the
UUID as appropriate.  It results in some loss of uniqueness, but I think it
should be OK for the most part.
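
A rough sketch of the truncation idea, assuming a per-cloud name limit is
passed in as max_len. The helper names and the example limit of 15 are
made up for illustration; this is not the actual condor patch.

```python
import uuid


def truncate_name(name, max_len=36):
    """Cut a UUID-style name down to a cloud's name length limit."""
    return name[:max_len]


def hex_digits_kept(name):
    """Count the hex digits left in a name (hyphens carry no information)."""
    return sum(1 for ch in name if ch != "-")


full = str(uuid.uuid4())          # 36 characters, hyphens included
short = truncate_name(full, 15)   # hypothetical cloud with a 15-char limit
```

Each hex digit dropped removes up to 4 bits from the name, which is where
the loss of uniqueness comes from.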

> To note: it's possible to change the name of a VM at the virt layer.
> That's why using vm ids sounds more robust than names to me.

Yeah, this is a problem.  If the user goes in after condor launches the job
and changes the name, it could cause problems.  But the name isn't the only
thing that could cause a problem here; there are many things that a user could
do to a VM "out-of-band" that would cause us to fail.  For the moment we are
defining it away, but it is something we will eventually have to deal with.

> Hope I'm not overcomplicating things. If we talk about managing
> hundreds+ VMs, the name of VMs at the virt layer is much less
> important than at a small rhev deployment (where, btw, you probably
> won't use cloud anyway; or does Aeolus also target smaller virt
> deployments? I don't know :-)

I'm honestly not sure.  I don't quite know the benefit of managing your own
cloud at a very small scale; it would seem you would just want to use the
underlying virt platform.  But that doesn't mean I'm right :).

It's all a bit of a thorny problem, which is why it hasn't been solved yet.
I'm not entirely satisfied with the solution I outlined above, but at the
moment I don't see a better way around it.

Chris Lalancette
