incubator-cloudstack-dev mailing list archives

From Suresh Sadhu <Suresh.Sa...@citrix.com>
Subject Things to consider for Campo
Date Mon, 13 Aug 2012 11:15:51 GMT
Hi all,

As I have heard, Campo involves major architecture changes. It would be good to consider
the following items as improvements, since they would help QA, Support, and customers,
and would also reduce the number of support calls.

Please feel free to add anything I have missed, or any further points for improvement,
to the list below. Kindly correct me if my assumptions or views are wrong.


Job in waiting state
*****************
We don't set a time limit for job completion, because we don't know how long a particular
job will take. But because of this design, if an early job goes into an infinite loop,
the other jobs are queued behind it and wait for that first job to finish.

The only way out of this situation is to manually update the status field in the DB.

Is there a better way to overcome this problem? Please share your views and thoughts.

My thought:
If job priority and job waiting period were configurable parameters, the end user could
set or update the priority and the waiting period based on their needs, so that even when
one job is stuck in the waiting state, the other waiting jobs are triggered by priority.

In the current design, if a job is in the waiting state, the end user can't stop it.
If we introduce configurable parameters, a job in the waiting (hung) state can be taken
out of that state once the configured duration has expired.
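
To make the idea concrete, here is a minimal sketch in Python. It is for illustration only
and is not CloudStack code: the names JobQueue, max_run_seconds, and tick are hypothetical,
and the clock is passed in by hand so the behaviour is easy to follow. A job that has been
running longer than the configured limit is marked TIMED_OUT, and the next waiting job is
started by priority.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class Job:
    priority: int                      # lower number = more urgent
    seq: int                           # tie-breaker: submission order
    name: str = field(compare=False)
    status: str = field(default="WAITING", compare=False)
    started_at: Optional[float] = field(default=None, compare=False)


class JobQueue:
    """Hypothetical queue: one job runs at a time, with a configurable run limit."""

    def __init__(self, max_run_seconds: float):
        self.max_run_seconds = max_run_seconds   # assumed configurable parameter
        self._waiting = []                       # heap ordered by (priority, seq)
        self._seq = itertools.count()
        self.running: Optional[Job] = None

    def submit(self, name: str, priority: int) -> Job:
        job = Job(priority, next(self._seq), name)
        heapq.heappush(self._waiting, job)
        return job

    def finish(self) -> None:
        """Mark the running job as completed normally."""
        if self.running is not None:
            self.running.status = "DONE"
            self.running = None

    def tick(self, now: float) -> Optional[Job]:
        """Expire an overdue running job, then start the best waiting job."""
        job = self.running
        if job is not None and now - job.started_at >= self.max_run_seconds:
            job.status = "TIMED_OUT"             # no manual DB update needed
            self.running = None
        if self.running is None and self._waiting:
            nxt = heapq.heappop(self._waiting)
            nxt.status, nxt.started_at = "RUNNING", now
            self.running = nxt
        return self.running


queue = JobQueue(max_run_seconds=60)
hung = queue.submit("hung-job", priority=5)
queue.tick(now=0)                                # hung-job starts running
low = queue.submit("low", priority=9)
urgent = queue.submit("urgent", priority=1)
queue.tick(now=30)                               # still within the limit
queue.tick(now=60)                               # hung-job expires, urgent starts
print(hung.status, queue.running.name, low.status)
```

In a real job manager the expiry check would run on a timer, and the timed-out job would
also need to be cancelled and cleaned up; the sketch only shows the queueing decision.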



Issue no# http://bugs.cloud.com/show_bug.cgi?id=12061
Job failure/retry mechanism:
********************
If a job fails due to some exception, we don't retry it after some time.

For example:
[It's not an accurate example, but it gives some idea]

In the VMware case, you can't take snapshots of the root disk and a data disk of a VM at
the same time. If you trigger snapshots on both disks at the same time, the first request
succeeds and the second request fails with a proper limitation message.

The end user then has to initiate the snapshot on the other disk (i.e. the data disk) again.

My thought:
It would be good to keep the failed job in the queue. Once the first job completes, the job
manager should pick up the waiting (failed) job from the queue and process it.
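
A rough sketch of that behaviour, again in Python and purely illustrative: ResourceBusyError,
run_with_requeue, and max_attempts are hypothetical names, and the two snapshot functions
only simulate the VMware limitation described above. A job that fails because the resource
is busy goes back to the end of the queue instead of being dropped.

```python
from collections import deque


class ResourceBusyError(Exception):
    """Hypothetical error: the job cannot run yet, but may succeed later."""


def run_with_requeue(jobs, max_attempts=3):
    """Run (name, func) jobs in order; requeue a job that hits ResourceBusyError."""
    queue = deque((name, func, 1) for name, func in jobs)
    results = {}
    while queue:
        name, func, attempt = queue.popleft()
        try:
            results[name] = func()
        except ResourceBusyError as exc:
            if attempt < max_attempts:
                queue.append((name, func, attempt + 1))   # keep it in the queue
            else:
                results[name] = f"FAILED after {attempt} attempts: {exc}"
    return results


# Simulated limitation: the data-disk snapshot is refused until the
# root-disk snapshot of the same VM has finished.
state = {"root_snapshot_done": False}


def snapshot_data_disk():
    if not state["root_snapshot_done"]:
        raise ResourceBusyError("another snapshot is in progress on this VM")
    return "data disk snapshot OK"


def snapshot_root_disk():
    state["root_snapshot_done"] = True
    return "root disk snapshot OK"


results = run_with_requeue([("data", snapshot_data_disk),
                            ("root", snapshot_root_disk)])
print(results["root"], "|", results["data"])
```

The attempt limit matters: without it, a job that can never succeed would circle in the
queue forever, which is the hung-job problem from the first section again.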

Issue no# http://bugs.cloud.com/show_bug.cgi?id=11531


Please feel free to add more data points here.

Usability in terms of UI refresh:
************************
CS still has a caching issue unless you manually click the refresh button. Sometimes
you still see the cached values.


Issue no# http://bugs.cloudstack.org/browse/CS-14988


Error & exception handling, and coordination between tasks on the same resource
***************************************************************
I don't have many data points. If anybody has some, please share your views.

But I will give one example:

Problem:

Power on a stopped VM and at the same time take a snapshot of the root disk: fail. (Deploy VM
failed with a lock problem and a java.lang.Exception occurred, but the snapshot job completed
successfully. I tried startVM again and this time it deployed successfully.) Please check the
attached log and execution logs.

Limitation:

This is not treated as a problem under the current architecture. We currently don't coordinate
tasks; we throw runtime errors instead. While a snapshot is being taken, VM operations may be
temporarily unavailable to the user, and the user needs to retry.
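
One possible alternative, sketched in Python for illustration only (ResourceCoordinator and
its methods are hypothetical names, not existing CloudStack classes): tasks on the same
resource are serialized through a per-resource lock, so a startVM issued during a snapshot
waits for the snapshot instead of failing with a runtime error.

```python
import threading
import time


class ResourceCoordinator:
    """Hypothetical helper: at most one task runs per resource at a time."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()           # protects the lock table

    def _lock_for(self, resource_id):
        with self._guard:
            return self._locks.setdefault(resource_id, threading.Lock())

    def run(self, resource_id, task, timeout=-1):
        """Wait for the resource (forever if timeout is -1), then run the task."""
        lock = self._lock_for(resource_id)
        if not lock.acquire(timeout=timeout):
            raise TimeoutError(f"{resource_id} is still busy")
        try:
            return task()
        finally:
            lock.release()


coordinator = ResourceCoordinator()
events = []
snapshot_started = threading.Event()


def take_snapshot():
    events.append("snapshot start")
    snapshot_started.set()
    time.sleep(0.05)                             # stand-in for the real work
    events.append("snapshot end")


def start_vm():
    events.append("startVM")


worker = threading.Thread(target=coordinator.run, args=("vm-1", take_snapshot))
worker.start()
snapshot_started.wait()                          # snapshot now holds the lock
coordinator.run("vm-1", start_vm)                # waits instead of failing
worker.join()
print(events)   # ['snapshot start', 'snapshot end', 'startVM']
```

With a finite timeout the caller still gets an error if the resource stays busy too long,
which ties this back to the configurable waiting period suggested in the first section.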




Regards

Sadhu








