airavata-dev mailing list archives

From Shameera Rathnayaka <shameerai...@gmail.com>
Subject Job throttling implementation clarification.
Date Tue, 23 Sep 2014 17:04:57 GMT
Hi Devs,

I am working on the queue-based job throttling implementation, and here is
the related JIRA ticket [1], which was created to track the implementation
steps.

The following explains how job throttling is implemented for now. It
applies only to compute resources that have batch queues defined; other
resources are not throttled.

There is a validator called JobCountValidator, which checks whether there
is enough room to submit a new job and returns "true" or "false"
accordingly. I am using ZooKeeper to track runtime data such as how many
jobs have been submitted to a given host. With the current implementation,
the job count is increased when the job is added to the monitoring queue
and decreased when the job is removed from the monitoring queue. I ran a
few tests and this approach worked fine. But after I ran a load test at a
high rate, I observed that this approach does not hold up, because we do
the validation in the orchestrator and the job count update in GFac. This
is a race condition: the orchestrator can still pass the validation step
even when we have already submitted the maximum allowed number of jobs to
a resource but have not yet updated the job count in ZooKeeper. Therefore
we need to do the job submission and the job count increase in the same
place to fix that.
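
To make the "same place" idea concrete, here is a minimal in-memory sketch
of a combined check-and-increment. It is not the Airavata code: the class
JobSlotCounter and its methods are hypothetical names, and an
AtomicInteger stands in for the count that is kept in ZooKeeper. The point
is only that the limit check and the increment happen as one atomic step,
so two concurrent submissions cannot both pass against a stale count.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not part of Airavata): tracks the number of jobs
 * submitted to one resource and reserves a slot atomically.
 */
class JobSlotCounter {
    private final int maxJobs;
    private final AtomicInteger submitted = new AtomicInteger(0);

    JobSlotCounter(int maxJobs) {
        this.maxJobs = maxJobs;
    }

    /**
     * Checks the limit and increments the count in a single atomic step.
     * Returns false, without changing the count, when the limit is reached.
     */
    boolean tryAcquire() {
        while (true) {
            int current = submitted.get();
            if (current >= maxJobs) {
                return false;
            }
            // Succeeds only if no other thread changed the count meanwhile;
            // otherwise loop and re-check against the fresh value.
            if (submitted.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Frees a slot, e.g. when a job leaves the monitoring queue. */
    void release() {
        submitted.updateAndGet(n -> n > 0 ? n - 1 : 0);
    }

    int current() {
        return submitted.get();
    }
}
```

A caller would submit the job only when tryAcquire() returns true and call
release() once the job is finished. With the count stored in ZooKeeper, the
same effect would need a conditional (versioned) update or a lock rather
than a plain read followed by a write.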

So a potential place is the SimpleOrchestratorImpl#launchExperiment method.
WDYT?

As the validation and launch operations are made through two separate
client calls, we would still have that race condition. I have sent a
separate mail about that.

Thanks,
Shameera.

-- 
Best Regards,
Shameera Rathnayaka.

email: shameera AT apache.org , shameerainfo AT gmail.com
Blog : http://shameerarathnayaka.blogspot.com/
