airavata-dev mailing list archives

From Shameera Rathnayaka <shameerai...@gmail.com>
Subject Re: Job Submission Limit
Date Mon, 03 Aug 2015 16:04:26 GMT
Hi Chathuri,

IMO the validator is not the right place to do this; this is not an
(experiment) validation issue at all. It should be handled just before the
orchestrator submits the process to gfac, holding the submission until
there is room in the remote computer's job scheduling queue.
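
To make the "hold until there is room" idea concrete, here is a minimal
sketch in Python (Airavata itself is Java; Python is used only to keep it
short). QueueThrottle, submit, job_finished and the publish callback are
hypothetical names for illustration, not existing Airavata classes or
methods, and the counter is kept in memory here rather than in a table.

```python
from collections import deque


class QueueThrottle:
    """Holds process submissions until the remote batch queue has room.

    Hypothetical helper for illustration; not an Airavata class.
    """

    def __init__(self, max_jobs, publish):
        self.max_jobs = max_jobs  # job limit of the batch queue
        self.running = 0          # jobs currently counted against the limit
        self.held = deque()       # submissions waiting for room (FIFO)
        self.publish = publish    # callable that hands a process to gfac

    def submit(self, process_id):
        """Record the request, then release whatever fits under the limit."""
        self.held.append(process_id)
        self._drain()

    def job_finished(self):
        """Free one slot when a job completes, then release held requests."""
        if self.running > 0:
            self.running -= 1
        self._drain()

    def _drain(self):
        # Release held requests in arrival order while there is room.
        while self.held and self.running < self.max_jobs:
            self.running += 1
            self.publish(self.held.popleft())
```

With a limit of 2, submitting three processes publishes the first two and
holds the third; the third goes out once job_finished() is called. Releasing
on every submit and on every completion avoids a polling loop, but a periodic
check would work with the same _drain logic.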

@Doug

Did you consider what happens if the orchestrator fails, and how to
recover the held process submission requests? E.g., the orchestrator can go
down for a couple of minutes (restart).
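
One possible way to survive a restart is to persist each held request before
acknowledging it and reload the pending ones on startup. The sketch below
assumes a local SQLite file only for illustration; HeldSubmissionStore, the
held_submission table and its columns are hypothetical names, and a real
deployment would more likely use the existing catalog database.

```python
import sqlite3


class HeldSubmissionStore:
    """Durable list of held submissions (hypothetical, for illustration)."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS held_submission ("
            " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
            " process_id TEXT NOT NULL UNIQUE)"
        )
        self.conn.commit()

    def hold(self, process_id):
        # Persist the request; a duplicate process id is ignored.
        with self.conn:
            self.conn.execute(
                "INSERT OR IGNORE INTO held_submission (process_id) VALUES (?)",
                (process_id,),
            )

    def release(self, process_id):
        # Remove the request once it has been handed to gfac.
        with self.conn:
            self.conn.execute(
                "DELETE FROM held_submission WHERE process_id = ?",
                (process_id,),
            )

    def pending(self):
        # Requests still held, oldest first; call this after a restart.
        rows = self.conn.execute(
            "SELECT process_id FROM held_submission ORDER BY seq"
        )
        return [row[0] for row in rows]
```

If release() is called only after the publish succeeds, a crash in between
can cause the same process to be published twice after recovery, so the
receiving side would need to tolerate duplicates.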

Thanks,
Shameera.

On Mon, Aug 3, 2015 at 11:53 AM Chathuri Wimalasena <kamalasini@gmail.com>
wrote:

> Hi Doug,
>
> I think we can check the throttling at the validator level. The orchestrator
> already validates other fields, e.g. that a correct queue name and walltime
> are specified. You can add another check to see whether the current job count
> exceeds the max job count. For your second question, you can add another field
> to the BATCH_QUEUE table of the app catalog to keep track of the current job
> count.
>
> Hope that will solve your issue.
>
> Thanks..
> Chathuri
>
> On Mon, Aug 3, 2015 at 10:50 AM, Douglas Chau <dchau3@binghamton.edu>
> wrote:
>
>> Hey Devs,
>>
>> Just wanted to get some input on our plan to implement the queue
>> throttling feature.
>>
>> Batch Queue Throttling:
>> - in Orchestrator the current submit() method in GFACPassiveJobSubmitter
>> publishes to rabbitmq immediately
>> - instead of publishing immediately, we should have the submit method
>> pass the message to a new component, BatchQueueClass (tentative name), to
>> check when we can unload jobs to submit
>>
>> Adding BatchQueueClass
>> - set up new table(s) to contain computer resource names and their
>> corresponding queues’ current job counts and maximum job limits
>> - the current data models only have the max job submission limit for a
>> queue, with no information on how many jobs are currently submitted, so
>> the idea is to implement an increment/decrement counter in the new table:
>> when a job is submitted, a call increments the counter, and
>> when a job finishes, a call decrements it
>> - once that is complete, BatchQueueClass needs to periodically check the
>> new table to see if the current job number < queue job limit. If it is, then
>> we can just pop jobs off and submit them until we hit the job limit
>>
>> How does this sound?
>>
>> Doug
>
>
> --
Shameera Rathnayaka
