airavata-dev mailing list archives

From Shameera Rathnayaka <shameerai...@gmail.com>
Subject Re: Orchestrator Real time Job submission improvement
Date Thu, 11 Sep 2014 22:56:52 GMT
Hi Suresh,

Here is how I think we can use the suggested improvement to handle real-time
scheduling. If the user selects a resource when submitting the experiment,
then in the Validate allocation step we can check the job restriction and
whether a new job can be submitted to the given target resource under the
selected username. If there is no room to submit a new job to the target
resource, we inform the user with a message saying the experiment was
rejected (or failed) because of the job count restriction on the target
resource.
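
A minimal sketch of that validation check, only to illustrate the idea. It
assumes the Orchestrator keeps an in-memory copy of the per-user job counts
(for example, fed by the ZooKeeper stat data from the earlier mail). The names
JobCountStore and validate_allocation, and the per-machine limit map, are
hypothetical and not existing Airavata APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class JobCountStore:
    """Hypothetical in-memory view of active (Q or R) jobs per user and machine."""
    # machine -> max active jobs allowed per user (assumed configuration)
    limits: Dict[str, int]
    # (username, machine) -> current active job count
    counts: Dict[Tuple[str, str], int] = field(default_factory=dict)

    def active_jobs(self, username: str, machine: str) -> int:
        return self.counts.get((username, machine), 0)

    def limit_for(self, machine: str) -> Optional[int]:
        # None means no job count restriction is configured for this machine.
        return self.limits.get(machine)


def validate_allocation(store: JobCountStore, username: str, machine: str) -> Tuple[bool, str]:
    """Return (accepted, message) for a user-selected target resource."""
    limit = store.limit_for(machine)
    if limit is None or store.active_jobs(username, machine) < limit:
        return True, "Experiment accepted for submission"
    return False, (
        f"Experiment rejected: job count limit of {limit} reached "
        f"for user {username} on {machine}"
    )
```

With this shape, the rejection happens in the Orchestrator before the
experiment is handed over to GFac, and the message can be passed back to the
user as is.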

If the user needs the experiment to be auto-scheduled, we can move the
experiment to the buffered queue and use the real-time job count details to
decide when it is possible to submit a new job to the target machine, or
find a best-fit machine and submit the experiment there.
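
A rough sketch of the auto-schedule path, under some assumptions of mine:
"best fit" is taken to mean the machine with the most free slots for that
user, and on_count_change stands in for the callback a ZooKeeper watcher
would invoke. BufferedScheduler and all of its method names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Tuple


class BufferedScheduler:
    """Hypothetical buffered queue that submits when job counts allow it."""

    def __init__(self, limits: Dict[str, int]) -> None:
        self.limits = limits                              # machine -> max jobs per user
        self.counts: Dict[Tuple[str, str], int] = {}      # (user, machine) -> active jobs
        self.buffer: Deque[Tuple[str, str]] = deque()     # (experiment_id, user)
        self.submitted: List[Tuple[str, str]] = []        # (experiment_id, machine)

    def _free_slots(self, user: str, machine: str) -> int:
        return self.limits[machine] - self.counts.get((user, machine), 0)

    def _best_fit(self, user: str) -> Optional[str]:
        # Assumption: the best fit is the machine with the most free slots.
        candidates = [m for m in self.limits if self._free_slots(user, m) > 0]
        if not candidates:
            return None
        return max(candidates, key=lambda m: self._free_slots(user, m))

    def submit(self, experiment_id: str, user: str) -> None:
        """Buffer the experiment, then try to schedule whatever is waiting."""
        self.buffer.append((experiment_id, user))
        self.drain()

    def on_count_change(self, user: str, machine: str, count: int) -> None:
        """Stand-in for the watcher callback fired when the monitor updates a count."""
        self.counts[(user, machine)] = count
        self.drain()

    def drain(self) -> None:
        """Submit every buffered experiment that currently fits; keep the rest in order."""
        still_waiting: Deque[Tuple[str, str]] = deque()
        while self.buffer:
            experiment_id, user = self.buffer.popleft()
            machine = self._best_fit(user)
            if machine is None:
                still_waiting.append((experiment_id, user))
                continue
            # Count the job locally right away so we do not over-submit
            # before the monitor reports the new job.
            self.counts[(user, machine)] = self.counts.get((user, machine), 0) + 1
            self.submitted.append((experiment_id, machine))
        self.buffer = still_waiting
```

For example, with a limit of one job each on two machines, a third experiment
from the same user stays in the buffer until on_count_change reports that one
of the earlier jobs has finished.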

Thanks,
Shameera.


On Thu, Sep 11, 2014 at 2:42 PM, Suresh Marru <smarru@apache.org> wrote:

> Hi Shameera,
>
> Can you please map this to the diagram at [1]? Will the HPCPullMonitor be
> equivalent to the BufferedQueue we discussed on the architecture list?
>
> Suresh
> [1] -
> https://cwiki.apache.org/confluence/display/AIRAVATA/Airavata+Metascheduler
>
> On Sep 11, 2014, at 10:29 AM, Shameera Rathnayaka <shameerainfo@gmail.com>
> wrote:
>
> > Hi devs,
> >
> > I am going to implement the $Subject
> >
> > Requirement: Introduce a max job submission count for a given resource
> > under a given username.
> >
> > Abstraction: When a user submits a new experiment to Airavata, the user
> > selects the resource (machine) where Airavata should run that experiment
> > (job). That resource may have a job count restriction, for example that
> > one user can have at most X jobs in either the Q or R state. So we need
> > to handle this at the Orchestrator level rather than handing the
> > experiment over to GFac to submit the jobs, where it gets rejected
> > because of that restriction. To do that, the Orchestrator needs to know
> > the job count of a particular user on the given resource.
> >
> >
> > Implementation: HPCPullMonitor will write stat data to ZooKeeper; the
> > ZooKeeper path would be something like
> > /stat/{username}/{machine}/jobs/{count}. The Orchestrator will register a
> > watcher for changes to this data, and that watcher will trigger whenever
> > any GFac node (Monitor component) updates the job status in real time.
> > Finished jobs will immediately decrement the count, and these changes
> > will be replicated to the Orchestrator through ZK watches.
> >
> > Thanks,
> > Shameera.
> >
> > --
> > Best Regards,
> > Shameera Rathnayaka.
> >
> > email: shameera AT apache.org , shameerainfo AT gmail.com
> > Blog : http://shameerarathnayaka.blogspot.com/
>
>


-- 
Best Regards,
Shameera Rathnayaka.

email: shameera AT apache.org , shameerainfo AT gmail.com
Blog : http://shameerarathnayaka.blogspot.com/
