airavata-dev mailing list archives

From Saminda Wijeratne <samin...@gmail.com>
Subject Re: Persisting GFac job data
Date Tue, 21 May 2013 19:51:26 GMT
Thanks for the feedback, Amila. A few comments inline.


On Tue, May 21, 2013 at 12:29 PM, Amila Jayasekara
<thejaka.amila@gmail.com> wrote:

> Hi Saminda,
>
> Great suggestion. Also +1 for Danushka's proposal to serialize and
> deserialize the data.
> A few suggestions:
> 1. In addition to successful/error statuses, we need other statuses for nodes
> and workflows.
> E.g.:
>    node - started, submitted, in-progress, failed, successful, etc.
>
Sorry if I was too vague. Yes, we have more fine-grained statuses for workflows
and nodes [1]. We will have a much more fine-grained set of values for the
GFacJob status:
    public static enum GFacJobStatus{
        SUBMITTED, //job is submitted, possibly waiting to start executing
        EXECUTING, //submitted job is being executed
        CANCELLED, //job was cancelled
        PAUSED, //job was paused
        WAITING_FOR_DATA, // job is waiting for data to continue executing
        FAILED, // error occurred while job was executing and the job stopped
        FINISHED, // job completed successfully
        UNKNOWN // unknown status. lookup the metadata for more details.
    }
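
As a rough sketch of how callers might use these statuses (for example, when deciding whether a job can still change state or is worth retrying), the enum could carry a small helper. Treating CANCELLED, FAILED and FINISHED as terminal is my assumption here, and `isTerminal()` is a hypothetical method, not something that exists in the code base yet.

```java
// Sketch only: same constants as above, plus a hypothetical helper.
enum GFacJobStatus {
    SUBMITTED,        // job is submitted, possibly waiting to start executing
    EXECUTING,        // submitted job is being executed
    CANCELLED,        // job was cancelled
    PAUSED,           // job was paused
    WAITING_FOR_DATA, // job is waiting for data to continue executing
    FAILED,           // error occurred while job was executing and the job stopped
    FINISHED,         // job completed successfully
    UNKNOWN;          // unknown status, look up the metadata for more details

    // Assumption: a job in one of these states will not change state again.
    boolean isTerminal() {
        switch (this) {
            case CANCELLED:
            case FAILED:
            case FINISHED:
                return true;
            default:
                return false;
        }
    }
}

class GFacJobStatusDemo {
    public static void main(String[] args) {
        // Print each status together with whether it is treated as terminal.
        for (GFacJobStatus status : GFacJobStatus.values()) {
            System.out.println(status + " terminal=" + status.isTerminal());
        }
    }
}
```

UNKNOWN is deliberately left non-terminal in this sketch, since the metadata may still reveal that the job is running.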


> 2. This data will be useful in implementing fault tolerance (FT) and load
> balancing in each component. Some time back we had discussions about making
> GFac stateless. So who is going to populate this data structure and persist it?
>
That is a very good question... :). This summer is going to be a long
one... ;)
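
Whoever ends up persisting the data, the serialize/deserialize proposal from Danushka could look roughly like the sketch below. It uses plain Java object serialization. `JobMetadata`, `GramJobMetadata` and `InMemoryMetadataTable` are hypothetical names, and the map only stands in for the two-column metadata table; none of this is the actual Registry API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class: gives every metadata type the same
// serialize/deserialize behaviour.
abstract class JobMetadata implements Serializable {
    private static final long serialVersionUID = 1L;

    byte[] toBytes() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(this);
        }
        return buffer.toByteArray();
    }

    static JobMetadata fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (JobMetadata) in.readObject();
        }
    }
}

// Hypothetical provider-specific structure, e.g. for a GRAM job.
class GramJobMetadata extends JobMetadata {
    private static final long serialVersionUID = 1L;
    final String rsl;

    GramJobMetadata(String rsl) {
        this.rsl = rsl;
    }
}

// Stand-in for the two-column table: local job id -> serialized bytes.
class InMemoryMetadataTable {
    private final Map<String, byte[]> rows = new HashMap<>();

    void save(String localJobId, JobMetadata metadata) throws IOException {
        rows.put(localJobId, metadata.toBytes());
    }

    // Returns null when no row exists for the given job id.
    JobMetadata load(String localJobId) throws IOException, ClassNotFoundException {
        byte[] bytes = rows.get(localJobId);
        return bytes == null ? null : JobMetadata.fromBytes(bytes);
    }
}
```

With this shape each GFacProvider would only need to supply its own `JobMetadata` subclass. Java serialization is just the simplest thing to show here; XML or JSON would fit the same base-class idea.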

[1]
https://svn.apache.org/repos/asf/airavata/trunk/modules/workflow-model/workflow-model-core/src/main/java/org/apache/airavata/workflow/model/graph/Node.java

>
> Thanks
> Amila
>
>
> On Tue, May 21, 2013 at 11:39 AM, Saminda Wijeratne <samindaw@gmail.com>
> wrote:
>
> > That is an excellent idea. We can make the job data field the
> > designated serialized GFac job data. Whichever GFacProvider is used should
> > adhere to it.
> >
> > I'm still inclined to keep the rest of the fields for ease of querying the
> > required data. For example, if we wanted all attempts at executing a
> > particular node of a workflow, or if we wanted to know which application
> > descriptions are faster in execution or more reliable, etc., we can let the
> > query language deal with it. wdyt?
> >
> >
> > On Tue, May 21, 2013 at 11:24 AM, Danushka Menikkumbura <
> > danushka.menikkumbura@gmail.com> wrote:
> >
> > > Saminda,
> > >
> > > I think the data container does not need to have a generic format. We can
> > > have a base class that facilitates object serialization/deserialization
> > > and let specific metadata structures implement it as required. We get the
> > > Registry API to serialize objects and save them in a metadata table (with
> > > just two columns?) and to deserialize them as they are loaded from the
> > > registry.
> > >
> > > Danushka
> > >
> > >
> > > On Tue, May 21, 2013 at 8:34 PM, Saminda Wijeratne <samindaw@gmail.com>
> > > wrote:
> > >
> > > > It has become more and more apparent that saving the data related to
> > > > executing jobs from GFac can be useful for many reasons, such as:
> > > >
> > > > debugging
> > > > retrying
> > > > making smart decisions on reliability/cost etc.
> > > > statistical analysis
> > > >
> > > > Thus we thought of saving the data related to GFac jobs in the registry
> > > > in order to facilitate features such as the above in the future.
> > > >
> > > > However, a GFac job is potentially any sort of computing resource access
> > > > (GRAM/UNICORE/EC2 etc.). Therefore we need to come up with a generalized
> > > > data structure that can hold the data of any type of resource. Following
> > > > are the suggested data to save for a single GFac job execution:
> > > >
> > > > *experiment id, workflow instance id, node id* - pinpoint the node
> > > > execution
> > > > *service, host, application description ids* - pinpoint the descriptors
> > > > responsible
> > > > *local job id* - the unique job id retrieved/generated per execution
> > > > [PRIMARY KEY]
> > > > *job data* - data related to executing the job (e.g. the RSL in GRAM)
> > > > *submitted, completed time*
> > > > *completed status* - whether the job was successful or ran into errors
> > > > etc.
> > > > *metadata* - custom field to add anything the user wants
> > > >
> > > > Your feedback is most welcome. The API-related changes will also be
> > > > discussed once we have a proper data structure. We are hoping to
> > > > implement this within the next few days.
> > > >
> > > > Thanks,
> > > > Saminda
> > > >
> > >
> >
>
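
To make the field list from the original proposal concrete, here is a minimal sketch of what one GFac job record might look like, plus the kind of "all attempts for a node" lookup discussed in the thread. The class and field names (`GFacJobRecord`, `GFacJobQueries`, `attemptsForNode`) are hypothetical, the status is kept as a plain string to keep the example self-contained, and the in-memory filter only stands in for a real registry query.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical record for a single GFac job execution.
final class GFacJobRecord {
    final String experimentId;
    final String workflowInstanceId;
    final String nodeId;
    final String serviceDescriptionId;
    final String hostDescriptionId;
    final String applicationDescriptionId;
    final String localJobId;     // unique per execution, acts as the primary key
    final String jobData;        // e.g. the RSL for a GRAM job
    final Instant submittedTime;
    final Instant completedTime; // assumption: null while the job is still running
    final String status;         // e.g. "FINISHED" or "FAILED"
    final String metadata;       // free-form custom field

    GFacJobRecord(String experimentId, String workflowInstanceId, String nodeId,
                  String serviceDescriptionId, String hostDescriptionId,
                  String applicationDescriptionId, String localJobId, String jobData,
                  Instant submittedTime, Instant completedTime,
                  String status, String metadata) {
        this.experimentId = experimentId;
        this.workflowInstanceId = workflowInstanceId;
        this.nodeId = nodeId;
        this.serviceDescriptionId = serviceDescriptionId;
        this.hostDescriptionId = hostDescriptionId;
        this.applicationDescriptionId = applicationDescriptionId;
        this.localJobId = localJobId;
        this.jobData = jobData;
        this.submittedTime = submittedTime;
        this.completedTime = completedTime;
        this.status = status;
        this.metadata = metadata;
    }
}

final class GFacJobQueries {
    // Returns every recorded attempt at executing one node of one workflow instance.
    static List<GFacJobRecord> attemptsForNode(List<GFacJobRecord> records,
                                               String experimentId,
                                               String workflowInstanceId,
                                               String nodeId) {
        List<GFacJobRecord> attempts = new ArrayList<>();
        for (GFacJobRecord record : records) {
            if (record.experimentId.equals(experimentId)
                    && record.workflowInstanceId.equals(workflowInstanceId)
                    && record.nodeId.equals(nodeId)) {
                attempts.add(record);
            }
        }
        return attempts;
    }
}
```

Keeping the ids as separate fields, rather than folding everything into the serialized job data, is what makes this kind of lookup possible without deserializing every row.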
