airavata-dev mailing list archives

From Sanjaya Medonsa <sanjaya...@gmail.com>
Subject Re: GFac Handlers
Date Thu, 13 Jun 2013 12:08:20 GMT
Hi Amila,
      The link below points to PGETaskInstance.java in Apache OODT, which I
am reusing for my implementation. It plays a similar role for Apache OODT
workflow execution; its runPge() method performs the actual execution.

http://grepcode.com/file/repo1.maven.org$maven2@org.apache.oodt$cas-pge@0.4@org$apache$oodt$cas$pge$PGETaskInstance.java

There, the code under the comment "Setup the PGE." performs the pre-execution
task, and the
runIngestCrawler<http://grepcode.com/file/repo1.maven.org/maven2/org.apache.oodt/cas-pge/0.4/org/apache/oodt/cas/pge/PGETaskInstance.java#PGETaskInstance.runIngestCrawler%28org.apache.oodt.cas.crawl.ProductCrawler%29>
method performs the post-execution task. With the Airavata architecture, the
same has to be implemented using two handlers: one for pre-execution and one
for post-execution. I am also trying to do the same thing.

What I am suggesting, at a high level, is something like the following.

GFacHandler handler  .......
handler.preExecute(jobContext)

executeProvider();

handler.postExecute(jobContext)

It's quite similar to what is currently in Apache Airavata, but one wrapper
instance handles both pre- and post-execution. Basically, it wraps the
workflow execution.
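
To make the wrapper idea concrete, here is a minimal, self-contained Java
sketch. It is only an illustration under stated assumptions: WrapperHandler,
JobContext, StagingAndIngestHandler and executeWrapped are hypothetical names
invented for this example, not the actual Airavata GFac or OODT API, and the
"connection" is a plain string standing in for a real file manager connection.

```java
import java.util.ArrayList;
import java.util.List;

public class WrapperSketch {

    /** Hypothetical stand-in for the job execution context; records what ran. */
    public static class JobContext {
        public final List<String> events = new ArrayList<>();
    }

    /** Hypothetical wrapper-style handler: one instance covers both phases. */
    public interface WrapperHandler {
        void preExecute(JobContext ctx);
        void postExecute(JobContext ctx);
    }

    /** Hypothetical handler that opens one connection-like resource and reuses it. */
    public static class StagingAndIngestHandler implements WrapperHandler {
        private String connection; // placeholder for a real file manager connection

        @Override
        public void preExecute(JobContext ctx) {
            connection = "fm-connection"; // opened once, before the task
            ctx.events.add("pre: stage inputs via " + connection);
        }

        @Override
        public void postExecute(JobContext ctx) {
            // Same handler instance, so the connection from preExecute is still available.
            ctx.events.add("post: ingest outputs via " + connection);
            connection = null; // released after the task
        }
    }

    /** Runs the provider between the two phases of a single handler instance. */
    public static void executeWrapped(WrapperHandler handler, JobContext ctx, Runnable provider) {
        handler.preExecute(ctx);
        provider.run();
        handler.postExecute(ctx);
    }

    /** Small demo returning the recorded order of events. */
    public static List<String> demo() {
        JobContext ctx = new JobContext();
        executeWrapped(new StagingAndIngestHandler(), ctx,
                () -> ctx.events.add("provider: execute job"));
        return ctx.events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The point of the sketch is that the state created in preExecute (here, the
connection) lives in the handler instance itself, so postExecute can reuse it
without going through the job context or reinitializing it. The trade-off is
that the handler instance stays alive for the whole duration of the job.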

Best Regards,
Sanjaya






On Wed, Jun 12, 2013 at 6:38 PM, Amila Jayasekara
<thejaka.amila@gmail.com> wrote:

> Hi Sanjaya,
>
> I am also having difficulty understanding your exact problem. What do you
> mean by "wrapper"?
>
> Further, if you want to re-use just the connection, can't you put the
> "connection object" (or whatever data structure you used to wrap the
> connection) into the job execution context and retrieve it later?
>
> Thanks
> Amila
>
>
> On Wed, Jun 12, 2013 at 8:27 AM, Sanjaya Medonsa <sanjayamrt@gmail.com>
> wrote:
>
> > Thanks, Lahiru, for your input. I see your point on loose coupling. If we
> > could clearly separate the task into IN and OUT, then the current mechanism
> > would be ideal. But in my case it is actually a wrapper around task
> > execution. I think I have not explained in detail what I am trying to
> > implement. Basically, I am trying to integrate with OODT for input file
> > staging and for ingesting the output back into the OODT file manager server
> > and metadata catalog. For both of these tasks I need to maintain a
> > connection with the OODT file manager server. In my case the ideal
> > implementation would be a wrapper for task execution. PGETaskInstance is
> > not a heavyweight object, and adding it into JobExecutionContext won't be a
> > good solution in my case (if we added it into JobExecutionContext, it would
> > remain in memory throughout the execution as well). Actually,
> > PGETaskInstance is a similar wrapper implementation for OODT workflow
> > execution, and I have seen a similar kind of implementation in Taverna,
> > which is based on interceptors. I have added my changes as a review request
> > and added the Airavata reviewers to the reviewers group. You could have a
> > look at ApacheAiravataWorkfloeInstanceImpl for what I am trying to achieve.
> > As you mentioned, I could implement an OUT handler for the post-processing
> > part, and it would just need to reinitialize the connection/configuration
> > with OODT (I just feel that this is unnecessary and that we could avoid it
> > with a wrapper kind of solution). I'll go ahead and complete the
> > integration with the OUT wrapper.
> >
> > Best Regards,
> > Sanjaya
> >
> >
> > On Tue, Jun 11, 2013 at 6:47 PM, Lahiru Gunathilake <glahiru@gmail.com>
> > wrote:
> >
> > > Hi Sanjaya,
> > >
> > > Please see my inline comments. I am proposing a solution for your issue
> > > which looks more efficient than writing two handlers. You can set this
> > > PGETaskInstance into the JobExecutionContext (see the AbstractContext
> > > class) and use it in your OUT handler; you really don't have to create
> > > two instances if you want to reuse it.
> > >
> > > On Tue, Jun 11, 2013 at 8:22 AM, Sanjaya Medonsa <sanjayamrt@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >      As per the current design of handlers, there are two types of
> > > > handlers:
> > > >         1. IN handlers
> > > >         2. OUT handlers
> > > >
> > > > Basically, the IN handler does the pre-processing and the OUT handler
> > > > does the post-processing. With the Airavata OODT integration, I am
> > > > planning to implement an IN handler to perform file staging and an OUT
> > > > handler to perform output ingestion into CAS. That means two handler
> > > > instances to handle pre/post processing. In my scenario, this approach
> > > > seems a bit inefficient. Both the IN and OUT handlers are based on the
> > > > OODT PGETaskInstance. Due to the current handler architecture, I need
> > > > to create two instances of PGETaskInstance (one for the IN handler and
> > > > one for the OUT handler). I guess we could avoid this situation by
> > > > having just GFac handlers, which could be IN, OUT, or IN/OUT. In my
> > > > case, I actually need to implement an IN/OUT handler. At a high level,
> > > > I am proposing the following approach.
> > > >         1. At the configuration level, no differentiation between
> > > > IN and OUT handlers.
> > > >
> > > Even now there's no difference between IN and OUT handlers. A handler
> > > becomes IN or OUT based on how you configure it; it's the same
> > > interface. You can use one handler as IN in one configuration and as OUT
> > > in another configuration, or in the same configuration.
> > >
> > > >         2. Instead, the GFacHandler interface should contain two methods
> > > > (preInvoke/postInvoke). Depending on the type of handler, the pre
> > > > method, the post method, or both should be implemented.
> > > >                PRE - preInvoke
> > > >                POST - postInvoke
> > > >                PRE/POST - Both preInvoke/postInvoke
> > > >
> > > IMHO, the current approach is the cleaner one: it keeps handlers loosely
> > > coupled, with each handling one task at a time. If we do everything in a
> > > single handler, we need to keep its data for the whole time of
> > > execution. Job execution time could be huge, and we would be keeping all
> > > the handler configuration in memory during the whole execution time. I
> > > am not sure what PGETaskInstance is and whether it's efficient to keep
> > > it loaded during the whole execution period, rather than loading it on
> > > demand by configuration.
> > >
> > > >         3. We could either instantiate all the handlers initially,
> > > > invoke all pre methods prior to task execution, and invoke all post
> > > > methods after task execution, or, if this approach is a bit
> > > > inefficient, introduce a type into handlers (PRE/POST/PREPOST). Prior
> > > > to task execution we could instantiate PRE/PREPOST handlers and invoke
> > > > the pre-execution method. After task execution we could instantiate
> > > > POST handlers and invoke the postExecution method for both POST and
> > > > PREPOST handlers.
> > > >
> > > > I guess "handler" may not be the correct name here; we could call
> > > > these handlers task wrappers, as they are referred to in OODT. Let me
> > > > know your feedback.
> > > >
> > > > Cheers,
> > > > Sanjaya
> > > >
> > >
> > >
> > >
> > > --
> > > System Analyst Programmer
> > > PTI Lab
> > > Indiana University
> > >
> >
>
