airavata-dev mailing list archives

From Sanjaya Medonsa <sanjaya...@gmail.com>
Subject Re: Apache Airavata-OODT Integration
Date Mon, 17 Jun 2013 12:22:16 GMT
Thanks Chris. I'll update the implementation to use the file name instead of
the OODT product id.

Cheers,
Sanjaya
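
The approach agreed on in this thread (look products up by file name, take the most recently received match because file names are not guaranteed unique, and locate the staged file at [JobInputDir]/[Filename]) can be sketched roughly as follows. This is only an illustration in Python, not OODT or Airavata code: `ProductRecord`, `latest_by_filename`, `staged_path`, and the sample paths are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import PurePosixPath
from typing import Optional, Sequence


@dataclass(frozen=True)
class ProductRecord:
    """Hypothetical stand-in for a catalog entry; not an OODT class."""
    product_id: str
    filename: str
    received: datetime


def latest_by_filename(records: Sequence[ProductRecord],
                       filename: str) -> Optional[ProductRecord]:
    """Return the most recently received record with the given file name."""
    matches = [r for r in records if r.filename == filename]
    # File names are not guaranteed unique, so pick the newest match explicitly.
    return max(matches, key=lambda r: r.received, default=None)


def staged_path(job_input_dir: str, filename: str) -> str:
    """Staged file location, following the [JobInputDir]/[Filename] rule."""
    return str(PurePosixPath(job_input_dir) / filename)


# Made-up sample data: two products share the same file name.
records = [
    ProductRecord("id-1", "data.txt", datetime(2013, 6, 1)),
    ProductRecord("id-2", "data.txt", datetime(2013, 6, 14)),
    ProductRecord("id-3", "other.txt", datetime(2013, 6, 10)),
]
chosen = latest_by_filename(records, "data.txt")
print(chosen.product_id, staged_path("/tmp/job-42/input", chosen.filename))
```

Selecting with `max` on the received time avoids relying on the order in which a catalog happens to return results; if the results are already sorted newest first, taking the first element would be equivalent.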


On Sun, Jun 16, 2013 at 12:51 AM, Mattmann, Chris A (398J) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> Hey Sanjaya, sure +1 use the Filename. It's not guaranteed to be unique,
> but you can easily just pop the first one off the top (latest) and take
> that (since it's sorted by product received time). You may check out the
> pcs-core module and some of its internal classes like FileManagerUtils
> to see some cool helper functions that could aid in this regard.
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
> -----Original Message-----
> From: Sanjaya Medonsa <sanjayamrt@gmail.com>
> Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
> Date: Saturday, June 15, 2013 4:04 AM
> To: Airavata Dev <dev@airavata.apache.org>
> Subject: Re: Apache Airavata-OODT Integration
>
> >Thanks Chris for your help! The working directory is available in the
> >JobExecutionContext in Airavata, so the directory can easily be retrieved.
> >The issue in my case is that the XBaya GUI takes the product id as input,
> >not the file name. Internally, the file stager queries the file manager
> >using the product id to retrieve the product reference and the
> >corresponding file name, and then stages the file into the input dir. Since
> >this product id to file name mapping happens internally during file
> >staging, my implementation doesn't have access to the file name unless I
> >query the file manager to retrieve it using the product id.
> >
> >One of the major issues in my implementation seems to be that I use the
> >OODT product id as input, not the file name. Should I change my
> >implementation to use the file name instead of the product id?
> >
> >Best Regards,
> >Sanjaya
> >
> >
> >On Fri, Jun 14, 2013 at 8:51 PM, Mattmann, Chris A (398J) <
> >chris.a.mattmann@jpl.nasa.gov> wrote:
> >
> >> Hey Sanjaya,
> >>
> >> Easy, see the attached PGEConfig.xml here:
> >>
> >> http://paste.apache.org/6OGW
> >>
> >> In that file:
> >>
> >> 1. We compute the staged file path by computing JobDir
> >> 2. We create in the exe block a staged input dir
> >> 3. We stage the files just using cps in the exeBlock (could have
> >> just as easily used fileStager)
> >> 4. We know that the file is [JobInputDir]/[Filename]
> >>
> >> HTH.
> >>
> >> Cheers,
> >> Chris
> >>
> >> -----Original Message-----
> >> From: Sanjaya Medonsa <sanjayamrt@gmail.com>
> >> Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
> >> Date: Friday, June 14, 2013 5:02 AM
> >> To: Airavata Dev <dev@airavata.apache.org>
> >> Subject: Re: Apache Airavata-OODT Integration
> >>
> >> >Thanks Chris for your input. I actually use the PGETaskInstance for file
> >> >staging with minimal additional code, but my issue is not with the file
> >> >staging itself. In my current implementation, the application takes a
> >> >product id as input and then uses the capabilities of the PGETaskInstance
> >> >class to do the file staging. During staging, the product is mapped to a
> >> >file in the specified working directory, and I don't have a way to
> >> >retrieve the staged file name, as it is not recorded in Metadata (for this
> >> >purpose, I query the FileManager again to get the corresponding reference
> >> >name for a given product id). I need the staged file path because I
> >> >replace the input product id with the staged file path prior to the actual
> >> >workflow invocation. Basically, I am looking for an implementation where
> >> >I can easily retrieve the staged file path for a given product id.
> >> >
> >> >Cheers,
> >> >Sanjaya
> >> >
> >> >
> >> >On Wed, Jun 12, 2013 at 10:04 PM, Mattmann, Chris A (398J) <
> >> >chris.a.mattmann@jpl.nasa.gov> wrote:
> >> >
> >> >> Hi Sanjaya,
> >> >>
> >> >> -----Original Message-----
> >> >>
> >> >> From: Sanjaya Medonsa <sanjayamrt@gmail.com>
> >> >> Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
> >> >> Date: Monday, June 10, 2013 5:20 PM
> >> >> To: "dev@airavata.apache.org" <dev@airavata.apache.org>
> >> >> Cc: "dev@oodt.apache.org" <dev@oodt.apache.org>
> >> >> Subject: Re: Apache Airavata-OODT Integration
> >> >>
> >> >> >Hi Chris,
> >> >> >       On configuration, I have gotten rid of all the configuration files,
> >> >> >including pge-config.xml. All the required configurations are set
> >> >> >programmatically. Settings such as the FileManagerServer URL are
> >> >> >configured in the airavata-server.properties file. I'll update the review
> >> >> >request with the modified details.
> >> >>
> >> >> Great work!
> >> >>
> >> >>
> >> >> >       I am still not quite clear on how to retrieve the staged file path
> >> >> >properly. Currently I am using the getStagedFilePath method in
> >> >> >ApacheAiravataWorkFlowInstanceImpl to regenerate the staged file path.
> >> >> >While going through the OODT code, I saw a method in DataTransferer that
> >> >> >notifies the FileManagerServer once a transfer is completed, but I
> >> >> >couldn't find the equivalent for product retrieval.
> >> >>
> >> >> Example:
> >> >>
> >> >> http://svn.apache.org/repos/asf/oodt/trunk/pge/src/test/resources/pge-config.xml
> >> >>
> >> >>
> >> >> Review Board tickets:
> >> >> https://reviews.apache.org/r/4746/
> >> >>
> >> >> https://reviews.apache.org/r/5382/
> >> >>
> >> >>
> >> >> JIRA issue source (in OODT since 0.4):
> >> >>   https://issues.apache.org/jira/browse/OODT-443
> >> >>
> >> >>
> >> >> >       As you suggested, I'll improve my workflow using Apache Tika. I'd
> >> >> >like to continue this as a parallel task. While modifying the staging
> >> >> >implementation based on community feedback, I am currently looking at
> >> >> >ingesting output back into OODT.
> >> >>
> >> >> See above for info on file staging. I would strongly encourage you not
> >> >> to reimplement CAS-PGE in Airavata -- it's pretty functional and
> >> >> expressive anyways and I would work to figure out how to make Airavata
> >> >> leverage CAS-PGE.
> >> >>
> >> >> Cheers,
> >> >> Chris
> >> >>
> >> >> >
> >> >> >On Wed, Jun 5, 2013 at 12:11 AM, Mattmann, Chris A (398J) <
> >> >> >chris.a.mattmann@jpl.nasa.gov> wrote:
> >> >> >
> >> >> >> Hi Sanjaya,
> >> >> >>
> >> >> >> I think starting out with /bin/ls would be good, maybe like a /bin/ls
> >> >> >> workflow, and then for each file returned, maybe run Apache Tika and
> >> >> >> extract its metadata and then pipe that to a file?
> >> >> >>
> >> >> >> How about that?
> >> >> >>
> >> >> >> Cheers,
> >> >> >> Chris
> >> >> >>
> >> >> >>
> >> >> >> -----Original Message-----
> >> >> >> From: Sanjaya Medonsa <sanjayamrt@gmail.com>
> >> >> >> Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
> >> >> >> Date: Tuesday, June 4, 2013 5:31 AM
> >> >> >> To: "dev@airavata.apache.org" <dev@airavata.apache.org>
> >> >> >> Cc: "dev@oodt.apache.org" <dev@oodt.apache.org>
> >> >> >> Subject: Re: Apache Airavata-OODT Integration
> >> >> >>
> >> >> >> >Hi Chris,
> >> >> >> >     Please see my comments below on the two items.
> >> >> >> >
> >> >> >> >Configuration: It should be possible to set these programmatically.
> >> >> >> >I have already partly implemented this for the file staging
> >> >> >> >information. I'll work on getting rid of the other configuration files.
> >> >> >> >
> >> >> >> >Staged File Path: I'll work on the suggested approach, though I do
> >> >> >> >not fully understand it at the moment. I guess I need to read a bit
> >> >> >> >more about CAS-PGE and come back to you on the proposed approach.
> >> >> >> >
> >> >> >> >Currently I am testing this by wrapping the /bin/ls command as a GFac
> >> >> >> >service. I may need to test this with a real workflow. Could you
> >> >> >> >please give me some guidance on a better scenario to test this?
> >> >> >> >
> >> >> >> >Cheers,
> >> >> >> >Sanjaya
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> >On Mon, Jun 3, 2013 at 8:17 PM, Mattmann, Chris A (398J) <
> >> >> >> >chris.a.mattmann@jpl.nasa.gov> wrote:
> >> >> >> >
> >> >> >> >> Hi Sanjaya,
> >> >> >> >>
> >> >> >> >> -----Original Message-----
> >> >> >> >>
> >> >> >> >> From: Sanjaya Medonsa <sanjayamrt@gmail.com>
> >> >> >> >> Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
> >> >> >> >> Date: Thursday, May 30, 2013 5:12 AM
> >> >> >> >> To: "dev@oodt.apache.org" <dev@oodt.apache.org>,
> >> >> >> >>   "dev@airavata.apache.org" <dev@airavata.apache.org>
> >> >> >> >> Subject: Apache Airavata-OODT Integration
> >> >> >> >>
> >> >> >> >> >Hi,
> >> >> >> >> >     I have worked on the Apache Airavata integration with Apache
> >> >> >> >> >OODT. As a first step, I have implemented integration with the
> >> >> >> >> >Apache OODT file manager component.
> >> >> >> >>
> >> >> >> >> Great work!!
> >> >> >> >>
> >> >> >> >> Comments below:
> >> >> >> >>
> >> >> >> >> >      1. Introduced a new GFac schema type called OODTProduct, which
> >> >> >> >> >takes Apache OODT product IDs as input.
> >> >> >> >> >      2. Implemented a new pre-GFac handler by extending the Apache
> >> >> >> >> >OODT PgeTaskInstance to stage the corresponding file into the
> >> >> >> >> >working directory.
> >> >> >> >> >      3. Once the file is staged, the input parameter holding the OODT
> >> >> >> >> >product id is replaced with the path of the staged file for
> >> >> >> >> >downstream processing.
> >> >> >> >> >
> >> >> >> >> >I have tested the implementation with a GFac application that wraps
> >> >> >> >> >the /bin/ls command. The application takes a product id as input,
> >> >> >> >> >stages the corresponding file into the working directory, and
> >> >> >> >> >executes /bin/ls against the staged file.
> >> >> >> >> >I hope this is a valid testing scenario.
> >> >> >> >> >
> >> >> >> >> >Concerns
> >> >> >> >> >- Configurations: I have added a new configuration file named
> >> >> >> >> >oodt-integration.properties in addition to the dynamic_metadata.met
> >> >> >> >> >and pge-config.xml files used by OODT. At the moment, however,
> >> >> >> >> >nothing is configured in oodt-integration.properties.
> >> >> >> >>
> >> >> >> >> You probably only need the pge-config.xml file. Dynamic metadata and
> >> >> >> >> the task configuration properties can be specified programmatically,
> >> >> >> >> right?
> >> >> >> >>
> >> >> >> >> >- Staged File Name: With the current implementation of
> >> >> >> >> >PgeTaskInstance it is not possible to retrieve the path of the
> >> >> >> >> >staged file. Due to this limitation, I query the FileManagerServer
> >> >> >> >> >with the product id, retrieve the file name, and compute the file
> >> >> >> >> >path using the working directory.
> >> >> >> >>
> >> >> >> >> I'm not sure I understand this? If you store and record the Filename
> >> >> >> >> and FileLocation metadata fields, then you can easily retrieve the
> >> >> >> >> staged file path via a SQL query via CAS-PGE by simply setting
> >> >> >> >> FORMAT=('$FileLocation/$Filename') in the response.
> >> >> >> >> Can you comment on this?
> >> >> >> >>
> >> >> >> >> >- Currently it is not possible to execute the workflow using XBaya,
> >> >> >> >> >because validation fails on the new schema type. I have commented
> >> >> >> >> >out the relevant validation code for testing purposes.
> >> >> >> >>
> >> >> >> >> OK, will probably need to work on this.
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> >Currently I am having an issue with the Review Board client tool and
> >> >> >> >> >need to resolve it to upload the code for review.
> >> >> >> >>
> >> >> >> >> I see later that you got this working, so will head over and review
> >> >> >> >> that now.
> >> >> >> >>
> >> >> >> >> Thanks!
> >> >> >> >>
> >> >> >> >> Cheers,
> >> >> >> >> Chris
> >> >> >> >>
> >> >> >> >>
> >> >> >>
> >> >> >>
> >> >>
> >> >>
> >>
> >>
>
>
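
The format string quoted earlier in the thread, '$FileLocation/$Filename', expands two metadata keys into a single path. The substitution idea can be illustrated with Python's standard `string.Template`, which happens to use the same `$Name` placeholder syntax. This is only a sketch of the idea and says nothing about how CAS-PGE or the file manager query engine evaluates the format internally; `format_result` and the metadata values are made up for the example.

```python
from string import Template

# Format string as quoted in the thread; everything else here is illustrative.
FORMAT = "$FileLocation/$Filename"


def format_result(metadata: dict) -> str:
    """Expand the $Key placeholders in FORMAT using one product's metadata."""
    # substitute() raises KeyError if a referenced key is missing, which makes
    # a product without a recorded FileLocation or Filename fail loudly.
    return Template(FORMAT).substitute(metadata)


staged = format_result({
    "FileLocation": "/tmp/job-42/input",  # hypothetical staging directory
    "Filename": "data.txt",               # hypothetical file name
})
print(staged)
```

The point of the sketch is that the staged path is recoverable only when both metadata keys were recorded at ingest time, which is the precondition stated in the thread.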
