oodt-dev mailing list archives

From "Mattmann, Chris A (398J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: Apache Airavata-OODT Integration
Date Wed, 12 Jun 2013 16:34:49 GMT
Hi Sanjaya,

-----Original Message-----

From: Sanjaya Medonsa <sanjayamrt@gmail.com>
Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
Date: Monday, June 10, 2013 5:20 PM
To: "dev@airavata.apache.org" <dev@airavata.apache.org>
Cc: "dev@oodt.apache.org" <dev@oodt.apache.org>
Subject: Re: Apache Airavata-OODT Integration

>Hi Chris,
>       On configuration, I have gotten rid of all the configuration files,
>including pge-config.xml. All the required configuration is now set
>programmatically. Settings such as the FileManagerServer URL are
>configured in the airavata-server.properties file. I'll update the review
>request with the modified details.

Great work!
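
For readers following the thread, the properties-based setup described
above might look roughly like the sketch below. It only uses
java.util.Properties. The class name and the property key
"oodt.filemanager.url" are assumptions made for illustration; the thread
does not say which key airavata-server.properties actually uses.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: neither this class nor the key below is taken from
// the Airavata or OODT code bases.
class OodtIntegrationConfig {
    // Assumed key name for the FileManagerServer URL.
    static final String FM_URL_KEY = "oodt.filemanager.url";

    private final String fileManagerUrl;

    OodtIntegrationConfig(String fileManagerUrl) {
        this.fileManagerUrl = fileManagerUrl;
    }

    String getFileManagerUrl() {
        return fileManagerUrl;
    }

    // Reads the properties from any Reader (for example a FileReader on
    // airavata-server.properties) and fails fast if the URL is missing.
    static OodtIntegrationConfig load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String url = props.getProperty(FM_URL_KEY);
        if (url == null || url.trim().isEmpty()) {
            throw new IllegalStateException("Missing property: " + FM_URL_KEY);
        }
        return new OodtIntegrationConfig(url.trim());
    }
}
```

Failing fast on a missing URL keeps a misconfigured server from reaching
the staging step before the problem is noticed.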


>       I am still not quite clear on how to retrieve the staged file path
>properly. Currently I am using the getStagedFilePath method
>in ApacheAiravataWorkFlowInstanceImpl to regenerate the staged file path.
>While going through the OODT code, I saw a method in
>DataTransferer that notifies the FileManagerServer once a transfer is
>completed, but I couldn't see the same for product retrieval.
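
The "regenerate the staged file path" step mentioned above can be
sketched as joining the working directory with the file name returned by
the file manager lookup. This is not the code under review: the class and
method names are hypothetical, and the choice to keep only the last path
element of the name is an assumption made so the result stays inside the
working directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of rebuilding a staged file path; not taken from
// ApacheAiravataWorkFlowInstanceImpl.
class StagedFilePathSketch {

    // workingDir: directory the pre-handler staged the product into.
    // fileName:   file name reported for the product ID (assumed input).
    static Path stagedFilePath(String workingDir, String fileName) {
        if (fileName == null || fileName.isEmpty()) {
            throw new IllegalArgumentException("fileName must not be empty");
        }
        // Keep only the final element so "sub/dir/a.dat" becomes "a.dat".
        Path name = Paths.get(fileName).getFileName();
        if (name == null) {
            throw new IllegalArgumentException("fileName has no name element: " + fileName);
        }
        return Paths.get(workingDir).resolve(name).normalize();
    }
}
```

A path built this way is only a guess at where the transfer put the file,
which is why a notification or query from the file manager side would be
more reliable.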

Example:
http://svn.apache.org/repos/asf/oodt/trunk/pge/src/test/resources/pge-config.xml


Review Board tickets:
https://reviews.apache.org/r/4746/

https://reviews.apache.org/r/5382/


JIRA issue source (in OODT since 0.4):
  https://issues.apache.org/jira/browse/OODT-443


>       As you suggested, I'll improve my workflow using Apache Tika. I'd
>like to continue this as a parallel task. While modifying the staging
>implementation based on community feedback, I am currently looking at
>ingesting output back into OODT.

See above for info on file staging. I would strongly encourage you not
to reimplement CAS-PGE in Airavata -- it's pretty functional and expressive
as it is, and I would work on figuring out how to make Airavata leverage
CAS-PGE.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



>
>
>
>On Wed, Jun 5, 2013 at 12:11 AM, Mattmann, Chris A (398J) <
>chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Hi Sanjaya,
>>
>> I think starting out with /bin/ls would be good, maybe like a /bin/ls
>> workflow, and then for each file returned, maybe run Apache Tika and
>> extract its metadata and then pipe that to a file?
>>
>> How about that?
>>
>> Cheers,
>> Chris
>>
>>
>> -----Original Message-----
>> From: Sanjaya Medonsa <sanjayamrt@gmail.com>
>> Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
>> Date: Tuesday, June 4, 2013 5:31 AM
>> To: "dev@airavata.apache.org" <dev@airavata.apache.org>
>> Cc: "dev@oodt.apache.org" <dev@oodt.apache.org>
>> Subject: Re: Apache Airavata-OODT Integration
>>
>> >Hi Chris,
>> >     Please see my comments below on the two items.
>> >
>> >Configuration: It should be possible to set them programmatically.
>> >Actually, I have partly implemented this for the file staging
>> >information. I'll work to get rid of the other configuration files.
>> >
>> >Staged File Path: I'll work on the suggested approach, though I do not
>> >fully understand it at the moment. I guess I need to go through a bit
>> >more of CAS-PGE and come back to you on the proposed approach.
>> >
>> >Currently I am testing this by wrapping the /bin/ls command as a GFac
>> >service. I may need to test this with a real workflow. Could you please
>> >give me some guidance on a better scenario to test this?
>> >
>> >Cheers,
>> >Sanjaya
>> >
>> >
>> >
>> >
>> >On Mon, Jun 3, 2013 at 8:17 PM, Mattmann, Chris A (398J) <
>> >chris.a.mattmann@jpl.nasa.gov> wrote:
>> >
>> >> Hi Sanjaya,
>> >>
>> >> -----Original Message-----
>> >>
>> >> From: Sanjaya Medonsa <sanjayamrt@gmail.com>
>> >> Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
>> >> Date: Thursday, May 30, 2013 5:12 AM
>> >> To: "dev@oodt.apache.org" <dev@oodt.apache.org>,
>> >>     "dev@airavata.apache.org" <dev@airavata.apache.org>
>> >> Subject: Apache Airavata-OODT Integration
>> >>
>> >> >Hi,
>> >> >     I have worked on the Apache Airavata integration with Apache
>> >> >OODT. As a first step, I have implemented integration with the
>> >> >Apache OODT file manager component.
>> >>
>> >> Great work!!
>> >>
>> >> Comments below:
>> >>
>> >> >      1. Introduced a new GFac Schema type called OODTProduct, which
>> >> >takes Apache OODT product IDs as input.
>> >> >      2. Implemented a new pre GFac Handler by extending Apache OODT
>> >> >PgeTaskInstance to stage the corresponding file into the working
>> >> >directory.
>> >> >      3. Once the file is staged, the input parameter with the OODT
>> >> >product ID is replaced with the path of the staged file for
>> >> >downstream processing.
>> >> >
>> >> >I have tested the implementation with a GFac application that wraps
>> >> >the /bin/ls command. The application takes a product ID as input,
>> >> >stages the corresponding file into the working directory, and
>> >> >executes /bin/ls against the staged file.
>> >> >Hope this is a valid testing scenario.
>> >> >
>> >> >Concerns
>> >> >- Configurations: I have added a new configuration file named
>> >> >oodt-integration.properties in addition to the dynamic_metadata.met
>> >> >and pge-config.xml files used by OODT. But at the moment there is no
>> >> >item configured in oodt-integration.properties.
>> >>
>> >> You probably only need the pge-config.xml file. Dynamic metadata and
>> >> the task configuration properties can be specified programmatically,
>> >> right?
>> >>
>> >> >- Staged File Name: With the current implementation of
>> >> >PgeTaskInstance it is not possible to retrieve the path of the staged
>> >> >file. Due to this limitation, I query the FileManagerServer with the
>> >> >product ID, retrieve the file name, and compute the file path using
>> >> >the working directory information.
>> >>
>> >> I'm not sure I understand this? If you store and record the Filename
>> >> and FileLocation metadata fields, then you can easily retrieve the
>> >> staged file path via a SQL query via CAS-PGE by simply setting
>> >> FORMAT=('$FileLocation/$Filename') in the response.
>> >> Can you comment on this?
>> >>
>> >> >- Currently it is not possible to execute the workflow using XBaya
>> >> >due to a validation failure caused by the new schema type. I have
>> >> >commented out the relevant validation code for testing purposes.
>> >>
>> >> OK, will probably need to work on this.
>> >>
>> >> >
>> >> >Currently I am having an issue with the Review Board client tool and
>> >> >need to resolve it to upload the code for review.
>> >>
>> >> I see later that you got this working, so will head over and review
>> >> that now.
>> >>
>> >> Thanks!
>> >>
>> >> Cheers,
>> >> Chris
>> >>
>> >>
>> >>
>> >>
>> >>
>>
>>
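
The FORMAT=('$FileLocation/$Filename') suggestion quoted in this thread
amounts to filling a template from a product's metadata. The sketch below
only illustrates that substitution idea in plain Java. It is not how
CAS-PGE or its SQL query support is implemented, and the class name,
method name, and sample values are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of "$Key" substitution; not OODT code.
class FormatTemplateSketch {
    // A placeholder is a dollar sign followed by word characters.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$(\\w+)");

    // Replaces every $Key in the template with the matching metadata value.
    static String expand(String template, Map<String, String> metadata) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String value = metadata.get(key);
            if (value == null) {
                throw new IllegalArgumentException("No metadata value for " + key);
            }
            // quoteReplacement keeps '$' and '\' in values from being
            // interpreted as regex group references.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With FileLocation and Filename recorded for a product, expanding
"$FileLocation/$Filename" gives the full path in one step, with no need
to recompute it from the working directory.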

