airavata-dev mailing list archives

From "Mattmann, Chris A (398J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: Apache Airavata-OODT Integration
Date Tue, 04 Jun 2013 18:41:07 GMT
Hi Sanjaya,

I think starting out with /bin/ls would be good: maybe a /bin/ls
workflow where, for each file returned, you run Apache Tika to
extract its metadata and then pipe that to a file?

How about that?
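
A minimal sketch of that test scenario, to make the three steps concrete. It is not Airavata, GFac, or OODT code. The function names and the JSON-lines output format are assumptions made for illustration, and `extract_metadata` is a standard-library stand-in for the point where a real workflow would call Apache Tika.

```python
import json
import mimetypes
import os
import subprocess
import tempfile

# Prefer /bin/ls as in the thread; fall back to whatever "ls" is on PATH.
LS = "/bin/ls" if os.path.exists("/bin/ls") else "ls"


def list_files(directory):
    """Step 1: run ls on the directory and return the regular files it reports."""
    result = subprocess.run(
        [LS, "-1", directory], capture_output=True, text=True, check=True
    )
    # One name per line; file names containing newlines are not handled here.
    paths = [os.path.join(directory, name) for name in result.stdout.splitlines()]
    return [p for p in paths if os.path.isfile(p)]


def extract_metadata(path):
    """Step 2 (stand-in): a real workflow would invoke Apache Tika here."""
    mime_type, _ = mimetypes.guess_type(path)
    return {
        "resourceName": os.path.basename(path),
        "Content-Type": mime_type or "application/octet-stream",
        "Content-Length": os.path.getsize(path),
    }


def run_workflow(directory, output_path):
    """Step 3: write one JSON metadata record per listed file to output_path."""
    files = list_files(directory)
    with open(output_path, "w", encoding="utf-8") as out:
        for path in files:
            out.write(json.dumps(extract_metadata(path)) + "\n")
    return len(files)


if __name__ == "__main__":
    source_dir = tempfile.mkdtemp()
    with open(os.path.join(source_dir, "example.txt"), "w") as handle:
        handle.write("hello")
    report = os.path.join(tempfile.mkdtemp(), "metadata.jsonl")
    count = run_workflow(source_dir, report)
    print(f"wrote {count} record(s) to {report}")
```

Keeping the listing, extraction, and output steps as separate functions mirrors how each could become its own workflow node, so the stand-in extractor can be swapped for a Tika call without touching the other two.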

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Sanjaya Medonsa <sanjayamrt@gmail.com>
Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
Date: Tuesday, June 4, 2013 5:31 AM
To: "dev@airavata.apache.org" <dev@airavata.apache.org>
Cc: "dev@oodt.apache.org" <dev@oodt.apache.org>
Subject: Re: Apache Airavata-OODT Integration

>Hi Chris,
>     Please see my comments below on the two items.
>
>Configuration : It should be possible to set these programmatically.
>Actually, I have already partly implemented this for the file staging
>information. I'll work to get rid of the other configuration files.
>
>Staged File Path : I'll work on the suggested approach, though I do not
>fully understand it at the moment. I guess I need to read a bit more
>about CAS-PGE and come back to you on the proposed approach.
>
>Currently I am testing this by wrapping the /bin/ls command as a GFac
>service. I may need to test this with a real workflow. Could you please
>give me some guidance on a better scenario to test this?
>
>Cheers,
>Sanjaya
>
>
>
>
>On Mon, Jun 3, 2013 at 8:17 PM, Mattmann, Chris A (398J) <
>chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Hi Sanjaya,
>>
>> -----Original Message-----
>>
>> From: Sanjaya Medonsa <sanjayamrt@gmail.com>
>> Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
>> Date: Thursday, May 30, 2013 5:12 AM
>> To: "dev@oodt.apache.org" <dev@oodt.apache.org>,
>> "dev@airavata.apache.org" <dev@airavata.apache.org>
>> Subject: Apache Airavata-OODT Integration
>>
>> >Hi,
>> >     I have worked on the Apache Airavata integration with Apache
>> >OODT. As a first step, I have implemented integration with the
>> >Apache OODT file manager component.
>>
>> Great work!!
>>
>> Comments below:
>>
>> >      1. Introduced a new GFac Schema type called OODTProduct, which
>> >takes Apache OODT product IDs as input.
>> >      2. Implemented a new pre GFac Handler by extending the Apache OODT
>> >PgeTaskInstance to stage the corresponding file into the working
>> >directory.
>> >      3. Once the file is staged, the input parameter holding the OODT
>> >product ID is replaced with the path of the staged file for
>> >downstream processing.
>> >
>> >I have tested the implementation with a GFac application which wraps
>> >the /bin/ls command. The application takes a product ID as input and
>> >stages the corresponding file into the working directory, and /bin/ls
>> >is then executed against the staged file.
>> >I hope this is a valid testing scenario.
>> >
>> >Concerns
>> >- Configurations : I have added a new configuration file named
>> >oodt-integration.properties in addition to the dynamic_metadata.met and
>> >pge-config.xml files used by OODT. But at the moment nothing is
>> >configured in oodt-integration.properties.
>>
>> You probably only need the pge-config.xml file. Dynamic metadata and
>> the task configuration properties can be specified programmatically,
>> right?
>>
>> >- Staged File Name - With the current implementation of
>> >PgeTaskInstance it is not possible to retrieve the path of the staged
>> >file. Due to this limitation, I query the FileManagerServer with the
>> >product ID, retrieve the file name, and compute the file path from the
>> >working directory.
>>
>> I'm not sure I understand this? If you store and record the Filename
>> and FileLocation metadata files, then you can easily retrieve the
>> staged file path via a SQL query via CAS-PGE by simply setting the
>> FORMAT=('$FileLocation/$Filename') in the response.
>> Can you comment on this?
>>
>> >- Currently it is not possible to execute the workflow using XBaya,
>> >because validation fails on the new schema type. I have commented out
>> >the relevant validation code for testing purposes.
>>
>> OK, will probably need to work on this.
>>
>> >
>> >Currently I am having an issue with the Review Board client tool and
>> >need to resolve it to upload the code for review.
>>
>> I see later that you got this working, so will head over and review that
>> now.
>>
>> Thanks!
>>
>> Cheers,
>> Chris
>>
>>
>>
>>
>>

