Date: Wed, 14 Sep 2011 16:39:46 -0700
Subject: PGETask Workflow Metadata
From: Sheryl John
To: user@oodt.apache.org, chla-dev@jpl.nasa.gov
Reply-To: user@oodt.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=0015175cd98a7db89f04acef4671

--0015175cd98a7db89f04acef4671
Content-Type: text/plain; charset=ISO-8859-1

Hi,

I have defined some key-value pairs for a file (say, Output.csv) in a
metout-config.xml for my PGETask Workflow. However, after executing the
workflow, the met-config.xml is not creating an Output.csv.cas file.

I want to be able to use the above keys/metadata later on in an SQL-like
query from the pgeconfig file.
For example, if I've defined 'RecordID' as a key in the metout-config.xml,
I would want to use this metadata in the following query:

SQL(FORMAT='$FileLocation/$Filename'){ SELECT FileLocation,Filename,ISMTable,RecordID FROM ISMRawData WHERE ISMTable = 'Chartevents' AND RecordID = "PID"}

The other keys included in the query above are elements and product-types
that were defined during ingestion in the File Manager.

So, right now, the task fails to parse the above query when I run the
workflow. Is this because metout-config is not creating the Output.cas file?
And what's the best way to specify metadata files for a group of files or
for a folder?

Thanks,

On Fri, Sep 9, 2011 at 10:07 PM, Sheryl John wrote:

> Oh OK. So, it adds workflow metadata to the existing metadata.
> As you suggest, I will continue querying from the PGE Config for pulling
> specific files.
>
> I'll be glad to contribute to the merging of the PGETask Workflow
> Pre-condition to the trunk 0.4.
>
> Thanks Chris!
>
>
> On Fri, Sep 9, 2011 at 8:12 PM, Mattmann, Chris A (388J) <chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Hey Sheryl,
>>
>> On Sep 9, 2011, at 6:10 PM, Sheryl John wrote:
>>
>> > Hi,
>> >
>> > I have questions regarding the Workflow Manager, particularly the Met
>> > File writers and querying data from the File Manager.
>> >
>> > 1) For files that are required for a PGE task workflow, do I specify the
>> > metadata key-value pairs of the file (e.g. Key="TableName" Val="Chartevents")
>> > in the metout-config.xml?
>>
>> Basically metout-config.xml is for the specific
>> MetadataListPcsMetFileWriter [1] instance configured in your CAS-PGE
>> pge-config.xml file.
>> This file defines metadata to pull out of the workflow context metadata,
>> and to write (and merge) with the rest of the file metadata for the product
>> you are about to ingest. So, putting a key in metout-config.xml is like
>> saying "I'd like to copy this workflow context metadata to the file product
>> metadata".
>>
>> > And, how does the Workflow mgr use the values for the next step/task in
>> > the pipeline?
>>
>> See above.
>>
>> >
>> >
>> > 2) To query the File Manager, should I use the Query building option
>> > available in the PGETask Workflow Pre-Condition?
>> > I have previously used the SQL-like query in a Pge Config file to pull
>> > ingested files, but after reading the Workflow 2 Guide, I was wondering if
>> > the File Mgr querying should be done in the pre-condition of a task. So, this
>> > would be a pre-condition for checking if input files are available before a
>> > task begins. Is this right?
>>
>> To use the condition-based querying from PGETaskWorkflowCondition, you'd
>> need to use wengine-branch.
>>
>> Rather than do so, I'd recommend just wiring the SQL(... query into your
>> CAS PGE config and doing the querying there. It'll be
>> simpler, and I have plans to merge in PGETask Workflow Pre-Condition later
>> into the trunk for 0.4. If that's something you'd like
>> to help with, I'd love to see a JIRA issue and a patch and I'll happily
>> shepherd it in.
>>
>> Thanks Sheryl!
>>
>> Cheers,
>> Chris
>>
>> [1] http://s.apache.org/bW4
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: chris.a.mattmann@nasa.gov
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>
>
>
> --
> -Sheryl
>

--
-Sheryl

--0015175cd98a7db89f04acef4671--
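
Chris's description in the thread above (listing a key in metout-config.xml
is like saying "copy this workflow context metadata to the file product
metadata") can be illustrated with a small sketch. This is not OODT code and
does not use the MetadataListPcsMetFileWriter API; the function name, the
plain dictionaries, and the sample values are all assumptions made for
illustration. How OODT itself resolves a key that exists on both sides is
not stated in the thread, so the overwrite below is an arbitrary choice.

```python
def copy_workflow_metadata(workflow_context, product_metadata, keys_to_copy):
    """Return a new dict: product metadata plus the selected workflow keys.

    Hypothetical helper, not part of OODT. It only models the idea of
    copying named workflow context keys into the product's file metadata.
    """
    merged = dict(product_metadata)  # leave the caller's dict untouched
    for key in keys_to_copy:
        # Keys missing from the workflow context are skipped.
        if key in workflow_context:
            # Assumption: on a name clash the workflow value wins.
            merged[key] = workflow_context[key]
    return merged


# Hypothetical sample values, loosely modelled on the names in the thread.
workflow_context = {"RecordID": "PID", "RunDate": "2011-09-14"}
product_metadata = {"Filename": "Output.csv", "FileLocation": "/data/out"}

# Only the keys listed (as they would be in metout-config.xml) are copied.
merged = copy_workflow_metadata(workflow_context, product_metadata, ["RecordID"])
print(merged)
```

In this sketch RunDate stays out of the merged metadata because it was not
listed, which mirrors the point that only keys named in the config are
copied across.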