oodt-dev mailing list archives

From "Davoodi, Faranak (388J)" <Faranak.Davo...@jpl.nasa.gov>
Subject Re: CAS_PGE's ExternExtractorMetWriter config file
Date Sat, 16 Apr 2011 02:51:19 GMT
Based on the document you sent me, for a simple Python script that
runs like this: kml.python [input] [outputmet], I need a config
file like:


<?xml version="1.0" encoding="UTF-8"?>
<cas:externextractor xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
   <exec workingDir="">
      <extractorBinPath>/usr/local/extractors/mp3extractor/mp3PythonExtractor.py</extractorBinPath>
      <args>
         <arg isDataFile="true"/>
      </args>
   </exec>
</cas:externextractor>
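
A minimal sketch of how a config like this could be read as a command line, assuming the extern extractor simply runs the binary in extractorBinPath followed by each arg in order, substituting the data file for isDataFile="true" and the met file for isMetFile="true". The function name build_command is hypothetical and this is not OODT's implementation; workingDir and any other attributes are ignored here.

```python
import xml.etree.ElementTree as ET

# The config from this message, with extractorBinPath on a single line.
CONFIG = b"""<?xml version="1.0" encoding="UTF-8"?>
<cas:externextractor xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
   <exec workingDir="">
      <extractorBinPath>/usr/local/extractors/mp3extractor/mp3PythonExtractor.py</extractorBinPath>
      <args>
         <arg isDataFile="true"/>
      </args>
   </exec>
</cas:externextractor>"""


def build_command(config_xml, data_file, met_file):
    """Expand the <exec> element into an argument list (assumed semantics)."""
    root = ET.fromstring(config_xml)
    exec_el = root.find("exec")  # child elements carry no namespace prefix
    command = [exec_el.findtext("extractorBinPath").strip()]
    for arg in exec_el.find("args"):
        if arg.get("isDataFile") == "true":
            command.append(data_file)   # placeholder for the input product
        elif arg.get("isMetFile") == "true":
            command.append(met_file)    # placeholder for the output met file
        else:
            command.append((arg.text or "").strip())  # literal argument
    return command


print(build_command(CONFIG, "/data/song.mp3", "/data/song.mp3.met"))
```

Under that assumption, a script that expects both [input] and [outputmet] on its command line would need a second arg with isMetFile="true" after the data file arg.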


Do I have to specify the working directory, as in the first line: <exec
workingDir="">? I don't see that line in PEATE's sample file.
Also, I see the Python extractor has these lines, which mine doesn't. Is
this the reason my Python script doesn't get run? The Python file I have
simply parses the file and generates the output met.


cmd = "java -jar /Users/woollard/Desktop/extractors/mp3extractor/"
cmd += "tika-app-0.5-SNAPSHOT.jar -m "+fullPath+" | awk -F:"
cmd += " 'BEGIN {print \"<cas:metadata xmlns:cas="
cmd += "\\\"http://oodt.jpl.nasa.gov/1.0/cas\\\">\"}"
cmd += " {print \"<keyval><key>\"$1\"</key><val>\"substr($2,2)\""
cmd += "</val></keyval>\"}' > "+fileName+".met"






#!/usr/bin/python

import os
import sys

fullPath = sys.argv[1]
pathElements = fullPath.split("/")
fileName = pathElements[len(pathElements)-1]
fileLocation = fullPath[:(len(fullPath)-len(fileName))]
productType = "MP3"

cmd = "java -jar /Users/woollard/Desktop/extractors/mp3extractor/"
cmd += "tika-app-0.5-SNAPSHOT.jar -m "+fullPath+" | awk -F:"
cmd += " 'BEGIN {print \"<cas:metadata xmlns:cas="
cmd += "\\\"http://oodt.jpl.nasa.gov/1.0/cas\\\">\"}"
cmd += " {print \"<keyval><key>\"$1\"</key><val>\"substr($2,2)\""
cmd += "</val></keyval>\"}' > "+fileName+".met"

os.system(cmd)

f = open(fileName+".met", 'a')
f.write('<keyval><key>ProductType</key><val>'+productType)
f.write('</val></keyval>\n<keyval><key>Filename</key><val>')
f.write(fileName+'</val></keyval>\n<keyval><key>FileLocation')
f.write('</key><val>'+fileLocation+'</val></keyval>\n')
f.write('</cas:metadata>')
f.close()
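
For comparison, a minimal sketch of the same met-file layout written without the java/awk pipeline, for a script that only parses a file and writes its own keys. The names build_met and write_met and the extra dictionary are hypothetical; the element layout (cas:metadata, keyval, key, val) and the ProductType, Filename, and FileLocation keys are copied from the script above. Values are XML-escaped, which the script above does not do.

```python
import os
from xml.sax.saxutils import escape

CAS_NS = "http://oodt.jpl.nasa.gov/1.0/cas"


def build_met(full_path, product_type, extra=None):
    """Return the met XML for full_path as a string."""
    file_name = os.path.basename(full_path)
    # Keep the trailing slash, as the slicing in the script above does.
    file_location = full_path[: len(full_path) - len(file_name)]
    pairs = list((extra or {}).items())
    pairs += [
        ("ProductType", product_type),
        ("Filename", file_name),
        ("FileLocation", file_location),
    ]
    lines = ['<cas:metadata xmlns:cas="%s">' % CAS_NS]
    for key, val in pairs:
        # escape() protects &, < and > inside keys and values
        lines.append(
            "<keyval><key>%s</key><val>%s</val></keyval>"
            % (escape(key), escape(val))
        )
    lines.append("</cas:metadata>")
    return "\n".join(lines) + "\n"


def write_met(full_path, met_path, product_type, extra=None):
    """Write the met XML to met_path (the [outputmet] argument)."""
    with open(met_path, "w") as f:
        f.write(build_met(full_path, product_type, extra))
```

A script invoked as [input] [outputmet] could then call write_met(sys.argv[1], sys.argv[2], productType) after its own parsing step, passing whatever it extracted as extra.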




On 4/15/11 6:22 PM, "Davoodi, Faranak (388J)"
<Faranak.Davoodi@jpl.nasa.gov> wrote:

>Thanks Brian. The document was actually very helpful.
>
>--Faranak
>
>From: holenoter <holenoter@me.com>
>Reply-To: "dev@oodt.apache.org" <dev@oodt.apache.org>
>Date: Fri, 15 Apr 2011 14:19:04 -0700
>To: "dev@oodt.apache.org" <dev@oodt.apache.org>
>Cc: "dev@oodt.apache.org" <dev@oodt.apache.org>
>Subject: Re: CAS_PGE's ExternExtractorMetWriter config file
>
>http://oodt.apache.org/components/maven/metadata/user/basic.html
>
>On Apr 15, 2011, at 02:09 PM, "Davoodi, Faranak (388J)"
><Faranak.Davoodi@jpl.nasa.gov> wrote:
>
>I have a couple of output products from which I am trying to extract extra
>metadata and add it to the final .met files. Here is how I run my files:
>
>Python [someBinDir]/ncdump [PathToPythonExtractor]/extractor1.py
>[PathToOutputProduct]/productName [PathToOutPutMet]
>
>How should I write the extern extractor config file:
>
>
><?xml version="1.0" encoding="UTF-8"?>
><cas:externextractor xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
>   <exec metFileExt="tmp.cas">
>      <extractorBinPath envReplace="true">[PathToPythonExtractor]</extractorBinPath>
>      <args>
>         <arg>python</arg>
>         <arg>[someBinDir]/ncdump</arg>
>         <arg>extractor1.py</arg>
>         <arg isDataFile="true"/>
>         <arg>-reader</arg>
>         <arg>Rtp3FileReader</arg>
>         <arg>--metFile</arg>
>         <arg>-toFile</arg>
>         <arg isMetFile="true"/>
>         <arg>-writer</arg>
>         <arg>XmlCasWriter</arg>
>      </args>
>   </exec>
></cas:externextractor>
>
>Thanks,
>Faranak

