oodt-dev mailing list archives

From "Mallder, Valerie" <Valerie.Mall...@jhuapl.edu>
Subject RE: what is batch stub? Is it necessary?
Date Wed, 08 Oct 2014 23:24:09 GMT
Hi Rishi,

Thank you very much for pointing me to your working example. This is very helpful.  My pgeConfig
looks very similar to yours.  So, I commented out the resource manager like you suggested
and tried running again without the resource manager. And my problem still exists. The problem
is that the ExternScriptTaskInstance is unable to recognize the command line arguments that
I want to pass to the crawler_launcher script. Could you send me a link to your tasks.xml
file? I'm curious as to how you defined your task.  My pgeConfig and tasks.xml are below.

Thanks!
Val


<?xml version="1.0" encoding="UTF-8"?>
<pgeConfig>

   <!-- How to run the PGE -->
   <exe dir="[JobDir]" shell="/bin/sh" envReplace="true">
        <cmd>[CRAWLER_HOME]/bin/crawler_launcher --operation --launchAutoCrawler \
        --filemgrUrl [FILEMGR_URL] \
        --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory \
        --productPath [JobInputDir] \
        --mimeExtractorRepo [OODT_HOME]/extensions/policy/mime-extractor-map.xml \
        --actionIds MoveFileToLevel0Dir</cmd>
   </exe>

   <!-- Files to ingest -->
   <output>
   </output>

   <!-- Custom metadata to add to output files -->
   <customMetadata>
      <metadata key="JobDir" val="[OODT_HOME]"/>
      <metadata key="JobInputDir" val="[FEI_DROP_DIR]"/>
      <metadata key="JobOutputDir" val="[JobDir]/data/pge/jobs"/>
      <metadata key="JobLogDir" val="[JobDir]/data/pge/logs"/>
   </customMetadata>

</pgeConfig>



<!-- tasks.xml **************************************************-->

<cas:tasks xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">

   <task id="urn:oodt:crawlerLauncherId" name="crawlerLauncherName" class="org.apache.oodt.cas.workflow.examples.ExternScriptTaskInstance">
      <conditions/>  <!-- There are no pre execution conditions right now -->
      <configuration>

          <property name="ShellType" value="/bin/sh" />
          <property name="PathToScript" value="[CRAWLER_HOME]/bin/crawler_launcher" envReplace="true" />

          <property name="PGETask_Name" value="crawler_launcher PGE Task"/>
          <property name="PGETask_ConfigFilePath" value="[OODT_HOME]/extensions/config/crawler-pge-config.xml" envReplace="true" />
      </configuration>
   </task>

</cas:tasks>
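
Before digging further into how the arguments are passed, it may be worth ruling out a plain XML parse error in either file. The following is a minimal sketch, not part of OODT: the function names `check_well_formed` and `check_files` are made up for illustration, and it only tests well-formedness with Python's standard library. It says nothing about whether CAS-PGE or the Workflow Manager accepts the elements and attributes used.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses as XML, else the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


def check_files(paths):
    """Map each file path to None (well-formed) or an error message."""
    results = {}
    for path in paths:
        # Read as bytes so the parser honours the file's own encoding declaration.
        with open(path, "rb") as fh:
            results[path] = check_well_formed(fh.read())
    return results
```

For example, `check_files(["crawler-pge-config.xml", "tasks.xml"])` (file names assumed here) would return a dictionary whose values are `None` for files that parse and an error string, including line and column, for files that do not.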

Valerie A. Mallder
New Horizons Deputy Mission System Engineer
Johns Hopkins University/Applied Physics Laboratory


> -----Original Message-----
> From: Verma, Rishi (398J) [mailto:Rishi.Verma@jpl.nasa.gov]
> Sent: Wednesday, October 08, 2014 6:01 PM
> To: dev@oodt.apache.org
> Subject: Re: what is batch stub? Is it necessary?
>
> Hi Valerie,
>
> >>>> All I am trying to do is run "crawler_launcher" as a workflow task
> >>>> in the CAS PGE environment.
>
> Interesting. I have a working example here [1] you can look at that does this exact
> thing.
>
> >>>> So, if "batchstub" is necessary in this scenario, please tell me
> >>>> what it is, why it is necessary, and how to run it (please provide
> >>>> exact syntax to put in my startup shell script, because I would
> >>>> never be able to figure it out for myself and I don't want to have
> >>>> to bother everyone again.)
>
> Batchstub is only necessary if your Workflow Manager is sending jobs to Resource
> Manager for execution (where the default execution is to run the job in something
> called a "batch stub" executable). Think of batch stubs as a small wrapper
> program that takes a bundle of executable instructions from Resource Manager,
> and executes them in a shell environment within a given remote (or local) machine.
>
> Here's my suggestion:
> 1. Like Paul suggested, go to $OODT_HOME/resmgr/bin, and execute the
> following command (it'll start a batch stub in a terminal on port 2001):
>     > ./batch_stub 2001
>
> If the above step doesn't fix your problem, you can also try having Workflow
> Manager NOT send jobs to Resource Manager for execution, and instead execute
> jobs locally through Workflow Manager itself (on localhost only!). To disable job
> transfer to Resource Manager, you'll need to modify the Workflow Manager
> properties file ($OODT_HOME/wmgr/etc/workflow.properties), and specifically
> comment out the "org.apache.oodt.cas.workflow.engine.resourcemgr.url" line.
> I've done this in my example code below; see [2] for an exact example of this.
> After modifying workflow.properties, make sure to restart workflow manager
> ($OODT_HOME/wmgr/bin/wmgr stop   followed by $OODT_HOME/wmgr/bin/wmgr
> start).
>
> Thanks,
> Rishi
>
> [1] https://github.com/riverma/xdata-jpl-netscan/blob/master/oodt-netscan/pge/src/main/resources/policy/netscan-getipv4entriesrandomsample.xml
> [2] https://github.com/riverma/xdata-jpl-netscan/blob/master/oodt-netscan/workflow/src/main/resources/etc/workflow.properties
>
> On Oct 8, 2014, at 2:31 PM, Ramirez, Paul M (398J)
> <paul.m.ramirez@jpl.nasa.gov> wrote:
>
> > Valerie,
> >
> > I would have thought it would have just not used a batch stub by default. That
> > said if you go into the $OODT_HOME/resmgr/bin there should be a script to start a
> > batch stub. Right now on my phone I forget the name of the script but if you more
> > the file you will see the Java class name that corresponds to below. You should
> > specify a port when you run the script which from the looks of the output below
> > should be 2001.
> >
> > HTH,
> > Paul R
> >
> > Sent from my iPhone
> >
> >> On Oct 8, 2014, at 2:04 PM, Mallder, Valerie <Valerie.Mallder@jhuapl.edu> wrote:
> >>
> >> Well then, I'm proud to be a member :)  (I think .... )
> >>
> >>
> >> Valerie A. Mallder
> >> New Horizons Deputy Mission System Engineer Johns Hopkins
> >> University/Applied Physics Laboratory
> >>
> >>
> >>> -----Original Message-----
> >>> From: Bruce Barkstrom [mailto:brbarkstrom@gmail.com]
> >>> Sent: Wednesday, October 08, 2014 4:54 PM
> >>> To: dev@oodt.apache.org
> >>> Subject: Re: what is batch stub? Is it necessary?
> >>>
> >>> You have every right to bother everyone.
> >>> You won't get what you need unless you do.
> >>>
> >>> You get one honorary membership in the Society of General Agitators
> >>> - at the rank of Major Agitator.
> >>>
> >>> Bruce B.
> >>>
> >>> On Wed, Oct 8, 2014 at 4:49 PM, Mallder, Valerie
> >>> <Valerie.Mallder@jhuapl.edu> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> I am still having trouble getting my CAS PGE crawler task to run
> >>>> due to
> >>>> http://localhost:2001 being "down". I have spent the last 2 days
> >>>> tracing through the resource manager code and tracked this down to
> >>>> line 146 of LRUScheduler where the XmlRpcBatchMgr is failing to
> >>>> execute the task remotely, because on line 75 of
> >>>> XmlRpcBatchMgrProxy (that was instantiated by XmlRpcBatchMgr on its
> >>>> line 74) is trying to call "isAlive" on the webservice named
> >>>> "batchstub" which, to my knowledge, is not running because I have not
> >>>> done anything explicitly to run it.
> >>>>
> >>>> All I am trying to do is run "crawler_launcher" as a workflow task
> >>>> in the CAS PGE environment.  I had it running perfectly before I
> >>>> started trying to make it run as part of a workflow.  I really miss
> >>>> my crawler and really want it to run again :(
> >>>>
> >>>> So, if "batchstub" is necessary in this scenario, please tell me
> >>>> what it is, why it is necessary, and how to run it (please provide
> >>>> exact syntax to put in my startup shell script, because I would
> >>>> never be able to figure it out for myself and I don't want to have
> >>>> to bother everyone again.)
> >>>>
> >>>> Thanks so much!
> >>>>
> >>>> Val
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> Valerie A. Mallder
> >>>>
> >>>> New Horizons Deputy Mission System Engineer The Johns Hopkins
> >>>> University/Applied Physics Laboratory
> >>>> 11100 Johns Hopkins Rd (MS 23-282), Laurel, MD 20723
> >>>> 240-228-7846 (Office) 410-504-2233 (Blackberry)
> >>>>
> >>>>
>
> ---
> Rishi Verma
> NASA Jet Propulsion Laboratory
> California Institute of Technology
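
For reference, the workflow.properties change described in the quoted reply (commenting out the "org.apache.oodt.cas.workflow.engine.resourcemgr.url" line) can be sketched as a small text transformation. This is an illustration only: the helper name `comment_out_property` is hypothetical, the URL value in the sample is a placeholder rather than a confirmed default, and the sketch handles only `key=value` and `key:value` lines, not every form a Java properties file allows.

```python
RESMGR_KEY = "org.apache.oodt.cas.workflow.engine.resourcemgr.url"


def comment_out_property(text, key):
    """Prefix each active 'key=...' or 'key:...' line in properties text with '#'."""
    out = []
    for line in text.splitlines(keepends=True):
        stripped = line.lstrip()
        # Active means: starts with the exact key, followed by '=' or ':'.
        rest = stripped[len(key):].lstrip() if stripped.startswith(key) else ""
        is_active = rest[:1] in ("=", ":")
        out.append("#" + line if is_active else line)
    return "".join(out)


# Placeholder content; the real file lives at $OODT_HOME/wmgr/etc/workflow.properties.
sample = (
    "some.other.property=value\n"
    "org.apache.oodt.cas.workflow.engine.resourcemgr.url=http://localhost:9002\n"
)
print(comment_out_property(sample, RESMGR_KEY))
```

Lines that are already commented out are left alone, so running it twice gives the same result. As the quoted reply notes, the Workflow Manager has to be restarted (wmgr stop, then wmgr start) before the change takes effect.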

