oodt-dev mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: Resource Manager client question
Date Wed, 09 May 2012 16:06:19 GMT
Hey Mike,

Anytime! Would be happy to help more as you guys progress.

Take care and keep rockin' on!

Cheers,
Chris

On May 9, 2012, at 8:42 AM, Iwunze, Michael C (GSFC-4700)[NOAA-JPSS] wrote:

> Thanks Chris, this was good information.
> 
> On 5/8/12 1:48 AM, "Mattmann, Chris A" <chris.a.mattmann@jpl.nasa.gov>
> wrote:
> 
>> Hey Cam,
>> 
>> Thanks, some comments below:
>> 
>> On May 7, 2012, at 8:26 PM, Cameron Goodale wrote:
>> 
>>> Hey Mike and Sheryl,
>>> 
>>> Mike was asking me for some similar advice and I plain ran outta talent on
>>> this topic.  From what I can tell Mike would like to run his python scripts
>>> on Resource Manager without the need for setting up Workflow or PGE.
>>> 
>>> At the time I hadn't really thought through all the configuration files
>>> needed, but having stewed on it I thought I should reply.  Now my current
>>> SnowDS implementation is to have the Workflow Task reference a CAS-PGE
>>> (which contains the execution block for my python program I want to run).
>>> Then the Workflow is merely configured to farm the jobs out to the
>>> Resource Manager.
>>> 
>>> Here is a list of questions that I have started to wonder about with Mike's
>>> help, any answers would be appreciated:
>>> 
>>> 1.  Can Resource Manager + Batchstubs be used without any additional OODT
>>> components?
>>> 
>> 
>> Yep one way to see this in action is to run the
>> org.apache.oodt.cas.resource.tools.JobSubmitter
>> tool by cd'ing into a resource manager deployment (let's assume
>> /usr/local/resmgr/bin) and then
>> running:
>> 
>> java -Djava.ext.dirs=../lib org.apache.oodt.cas.resource.tools.JobSubmitter
>> 
>> Which produces:
>> 
>> JobSubmitter --rUrl <resource mgr url> [options]
>> --file <job file path>
>> [--dir <job file dir path>]
>> 
>> This will let you submit a resource manager XML "job file", which looks like this:
>> 
>> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/resources/examples/jobs/exJob.xml
>> 
>> Key parameters there are:
>> 
>> Name - the human readable name of the job
>> Id - the id of the job
>> Instance Class - the JobInstance
>> Input Class - specification for how to read/write input for the job, with properties
>> 
>> That being said, interfacing with the resource manager at this level would be
>> a lot harder than simply running workflows, which yes, is the more
>> developer/user friendly interface for specifying tasks to run, which get
>> turned into jobs in resource manager ville.
>> 
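As a rough sketch of the JobSubmitter invocation described above, the following Python helper assembles the same command line so it can be scripted. It uses only the --rUrl and --file options shown in the usage text; the Resource Manager URL and the job file name in the example are hypothetical placeholders, and the helper function names are made up for illustration.

```python
import subprocess


def build_job_submitter_cmd(rm_url, job_file, lib_dir="../lib"):
    """Assemble the JobSubmitter command line from the usage text above."""
    return [
        "java",
        f"-Djava.ext.dirs={lib_dir}",
        "org.apache.oodt.cas.resource.tools.JobSubmitter",
        "--rUrl", rm_url,
        "--file", job_file,
    ]


def submit_job(rm_url, job_file, bin_dir="/usr/local/resmgr/bin"):
    """Run JobSubmitter from the deployment's bin directory.

    bin_dir defaults to the deployment path assumed in the message above;
    check=True raises CalledProcessError if the tool exits non-zero.
    """
    cmd = build_job_submitter_cmd(rm_url, job_file)
    return subprocess.run(cmd, cwd=bin_dir, check=True)


if __name__ == "__main__":
    # Hypothetical URL and job file; substitute your own deployment's values.
    print(" ".join(build_job_submitter_cmd("http://localhost:9002", "exJob.xml")))
```

Keeping the command construction separate from the call that runs it makes it easy to inspect the exact command before anything is submitted.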
>> 
>>> 2.  Is PGE required to run/wrap non-Java programs so they can run within
>>> Resource Manager?
>> 
>> Well, PGE doesn't directly run in Resource Manager. All workflow tasks are
>> submitted to Resource Manager using the TaskJob and TaskJobInput constructs:
>> 
>> http://s.apache.org/I6S
>> http://s.apache.org/8F1
>> 
>> 
>>> 
>>> Closing comments to Mike:
>>> 
>>>   If you are planning to use OODT for data management, it
>>> is initially very tempting to only setup and configure the minimal set of
>>> components because you will feel productive and it feels like progress is
>>> being made.  Trust me I know since I was in your shoes about 6 months ago
>>> when trying to get some image processing IDL code to run and I badly needed
>>> to see progress (notice I didn't use the words "make progress").  Because I
>>> wanted to use (what I thought was) the "easier" solution I ended up
>>> hardcoding paths to resources my python code needed in the code instead of
>>> passing the parameters into the code in the first place.  This worked
>>> reasonably well as long as everything stayed the same....but then it didn't
>>> so I had to re-visit my "easier" setup and fix it.
>>>   Recently I have been working to undo my mistakes and python has been
>>> very forgiving, but the best part was that  all the strange and mystic
>>> Workflow setups and PGEConfig.xml files actually started to make a whole
>>> lot more sense.  I am now able to configure and stand up a complete
>>> workflow config, then jump into PGEConfig and get the input parameters to
>>> my python code.  This means if the input files I need to process change, I
>>> don't need to change my python code, instead I can merely pass in a
>>> different set of parameters into the workflow and they will persist to my
>>> wrapped python.
>>>   In short I know that combing through all the xml config is tough,
>>> especially when things are not working as quickly as you would like.  I
>>> understand how defeating and frustrating it can be to have the component
>>> fail and just feel lost, not knowing what is causing the problem.  I know
>>> the documentation isn't perfect and sometimes it is missing altogether, but
>>> the people that are on this list will bend over backwards to help you
>>> understand (some will even share their config files with line-by-line
>>> comments included at no extra charge ;)
>>> 
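The point above about passing parameters in, rather than hardcoding paths, can be illustrated with a minimal Python sketch. The option names (--input-dir, --output-dir, --pattern) and the function names are hypothetical; in the setup described, whatever wraps the script (for example a PGE config) would supply the actual values.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Read input/output locations from the command line instead of hardcoding them."""
    parser = argparse.ArgumentParser(
        description="Process input files supplied by the caller."
    )
    parser.add_argument("--input-dir", type=Path, default=Path("."),
                        help="directory holding the files to process")
    parser.add_argument("--output-dir", type=Path, default=Path("out"),
                        help="directory where results should be written")
    parser.add_argument("--pattern", default="*.dat",
                        help="glob pattern selecting the input files")
    return parser.parse_args(argv)


def plan_outputs(inputs, output_dir):
    """Pair each input file with the output path it would be written to."""
    return [(src, output_dir / src.name) for src in inputs]


def main(argv=None):
    args = parse_args(argv)
    inputs = sorted(args.input_dir.glob(args.pattern))
    # This sketch only reports the plan; real processing would go here.
    for src, dst in plan_outputs(inputs, args.output_dir):
        print(f"{src} -> {dst}")


if __name__ == "__main__":
    main()
```

With this shape, a change in where the input files live only means passing a different --input-dir; the script itself stays untouched.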
>>> Thank you Sheryl for being awesome and helpful (you always are).  Mike keep
>>> the questions coming and I will be sure to add in my $0.02 when I am able
>>> to.
>> 
>> +1.
>> 
>> Cheers,
>> Chris
>> 
>>> 
>>> On Mon, May 7, 2012 at 5:09 PM, Sheryl John <sheryljj@gmail.com> wrote:
>>> 
>>>> Hi Mike,
>>>> 
>>>> Yup, you can run your python scripts, java programs etc. from CAS-PGE which
>>>> is used with the Workflow Manager. Check out this cas-pge guide [1] and the
>>>> other wiki pages related to workflow.
>>>> 
>>>> You can use Resource Manager to run tasks sent from the Workflow Manager.
>>>> I've recently started testing this but there are others on the list who can
>>>> guide you more on the Resource Manager.
>>>> 
>>>> HTH!
>>>> 
>>>> Sheryl
>>>> 
>>>> [1] https://cwiki.apache.org/OODT/cas-pge-learn-by-example.html
>>>> 
>>>> 
>>>> On Mon, May 7, 2012 at 3:43 PM, Iwunze, Michael C (GSFC-4700)[NOAA-JPSS]
>>>> <michael.iwunze@nasa.gov> wrote:
>>>> 
>>>>> 
>>>>> I have two questions. I am able to run the Resource Manager with no
>>>>> issues. I have some python scripts and possibly some other programs I
>>>>> would like to run using the Resource Manager. From what I know so far, I
>>>>> believe the cas-pge component needs to be used in conjunction with the
>>>>> Resource Manager and is used as a wrapper program for running my scripts.
>>>>> Can someone give me more information on how this can be accomplished, or
>>>>> are there any examples to view?
>>>>> 
>>>>> I would also like to be able to utilize the Job Scheduler, Monitor and
>>>>> Job queue classes that are part of the Resource Manager. I can't find any
>>>>> examples of how they are used anywhere. If examples do exist, can
>>>>> someone point me in the right direction or give me more information on
>>>>> this?
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Mike
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> -Sheryl
>>>> 
>> 
>> 
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: chris.a.mattmann@nasa.gov
>> WWW:   http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 
> 



