oodt-dev mailing list archives

From: "Cheng, Cecilia S (388K)" <cecilia.s.ch...@jpl.nasa.gov>
Subject: Re: Start/Restart/Stop commands for FM, WM, RM
Date: Wed, 18 Apr 2012 16:23:39 GMT

Hi Cynthia,

I think the main reason to shut the components down gracefully is so
that tasks / jobs aren't lost. There are ways to achieve that even though
the 'stop' commands execute a brute-force 'kill'.

For example, you can pause the RM so that no more jobs are sent to the
batch stubs, then wait until all the running jobs are done before you
shut down the RM. When the RM is restarted, it rebuilds its queue from
the state it had before the shutdown. Please note that these capabilities
are implemented in the branched RM. ACOS has tested the pause capability,
but not the rebuild capability.
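
A minimal sketch of the pause, drain, stop, and rebuild sequence described
above, written as a toy Python model. It is not the OODT RM API: the class
ToyResourceManager, its method names, and the JSON state file are all
hypothetical, chosen only to illustrate the order of operations.

```python
import json
import os
import tempfile
from collections import deque


class ToyResourceManager:
    """Hypothetical stand-in for a job dispatcher; NOT the OODT RM API."""

    def __init__(self, state_path):
        self.state_path = state_path  # assumed location for persisted queue state
        self.queue = deque()          # jobs waiting to be dispatched
        self.running = set()          # jobs currently executing on a worker
        self.paused = False

    def submit(self, job_id):
        self.queue.append(job_id)

    def pause(self):
        # Stop handing out new jobs; jobs already running are left alone.
        self.paused = True

    def dispatch_next(self):
        """Hand the next queued job to a worker unless dispatching is paused."""
        if self.paused or not self.queue:
            return None
        job_id = self.queue.popleft()
        self.running.add(job_id)
        return job_id

    def job_finished(self, job_id):
        self.running.discard(job_id)

    def is_drained(self):
        return not self.running

    def shutdown(self):
        """Persist the waiting queue; refuse while jobs are still running."""
        if not self.is_drained():
            raise RuntimeError("jobs still running; wait before shutting down")
        with open(self.state_path, "w") as fh:
            json.dump(list(self.queue), fh)

    @classmethod
    def restart(cls, state_path):
        """Rebuild the queue from the state saved at shutdown, if any."""
        rm = cls(state_path)
        if os.path.exists(state_path):
            with open(state_path) as fh:
                rm.queue.extend(json.load(fh))
        return rm


def demo():
    path = os.path.join(tempfile.mkdtemp(), "rm_queue.json")
    rm = ToyResourceManager(path)
    for job in ("job-1", "job-2", "job-3"):
        rm.submit(job)
    started = rm.dispatch_next()       # job-1 starts running
    rm.pause()                         # no further jobs are dispatched
    assert rm.dispatch_next() is None
    rm.job_finished(started)           # wait for the running job to finish
    rm.shutdown()                      # safe now: nothing is running
    restored = ToyResourceManager.restart(path)
    return list(restored.queue)
```

In this sketch, calling demo() leaves the two jobs that were never
dispatched in the rebuilt queue, so nothing is lost even if the process
itself is then killed abruptly.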

You can do something similar in the WEngine as well.

-- cecilia

On 4/13/12 12:55 PM, "Wong, Cynthia L (388J)"
<cynthia.l.wong@jpl.nasa.gov> wrote:

>What is the behavior of these servers (FM, WM, RM) when the
>start/restart/stop commands are issued?
>
>For example, when we issue command "fmgr start", it does the following:
>
>Read properties files
>Read policy files
>Connect to database???
>
>When we issue command "fmgr stop", does it wait for the current file
>transfer to complete?
>
>When we issue command "wmgr stop", does it wait for the workflow tasks to
>complete and shut down gracefully?
>
>Is there documentation or javadocs to describe the details about these
>commands?
>
>Thanks,
>Cynthia
>
>--
>Cynthia L. Wong
>Data Management Systems and Technologies
>Jet Propulsion Laboratory
>4800 Oak Grove Drive, M/S  171-264, Pasadena, CA  91109-8099
>Phone:  818/393-2572, Email: Cynthia.L.Wong@jpl.nasa.gov
>

