oodt-dev mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: OODT Workflow Wiki
Date Tue, 10 Apr 2012 14:08:28 GMT
Hi Mike,

On Apr 9, 2012, at 9:12 AM, Cayanan, Michael D (388J) wrote:

> Hey Chris,
> Comments are below.
>> "At the time of this writing, jobs that cannot be added to the queue
>> disappear...."
>> I think we should be more clear than "disappear". They don't disappear.
>> The Scheduler will try and send a Job to the BatchMgr, and if there is an
>> exception, it tries to re-queue the Job back onto the JobStack. If it's
>> unable to do that, then there is an issue, but it at the very least tries
>> to re-queue the job if there was an issue.
> The reason this blurb was put into the wiki was because when Gabe and I
> were looking through the Resource Manager code, this is what looks to be
> happening. Check out the piece of code that tries to add a job:

Reaching the max queue size is different from saying that jobs that cannot
be added to the queue disappear. I think we should explicitly state:

"At the time of this writing, when the queue has reached the max queue
size, a message is logged by the Scheduler saying there is a Job Queue
Exception adding a job to the queue, and then the Job is dropped."

I think that's more accurate based on your code walk. I was thinking based on
your above message that you were talking about Jobs that couldn't be
Scheduled for whatever reason (e.g., the Batch Mgr being down, or a
Batch Stub being down) in which case they are re-queued.
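
To make the distinction concrete, here is a minimal sketch of the two cases. It is an illustration only, not the actual Resource Manager code: every class and method name below (JobQueueException, BoundedJobQueue, BatchManager, SketchScheduler, submit, dispatchNext) is hypothetical, and jobs are reduced to plain string IDs.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.logging.Logger;

// Hypothetical names throughout; this is NOT the real OODT Resource Manager API.
class JobQueueException extends Exception {
    JobQueueException(String msg) { super(msg); }
}

// A FIFO queue that refuses new jobs once it holds maxSize entries.
class BoundedJobQueue {
    private final Deque<String> jobs = new ArrayDeque<>();
    private final int maxSize;

    BoundedJobQueue(int maxSize) { this.maxSize = maxSize; }

    void add(String jobId) throws JobQueueException {
        if (jobs.size() >= maxSize) {
            throw new JobQueueException("Reached max queue size: " + maxSize);
        }
        jobs.addLast(jobId);
    }

    String poll() { return jobs.pollFirst(); }

    int size() { return jobs.size(); }
}

// Stand-in for whatever actually runs a job (assumed interface).
interface BatchManager {
    void execute(String jobId) throws Exception;
}

class SketchScheduler {
    private static final Logger LOG = Logger.getLogger("SketchScheduler");
    private final BoundedJobQueue queue;
    private final BatchManager batchMgr;

    SketchScheduler(BoundedJobQueue queue, BatchManager batchMgr) {
        this.queue = queue;
        this.batchMgr = batchMgr;
    }

    // Case 1: the queue is full. The exception is logged and the job is dropped.
    // Returns true if the job was queued, false if it was dropped.
    boolean submit(String jobId) {
        try {
            queue.add(jobId);
            return true;
        } catch (JobQueueException e) {
            LOG.warning("Job Queue Exception adding job " + jobId + ": " + e.getMessage());
            return false;
        }
    }

    // Case 2: the job cannot be scheduled (e.g., the batch manager is down).
    // The job is put back on the queue rather than lost.
    // Returns true only if the job was handed off successfully.
    boolean dispatchNext() {
        String jobId = queue.poll();
        if (jobId == null) {
            return false; // nothing to dispatch
        }
        try {
            batchMgr.execute(jobId);
            return true;
        } catch (Exception e) {
            LOG.warning("Could not schedule job " + jobId + "; re-queueing: " + e.getMessage());
            submit(jobId); // the slot just freed by poll() is reused
            return false;
        }
    }
}
```

In this sketch, a job is lost only in the first case, when the queue is already at its maximum size; a failed hand-off to the batch manager leaves the job on the queue.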


Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
