From: "Mattmann, Chris A (388J)"
Reply-To: user@oodt.apache.org
Subject: Re: OODT Workflow Wiki
Date: Tue, 10 Apr 2012 14:08:28 +0000

Hi Mike,

On Apr 9, 2012, at 9:12 AM, Cayanan, Michael D (388J) wrote:

> Hey Chris,
>
> Comments are below.
>>
>> "At the time of this writing, jobs that cannot be added to the queue
>> disappear...."
>>
>> I think we should be more clear than "disappear". They don't disappear.
>> The Scheduler will try and send a Job to the BatchMgr, and if there is an
>> exception, it tries to re-queue the Job back onto the JobStack. If it's
>> unable to do that, then there is an issue, but it at the very least tries
>> to re-queue the job if there was an issue.
>
> The reason this blurb was put into the wiki was because when Gabe and I
> were looking through the Resource Manager code, this is what looks to be
> happening. Check out the piece of code that tries to add a job:

Reaching the max queue size is different from saying that jobs that cannot
be added to the queue disappear. I think we should explicitly state:

"At the time of this writing, when the queue has reached the max queue
size, a message is logged by the Scheduler saying there is a Job Queue
Exception adding a job to the queue, and then the Job is dropped."

I think that's more accurate based on your code walk. Based on your above
message, I was thinking that you were talking about Jobs that couldn't be
scheduled for whatever reason (e.g., the Batch Mgr being down, or a Batch
Stub being down), in which case they are re-queued.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++