From: Lou DeGenaro
To: user@uima.apache.org
Subject: Re: Infinte initialization of a process even after restarting DUCC
Date: Wed, 9 Jul 2014 05:55:13 -0400
In-Reply-To: <53BCD608.9090103@orkash.com>
References: <53BA6251.8010409@orkash.com> <53BBC489.80600@gmail.com> <53BCD608.9090103@orkash.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=047d7bea34c665f32304fdbfb41b

--047d7bea34c665f32304fdbfb41b
Content-Type: text/plain; charset=UTF-8

This looks like data from the Job Details page showing job-processes. For
that job, what does the Jobs page look like? Is the state Completed? If so,
the job is not running and the information on the Job Details page is
spurious.

Said another way, if check_ducc -k is working then all your DUCC daemons
were stopped and you had to re-start DUCC. Upon re-start (presuming you used
the default "warm" start) all previously running jobs are marked as
Completed. If the job itself is Completed yet the job-processes continue to
show an active state, then this is erroneous information, and I assert that
the job-processes are not really running. The fact that the Job Details page
reports otherwise is a bug that needs to be fixed (if it is not already
fixed in the next release).

Lou.

On Wed, Jul 9, 2014 at 1:41 AM, reshu.agarwal wrote:

> On 07/08/2014 03:44 PM, Jim Challenger wrote:
>
>> I like to stop ducc by issuing check_ducc -k a few times after stop_ducc.
>> This sends kill -9 to any ducc components that couldn't stop for some
>> reason. Unfortunately it can't kill zombies but once you have done
>> check_ducc -k it should not matter.
>> As Lou mentioned, the 1.1.0 release will make some of this situation
>> better but I've seen intense analytics leave hardware and software on
>> hosts in states that only kill -9 can effectively handle.
>>
> Dear Jim and Lou,
>
> I have tried all check_ducc -k and ./stop_ducc but the Job is showing
> incremented status till now as given below:
>
> Columns: Id, Log, Size, Host Name, PID, State Scheduler, Reason Scheduler
> (or extraordinary status), State Agent, Reason Agent, Exit, Time Init,
> Time Run, Time GC, PgIn, Swap, %CPU, RSS, Time Avg, Time Max, Time Min,
> Done, Error, Dispatch, Retry, Preempt, JConsole URL
>
> Id: 0
> Log: jd.out.log <ducc-servlet/log-data?fname=/disk2/ducc/ducc/logs/30696/jd.out.log>
> Size: 0.14   Host Name: S144   PID: 8408
> State Scheduler: Deallocated   Reason Scheduler: Voluntary
> State Agent: Stopped   Exit: ExitCode=0
> Time Init: 00   Time Run: 2:15:59:40   Time GC: 00
> PgIn: 57   Swap: 0.0   %CPU: 5.0   RSS: 0.2
> Time Avg: 6   Time Max: 16   Time Min: 1
> Done: 14   Error: 0   Dispatch: 0   Retry: 0   Preempt: 0
>
> Id: 10849
> Log: 696-UIMA-S1-8962.log <ducc-servlet/log-data?fname=/disk2/ducc/ducc/logs/30696/30696-UIMA-S144-8962.log>
> Size: 0.02   Host Name: S144   PID: 8962
> State Scheduler: Deallocated
> State Agent: Starting
> Time Init: 2:15:57:46 <uima-initialization-report.html?idJob=30696&idPro=10849>
> Time Run: 00   Time GC: 00
> PgIn: 0   Swap: 0.0   %CPU: 0.0   RSS: 0.0
>
> Id: 10848
> Log: 696-UIMA-S1-8503.log <ducc-servlet/log-data?fname=/disk2/ducc/ducc/logs/30696/30696-UIMA-S144-8503.log>
> Size: 0.02   Host Name: S144   PID: 8503
> State Scheduler: Deallocated   Reason Scheduler: Purged
> State Agent: Stopped   Exit: ExitCode=0
> Time Init: 50 <uima-initialization-report.html?idJob=30696&idPro=10848>
> Time Run: 2:15:58:15   Time GC: 02
> PgIn: 1102   Swap: 0.0   %CPU: 46.0   RSS: 2.3
> Time Avg: 6   Time Max: 16   Time Min: 1
> Done: 14   Error: 0   Dispatch: 0   Retry: 0   Preempt: 0
>
> Id: 10852
> Log: 696-UIMA-S2-11649.log <ducc-servlet/log-data?fname=/disk2/ducc/ducc/logs/30696/30696-UIMA-S143-11649.log>
> Size: 0.04   Host Name: S143   PID: 11649
> State Scheduler: Deallocated   Reason Scheduler: Voluntary
> State Agent: Stopped   Reason Agent: Discontinued   Exit: ExitCode=0
> Time Init: 31 <uima-initialization-report.html?idJob=30696&idPro=10852>
> Time Run: 00   Time GC: 02
> PgIn: 0   Swap: 0.0   %CPU: 1.0   RSS: 2.2
>
> --
> Thanks,
> Reshu Agarwal
>

--047d7bea34c665f32304fdbfb41b--