From: Karl Wright
Date: Wed, 9 Jul 2014 05:03:55 -0400
Subject: RE: Apache ManifoldCF job stuck up
To: lalit jangra
Cc: "user@manifoldcf.apache.org"

So, Lalit, if you run the multiprocess example without any changes, do you see this? I don't.

Karl

-----Original Message-----
From: lalit jangra
Sent: 7/9/2014 12:17 AM
To: Karl Wright
Cc: user@manifoldcf.apache.org
Subject: Re: Apache ManifoldCF job stuck up

Thanks Karl,

I am currently running only a single agent process on a single machine, without clustering. I have two environments and I see this issue in both.

When I try to start the agent, it reports the error below and exits, even though no agent process is already running.

[root@server1 multiprocess-file-example]# ./start-agents.sh &
[1] 5020
[root@server1 multiprocess-file-example]# Running...
Configuration file successfully read
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service 'A' of type 'AGENT' is already active
    at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
    at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
    at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
    at org.apache.manifoldcf.agents.AgentRun.doExecute(AgentRun.java:54)
    at org.apache.manifoldcf.agents.BaseAgentsInitializationCommand.execute(BaseAgentsInitializationCommand.java:37)
    at org.apache.manifoldcf.agents.AgentRun.main(AgentRun.java:93)
[1]+ Exit 1 ./start-agents.sh

Even when I do manage to start the agent with ./start-agents.sh, it still throws the same error, although I have no other process running.

Regards.

On Tue, Jul 8, 2014 at 2:38 PM, Karl Wright wrote:

Hi Lalit,

This occurs when you have more than one agents process with the same process id using the same shared file system directory / ZooKeeper cluster. There is no other way it can occur.

Thanks,
Karl

-----Original Message-----
From: lalit jangra
Sent: 7/8/2014 8:38 AM
To: user@manifoldcf.apache.org
Subject: Re: Apache ManifoldCF job stuck up

Thanks Karl,

I tried the steps you suggested and they worked on one instance, but on another instance I still cannot resolve the issue. Along with the steps you mentioned, I tried recreating the DB instance, setting up a new MCF instance, cleaning the locks, and then starting the agents first and Tomcat second. The issue still persists.

If I run ./start-agents.sh, I get this error for agent A:
ERROR 2014-07-08 13:32:19,823 (Agents thread) - Exception tossed: Service 'A' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service 'A' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
    at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
    at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
    at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
    at org.apache.manifoldcf.agents.system.AgentsDaemon.checkAgents(AgentsDaemon.java:270)

But if I run ./start-agents-2.sh, I see a similar error for agent B:

ERROR 2014-07-08 13:32:19,823 (Agents thread) - Exception tossed: Service 'B' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service 'B' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
    at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
    at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
    at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
    at org.apache.manifoldcf.agents.system.AgentsDaemon.checkAgents(AgentsDaemon.java:270)

Regards.

On Mon, Jul 7, 2014 at 4:55 PM, Karl Wright wrote:

Hi Lalit,

If you are using file synchronization, you cannot expect MCF to clean up after itself unless you shut it down cleanly. You should use either ^C or plain kill, NEVER kill -9. kill -9 will leave dangling locks.
To clean up dangling locks:
- shut ALL manifoldcf processes and web apps down
- run the lock-clean script
- start up the processes again

Zookeeper synchronization, by the way, does not have this kind of problem.

Thanks,
Karl

On Mon, Jul 7, 2014 at 11:49 AM, lalit jangra wrote:

Hi,

I configured MCF 1.5.1 to run with a PostgreSQL DB and Tomcat 7. Initially I created all the connections and an Alfresco job, and it all worked fine.

Next, to make updates, I stopped Tomcat and the running agent process. Then I updated CmisRepositoryConnector.java with my own code and ran "ant build" at the root of MCF. It updated all the code and jar files. It also reset properties.xml under /dist/multiprocess-file/example, which I then updated with the PostgreSQL connection and logging configuration.

I started Tomcat, then the agent process. Finally I started the job to crawl Alfresco, but it got stuck and is not moving on. I checked the /dist/multiprocess-file/example/logs/manifoldcf.log file and saw the error below.

ERROR 2014-07-07 16:09:04,936 (Agents thread) - Exception tossed: Service '' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service '' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
    at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
    at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
    at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
    at org.apache.manifoldcf.agents.system.AgentsDaemon.checkAgents(AgentsDaemon.java:270)
    at org.apache.manifoldcf.agents.system.AgentsDaemon$AgentsThread.run(AgentsDaemon.java:208)

I created another job, but that got stuck too. Did the DB get corrupted by the rebuild? Also, is this the right way to build MCF? (I hope it is.)
Now what should I do to fix this issue? Please help.

Regards,
Lalit Jangra.
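
The cleanup sequence Karl describes in the thread (shut everything down, run the lock-clean script, start the processes again) could be sketched as a small shell script. This is only a sketch under assumptions: the thread names "the lock-clean script" without giving a file name, so lock-clean.sh is assumed here; MCF_HOME, DRY_RUN, run and clean_locks are hypothetical names introduced for illustration. It defaults to a dry run that only prints what it would execute.

```shell
#!/bin/sh
# Sketch of the dangling-lock cleanup order described in the thread.
# Assumptions (not confirmed by the thread):
#   - MCF_HOME (hypothetical variable) points at the multiprocess example directory
#   - the cleanup script is named lock-clean.sh
set -eu

MCF_HOME="${MCF_HOME:-.}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

clean_locks() {
  # Step 1 is a manual reminder: how Tomcat and the agents are stopped
  # depends on the installation.
  echo "step 1: stop ALL ManifoldCF processes and web apps (^C or plain kill, never kill -9)"
  # Step 2: clear the dangling locks left in the synchronization directory.
  run "$MCF_HOME/lock-clean.sh"
  # Step 3: start the agents process again (script name taken from the thread).
  run "$MCF_HOME/start-agents.sh"
}

clean_locks
```

Because of set -e, a failing cleanup step stops the script before the agents are restarted. Set DRY_RUN=0 only after confirming that no agents process or web app is still running.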