Date: Wed, 9 Jul 2014 05:17:42 +0100
From: lalit jangra
To: Karl Wright
Cc: "user@manifoldcf.apache.org"
Subject: Re: Apache ManifoldCF job stuck up
Reply-To: user@manifoldcf.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a1133c20a51825f04fdbafd45

--001a1133c20a51825f04fdbafd45
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Thanks Karl,

I am currently running only a single agent process on a single machine, without clustering. I have two environments, and I see this issue in both.

When I try to start the agent, it prints the error below and exits, even though no agent process is already running.

[root@server1 multiprocess-file-example]# ./start-agents.sh &
[1] 5020
[root@server1 multiprocess-file-example]# Running...
Configuration file successfully read
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service 'A' of type 'AGENT' is already active
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
        at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
        at org.apache.manifoldcf.agents.AgentRun.doExecute(AgentRun.java:54)
        at org.apache.manifoldcf.agents.BaseAgentsInitializationCommand.execute(BaseAgentsInitializationCommand.java:37)
        at org.apache.manifoldcf.agents.AgentRun.main(AgentRun.java:93)

[1]+  Exit 1                  ./start-agents.sh



Even when I do manage to start the agent with ./start-agents.sh, it still throws the same error, although I have no other process running.

Regards.


On Tue, Jul 8, 2014 at 2:38 PM, Karl Wright <daddywri@gmail.com> wrote:
Hi lalit,

This occurs when you have more than one agents process with the same
process id using the same shared file system directory / zookeeper
cluster. There is no other way it can occur.
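
One way to picture the check behind this error is a marker file per service in the shared synchronization directory. The sketch below is a simplified model written for illustration only; it is not ManifoldCF's lock manager code, and `register_service`, `ServiceAlreadyActive`, and the `.active` file naming are all hypothetical.

```python
import os
import tempfile


class ServiceAlreadyActive(Exception):
    """Raised when a service name is registered twice in one sync directory."""


def register_service(sync_dir, service_type, service_name):
    """Mark a service as active by atomically creating a marker file.

    Hypothetical model: one marker file per (type, name) pair.
    """
    marker = os.path.join(sync_dir, f"{service_type}-{service_name}.active")
    try:
        # O_EXCL makes the creation fail if the marker already exists.
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise ServiceAlreadyActive(
            f"Service '{service_name}' of type '{service_type}' is already active"
        ) from None
    os.close(fd)
    return marker


def demo():
    """Return the outcome of three further registration attempts."""
    results = []
    with tempfile.TemporaryDirectory() as shared, \
            tempfile.TemporaryDirectory() as other:
        register_service(shared, "AGENT", "A")  # the first agents process
        for sync_dir, name in [(shared, "A"), (shared, "B"), (other, "A")]:
            try:
                register_service(sync_dir, "AGENT", name)
                results.append("ok")
            except ServiceAlreadyActive:
                results.append("already active")
    return results


if __name__ == "__main__":
    print(demo())
```

In this model only the same process id in the same directory collides; a different id, or a separate synchronization directory, registers without error.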

Thanks,
Karl

Sent from my Windows Phone

-----Original Message-----
From: lalit jangra
Sent: 7/8/2014 8:38 AM
To: user@manifoldcf.apache.org
Subject: Re: Apache ManifoldCF job stuck up



Thanks Karl,


I tried the steps you suggested, and they worked on one instance.

On another instance, however, I still cannot resolve the issue.
Along with the steps you mentioned, I tried recreating the DB instance,
setting up a new MCF instance, cleaning the locks, and then starting the
agents first and then Tomcat. The issue still persists.

If I run ./start-agents.sh, I get this error for agent A:


ERROR 2014-07-08 13:32:19,823 (Agents thread) - Exception tossed: Service 'A' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service 'A' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
        at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
        at org.apache.manifoldcf.agents.system.AgentsDaemon.checkAgents(AgentsDaemon.java:270)




But if I run ./start-agents-2.sh, I see a similar error for agent B:




ERROR 2014-07-08 13:32:19,823 (Agents thread) - Exception tossed: Service 'B' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service 'B' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
        at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
        at org.apache.manifoldcf.agents.system.AgentsDaemon.checkAgents(AgentsDaemon.java:270)




Regards.





On Mon, Jul 7, 2014 at 4:55 PM, Karl Wright <daddywri@gmail.com> wrote:




Hi Lalit,


If you are using file synchronization, you cannot expect MCF to clean
up after itself unless you shut it down cleanly. You should be using either
^C or plain kill, NEVER kill -9. kill -9 will leave dangling locks.
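
The difference between plain kill and kill -9 can be shown with a small, generic POSIX experiment. This is only a sketch under stated assumptions: the child process and its `agent-A.lock` file are hypothetical stand-ins for a process holding a file-based lock, not ManifoldCF code. The child removes its lock file when it receives SIGTERM (plain kill) or SIGINT (^C); SIGKILL (kill -9) cannot be caught, so no cleanup code gets a chance to run.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Source of a hypothetical child process that holds a file-based lock.
CHILD = r'''
import os, signal, sys, time

lock = sys.argv[1]
open(lock, "w").close()            # mark this process as active

def shutdown(signum, frame):
    os.remove(lock)                # a clean shutdown releases the lock
    sys.exit(0)

signal.signal(signal.SIGTERM, shutdown)   # plain kill
signal.signal(signal.SIGINT, shutdown)    # ^C
print("ready", flush=True)
while True:
    time.sleep(0.1)
'''


def lock_survives(sig):
    """Start the child, send it `sig`, and report whether its lock file remains."""
    with tempfile.TemporaryDirectory() as tmp:
        lock = os.path.join(tmp, "agent-A.lock")
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD, lock],
            stdout=subprocess.PIPE,
            text=True,
        )
        child.stdout.readline()    # wait until the lock exists and handlers are set
        child.send_signal(sig)
        child.wait()
        child.stdout.close()
        return os.path.exists(lock)


if __name__ == "__main__":
    print("after SIGTERM, lock left behind:", lock_survives(signal.SIGTERM))
    print("after SIGKILL, lock left behind:", lock_survives(signal.SIGKILL))
```

After SIGTERM the lock file is gone; after SIGKILL it is still there, which is the dangling-lock situation described above.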


To clean up dangling locks:



- shut ALL manifoldcf processes and web apps down


- run the lock-clean script


- start up the processes again



Zookeeper synchronization, by the way, does not have this kind of problem.
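
Before running the lock-clean script, it may help to confirm that the first step above really holds, that is, that no ManifoldCF process is still alive. The helper below is a rough sketch, not part of ManifoldCF: the function names are hypothetical, and it assumes a `ps` that accepts `-eo pid,args`. It only matches command lines that mention "manifoldcf", so web apps running inside Tomcat have to be checked separately.

```python
import subprocess


def find_manifoldcf_processes(ps_output):
    """Return (pid, command) pairs whose command line mentions manifoldcf."""
    matches = []
    for line in ps_output.splitlines()[1:]:      # skip the header row
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and "manifoldcf" in parts[1].lower():
            matches.append((int(parts[0]), parts[1]))
    return matches


def running_manifoldcf_processes():
    """Ask ps for every process and keep the ManifoldCF ones."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_manifoldcf_processes(result.stdout)


if __name__ == "__main__":
    leftovers = running_manifoldcf_processes()
    if leftovers:
        for pid, command in leftovers:
            print(f"still running: {pid} {command}")
    else:
        print("no ManifoldCF processes found")
```

If anything is listed, stop it with plain kill rather than kill -9 before cleaning the locks.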

Thanks,
Karl








On Mon, Jul 7, 2014 at 11:49 AM, lalit jangra <lalit.j.jangra@gmail.com> wrote:




Hi,


I configured MCF 1.5.1 to run with a PostgreSQL DB and Tomcat 7.
Initially I created all the connections and an Alfresco job, and it
all worked fine.


Next, to make updates, I stopped Tomcat and the running agent process. Then I
updated CmisRepositoryConnector.java with my own code and ran "ant
build" at the root of MCF. It updated all the code and JAR files.



Also, properties.xml under /dist/multiprocess-file/example was reset,
so I updated it again to connect to the PostgreSQL DB and to set the
logging configuration.



I started Tomcat and then the agent process. Finally, I started the job to crawl Alfresco, but it got stuck and is not moving on. I checked the
/dist/multiprocess-file/example/logs/manifoldcf.log file and saw the
error below.


ERROR 2014-07-07 16:09:04,936 (Agents thread) - Exception tossed: Service '' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service '' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
        at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
        at org.apache.manifoldcf.agents.system.AgentsDaemon.checkAgents(AgentsDaemon.java:270)
        at org.apache.manifoldcf.agents.system.AgentsDaemon$AgentsThread.run(AgentsDaemon.java:208)









I created another job, but that got stuck too. Did the DB get corrupted by the rebuild?




Also, is this the right way to build MCF? (I hope it is.) What
should I do now to fix this issue?





Please help.

Regards,
Lalit Jangra.




--
Regards,
Lalit Jangra.



--
Regards,
Lalit Jangra.
--001a1133c20a51825f04fdbafd45--