Hi Karl,

Also, when I try to stop an agents process that has been running for quite some time using ./stop-agents.sh, the process is not killed. The script prints the expected messages, but when I check afterwards the process is still alive and running, as shown below. I see this in both environments and assume it may be the cause of the "already active" agent error.

[root@10 multiprocess-file-example]# ./stop-agents.sh

Configuration file successfully read

Shutdown signal sent

[root@10 multiprocess-file-example]# ps -ef|grep java

root      8558     1  3 Jul08 pts/0    00:27:21 /app/alfresco/java/bin/java -Djava.util.logging.config.file=/app/alfresco/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -XX:MaxPermSize=512m -Xms2048m -Xmx2048m -XX:+DisableExplicitGC -Djava.awt.headless=true -Dalfresco.home=/app/alfresco -Dcom.sun.management.jmxremote -Dsun.security.ssl.allowUnsafeRenegotiation=true -XX:ReservedCodeCacheSize=128m -Djava.endorsed.dirs=/app/alfresco/tomcat/endorsed -classpath /app/alfresco/tomcat/bin/bootstrap.jar:/app/alfresco/tomcat/bin/tomcat-juli.jar -Dcatalina.base=/app/alfresco/tomcat -Dcatalina.home=/app/alfresco/tomcat -Djava.io.tmpdir=/app/alfresco/tomcat/temp org.apache.catalina.startup.Bootstrap start


root     19426 19424 82 Jul08 ?        13:13:06 /app/alfresco/java/bin/java -Xms256m -Xmx256m -Dorg.apache.manifoldcf.configfile=./properties.xml -cp .:../lib/mcf-pull-agent.jar:../lib/mcf-agents.jar:../lib/mcf-core.jar:../lib/hsqldb.jar:../lib/derbyLocale_zh_TW.jar:../lib/derbyLocale_zh_CN.jar:../lib/derbyLocale_ru.jar:../lib/derbyLocale_pt_BR.jar:../lib/derbyLocale_pl.jar:../lib/derbyLocale_ko_KR.jar:../lib/derbyLocale_ja_JP.jar:../lib/derbyLocale_it.jar:../lib/derbyLocale_hu.jar:../lib/derbyLocale_fr.jar:../lib/derbyLocale_es.jar:../lib/derbyLocale_de_DE.jar:../lib/derbyLocale_cs.jar:../lib/derbytools.jar:../lib/derbynet.jar:../lib/derby.jar:../lib/postgresql.jar:../lib/mail.jar:../lib/slf4j-simple.jar:../lib/slf4j-api.jar:../lib/velocity.jar:../lib/xml-apis.jar:../lib/xercesImpl.jar:../lib/xalan.jar:../lib/servlet-api.jar:../lib/serializer.jar:../lib/log4j.jar:../lib/commons-logging.jar:../lib/commons-lang.jar:../lib/commons-io.jar:../lib/httpclient.jar:../lib/httpcore.jar:../lib/commons-fileupload.jar:../lib/commons-el.jar:../lib/commons-collections.jar:../lib/commons-codec.jar:../lib/json-simple.jar:../lib/json.jar:../lib/zookeeper.jar: -Dorg.apache.manifoldcf.processid=B org.apache.manifoldcf.agents.AgentRun



Regards.


On Wed, Jul 9, 2014 at 5:17 AM, lalit jangra <lalit.j.jangra@gmail.com> wrote:
Thanks Karl,

I am currently running only a single agents process on a single machine, without clustering. I have two environments, and I see this issue in both.

When I try to start the agent, it reports the error below and exits, even though no agents process is already running.

[root@server1 multiprocess-file-example]# ./start-agents.sh &

[1] 5020

[root@server1 multiprocess-file-example]# Running...

Configuration file successfully read

org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service 'A' of type 'AGENT' is already active
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
        at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
        at org.apache.manifoldcf.agents.AgentRun.doExecute(AgentRun.java:54)
        at org.apache.manifoldcf.agents.BaseAgentsInitializationCommand.execute(BaseAgentsInitializationCommand.java:37)
        at org.apache.manifoldcf.agents.AgentRun.main(AgentRun.java:93)

 

[1]+  Exit 1                  ./start-agents.sh



Even when I am able to start the agent successfully using ./start-agents.sh, it still throws the same error, although I have no other process running.

Regards.


On Tue, Jul 8, 2014 at 2:38 PM, Karl Wright <daddywri@gmail.com> wrote:
Hi lalit,

This occurs when you have more than one agents process with the same
process id using the same shared file system directory / zookeeper
cluster.  There is no other way it can occur.
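
One way to check for this on a single machine is to look at the -Dorg.apache.manifoldcf.processid value of every running AgentRun JVM. The sketch below is only an illustration: the function names are made up, and it only sees processes on the local host, so it cannot detect a second agents process on another machine that shares the same directory or ZooKeeper cluster.

```shell
#!/bin/sh
# Sketch: extract the ManifoldCF process id of each agents process from
# command lines read on stdin (for example from `ps -eo args`).
# Assumption: every agents JVM is started with
# -Dorg.apache.manifoldcf.processid=<id>, as in the ps output in this thread.
mcf_process_ids() {
  grep 'org.apache.manifoldcf.agents.AgentRun' |
    sed -n 's/.*-Dorg\.apache\.manifoldcf\.processid=\([^ ]*\).*/\1/p'
}

# Print every process id that is used by more than one agents process.
mcf_duplicate_ids() {
  mcf_process_ids | sort | uniq -d
}

# Usage on a live machine (hypothetical invocation):
#   ps -eo args | mcf_duplicate_ids
```

If this prints an id, two local agents processes share it. Empty output only rules out a local duplicate, not a remote one or a stale lock.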

Thanks,
Karl


-----Original Message-----
From: lalit jangra
Sent: 7/8/2014 8:38 AM
To: user@manifoldcf.apache.org
Subject: Re: Apache ManifoldCF job stuck up



Thanks Karl,


I tried the steps you suggested, and they worked on one instance.

On another instance, however, I am still not able to resolve the issue.
Along with the steps you mentioned, I tried recreating the DB instance,
setting up a new MCF instance, cleaning the locks, and then starting the
agents first and Tomcat second. The issue still persists.

If I run ./start-agents.sh, I get this error for agent A.


ERROR 2014-07-08 13:32:19,823 (Agents thread) - Exception tossed: Service 'A' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service 'A' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
        at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
        at org.apache.manifoldcf.agents.system.AgentsDaemon.checkAgents(AgentsDaemon.java:270)




But if I run ./start-agents-2.sh, I see a similar error, this time for agent B.





ERROR 2014-07-08 13:32:19,823 (Agents thread) - Exception tossed: Service 'B' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service 'B' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
        at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
        at org.apache.manifoldcf.agents.system.AgentsDaemon.checkAgents(AgentsDaemon.java:270)




Regards.





On Mon, Jul 7, 2014 at 4:55 PM, Karl Wright <daddywri@gmail.com> wrote:




Hi Lalit,


If you are using file synchronization, you cannot expect MCF to clean
up after itself unless you shut it down cleanly.  You should use either
^C or plain kill, NEVER kill -9.  kill -9 will leave dangling locks.



To clean up dangling locks:

- shut ALL manifoldcf processes and web apps down
- run the lock-clean script
- start up the processes again
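
The steps above can be sketched as a small shell helper. This is a sketch under assumptions, not part of the MCF distribution: wait_for_exit is a hypothetical helper, AGENT_PID stands for whatever pid ps reports for the agents JVM, and lock-clean.sh is assumed to be the name of the lock-clean script in the example directory.

```shell
#!/bin/sh
# Sketch: wait until a process has really exited before cleaning locks.
# wait_for_exit PID [TIMEOUT_SECONDS] returns 0 once PID is gone and
# non-zero if it is still running after the timeout (default 60).
# Note: `kill -0` also fails when you lack permission to signal PID,
# so run this as the user that owns the process (or as root).
wait_for_exit() {
  pid=$1
  timeout=${2:-60}
  while [ "$timeout" -gt 0 ]; do
    kill -0 "$pid" 2>/dev/null || return 0   # process no longer exists
    sleep 1
    timeout=$((timeout - 1))
  done
  ! kill -0 "$pid" 2>/dev/null               # final check after timeout
}

# Possible sequence (commands are illustrative, adjust to your setup):
#   ./stop-agents.sh                  # or: kill "$AGENT_PID"  (never kill -9)
#   wait_for_exit "$AGENT_PID" 120 || echo "agents process still running" >&2
#   (shut down Tomcat and any other MCF process the same way)
#   ./lock-clean.sh                   # assumed script name
#   ./start-agents.sh
```

The point of waiting is that the lock-clean step is only safe once nothing is left that could still hold or create locks.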



Zookeeper synchronization, by the way, does not have this kind of problem.

Thanks,
Karl








On Mon, Jul 7, 2014 at 11:49 AM, lalit jangra <lalit.j.jangra@gmail.com> wrote:




Hi,


I configured MCF 1.5.1 to run with a PostgreSQL DB and Tomcat 7.
Initially I created all the connections and an Alfresco job, and it
all worked fine.


Next, to apply updates, I stopped Tomcat and the running agents process.
Then I updated CmisRepositoryConnector.java with my own code and ran "ant
build" at the root of MCF. It rebuilt all the code and jar files.



Also, properties.xml under /dist/multiprocess-file/example was reset by
the build, so I updated it to connect to the PostgreSQL DB and to set up
the logging configuration.



I started Tomcat and then the agents process. Finally I started the job
to crawl Alfresco, but it got stuck and is not moving on. I checked the
/dist/multiprocess-file/example/logs/manifoldcf.log file and found the
error below.


ERROR 2014-07-07 16:09:04,936 (Agents thread) - Exception tossed: Service '' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service '' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
        at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
        at org.apache.manifoldcf.agents.system.AgentsDaemon.checkAgents(AgentsDaemon.java:270)
        at org.apache.manifoldcf.agents.system.AgentsDaemon$AgentsThread.run(AgentsDaemon.java:208)









I created another job, but that got stuck too. Did the DB get corrupted by the rebuild?

Also, is this the right way to build MCF? (I hope it is.) What should I
do now to fix this issue?





Please help.

Regards,
Lalit Jangra.
