lucene-solr-user mailing list archives

From kaustubh147 <kaustubh.j...@gmail.com>
Subject unable to load core after cluster restart
Date Fri, 01 Nov 2013 03:18:13 GMT
Hi, 

Glassfish 3.1.2.2 
Solr 4.5 
Zookeeper 3.4.5 

We have set up a SolrCloud with 4 Solr nodes and 3 ZooKeeper instances.

I start the cluster for the first time with bootstrap_conf=true. All the
nodes start properly. I then create cores (with the same name) on all 4
instances. I can add multiple cores on each instance; logically I
have 5 collections.

I then create the indexes, and Solr automatically creates 4 copies of each
index, one per instance, in the appropriate SolrHome directory. Everything
works properly until I restart the Solr cluster.

As soon as I restart the cluster, it throws the error below and
none of the collections work properly.


ERROR - 2013-10-31 19:23:24.411; org.apache.solr.core.CoreContainer; Unable to create core: xyz
org.apache.solr.common.SolrException: Error opening new searcher
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:834)
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:625)
	at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:256)
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:557)
	at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
	at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:241)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
	at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1477)
	at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1589)
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
	... 13 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/mnt/emc/app_name/data-refresh/SolrCloud/SolrHome1/solr/xyz/data/index/write.lock
	at org.apache.lucene.store.Lock.obtain(Lock.java:84)
	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:673)
	at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
	at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
	at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:267)
	at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:110)
	at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1440)
	... 15 more
ERROR - 2013-10-31 19:23:24.420; org.apache.solr.common.SolrException; null:org.apache.solr.common.SolrException: Unable to create core: xyz
	at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:936)
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:568)
	at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
	at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:241)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:834)
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:625)
	at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:256)
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:557)
	... 10 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
	at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1477)
	at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1589)
	at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
	... 13 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/mnt/emc/app_name/data-refresh/SolrCloud/SolrHome1/solr/xyz/data/index/write.lock
	at org.apache.lucene.store.Lock.obtain(Lock.java:84)
	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:673)
	at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
	at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
	at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:267)
	at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:110)
	at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1440)
	... 15 more

INFO  - 2013-10-31 19:23:24.421; org.apache.solr.servlet.SolrDispatchFilter; user.dir=/usr/wbol/glassfish3/glassfish/nodes/localhost-domain1/SolrCloud_01/config
INFO  - 2013-10-31 19:23:24.421; org.apache.solr.servlet.SolrDispatchFilter; SolrDispatchFilter.init() done
ERROR - 2013-10-31 19:23:24.556; org.apache.solr.update.SolrIndexWriter; SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
ERROR - 2013-10-31 19:23:24.558; org.apache.solr.update.SolrIndexWriter; Error closing IndexWriter, trying rollback
java.lang.NullPointerException
	at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:962)
	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:923)
	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:885)
	at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:132)
	at org.apache.solr.update.SolrIndexWriter.finalize(SolrIndexWriter.java:185)
	at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
	at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:83)
	at java.lang.ref.Finalizer.access$100(Finalizer.java:14)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:160)
WARN  - 2013-10-31 19:23:24.912; org.apache.solr.cloud.LeaderElector; Failed setting watch
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/xyz/leader_elect/shard1/election/234764442573733967-core_node2-n_0000000005
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
	at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:252)
	at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:249)
	at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
	at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:249)
	at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:117)
	at org.apache.solr.cloud.LeaderElector.access$000(LeaderElector.java:55)
	at org.apache.solr.cloud.LeaderElector$1.process(LeaderElector.java:129)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)


-------------------------------------------

It continuously tries to recover but never succeeds. It also deletes
collection xyz from ZooKeeper.
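
The innermost cause in the log above is the LockObtainFailedException on
write.lock. With several cores and nodes writing logs, it can help to list
every index lock that timed out. The following is only a minimal sketch in
Python, not part of the setup described in this message: the function name
find_locked_paths is hypothetical, and it only assumes that the exception
message has the same shape as in the trace above. The pattern allows for the
line wrapping that mail clients introduce.

```python
import re

# Matches the lock type and lock path in a Lucene LockObtainFailedException
# message. \s+ and \s* tolerate line wrapping inside the message.
LOCK_RE = re.compile(
    r"LockObtainFailedException:\s*Lock\s+obtain\s+timed\s+out:\s*"
    r"(\w+)@(\S+)"
)


def find_locked_paths(log_text):
    """Return a sorted list of unique (lock_type, path) pairs in log_text."""
    return sorted(set(LOCK_RE.findall(log_text)))
```

Calling find_locked_paths() on the text of a Solr log file returns one entry
per distinct lock path, however many times the trace is repeated.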

Some points to mention:


1. I have removed dataDir from solrconfig.xml as suggested by Shaun here:

http://lucene.472066.n3.nabble.com/Solr-4-3-0-Shard-instances-using-incorrect-data-directory-on-machine-boot-td4063799.html

2. I have provided absolute dataDir path in the core.properties file -
https://issues.apache.org/jira/browse/SOLR-4878

3. The instanceDir in each SolrHome has the same name for every
core/collection, for example:

SolrHome1/solr/xyz/conf
SolrHome1/solr/xyz/data
SolrHome1/solr/xyz/core.properties
SolrHome1/solr/pqr/conf
SolrHome1/solr/pqr/data
SolrHome1/solr/pqr/core.properties


SolrHome2/solr/xyz/conf
SolrHome2/solr/xyz/data
SolrHome2/solr/xyz/core.properties
SolrHome2/solr/pqr/conf
SolrHome2/solr/pqr/data
SolrHome2/solr/pqr/core.properties

...

4. The 4 SolrHome directories, one per instance, are on a single shared
drive but in different directories.

5. All my collections and cores share the same solrconfig.xml.
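
Because every SolrHome above sits on one shared drive and each core.properties
carries its own dataDir, one thing worth ruling out is two cores resolving to
the same data directory. The following is a minimal sketch in Python, not part
of the setup described in this message. The function names are hypothetical,
the glob pattern simply mirrors the SolrHomeN/solr/<core>/core.properties
layout listed above, and the fallback to <instanceDir>/data when dataDir is
absent is an assumption about Solr's default.

```python
from collections import defaultdict
from pathlib import Path


def read_properties(path):
    """Parse a simple key=value properties file, skipping comments."""
    props = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def find_shared_data_dirs(solr_homes):
    """Map each resolved dataDir to the core.properties files that use it,
    keeping only directories claimed by more than one core."""
    users = defaultdict(list)
    for home in solr_homes:
        for prop_file in sorted(Path(home).glob("solr/*/core.properties")):
            data_dir = read_properties(prop_file).get("dataDir")
            if data_dir is None:
                # Assumption: with no dataDir set, the core uses
                # <instanceDir>/data.
                data_dir = prop_file.parent / "data"
            else:
                data_dir = Path(data_dir)
                if not data_dir.is_absolute():
                    # Relative values are taken relative to the instanceDir.
                    data_dir = prop_file.parent / data_dir
            users[str(data_dir.resolve())].append(str(prop_file))
    return {d: files for d, files in users.items() if len(files) > 1}
```

Passing the four SolrHome directories to find_shared_data_dirs() returns an
empty dict when every core has its own data directory. Any entry it does
return names the core.properties files that point at the same place.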


I have been stuck on this problem for a long time.
Please help.

Thanks,
Kaustubh








--
View this message in context: http://lucene.472066.n3.nabble.com/unable-to-load-core-after-cluster-restart-tp4098731.html
Sent from the Solr - User mailing list archive at Nabble.com.
