lucene-solr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly
Date Sat, 03 Oct 2009 17:13:23 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12761916#action_12761916
] 

Yonik Seeley commented on SOLR-1458:
------------------------------------

Seems like if you want to replicate only optimized indexes, we should recommend the user configure
SolrDeletionPolicy to always keep an optimized commit point around.
If you rely on the replication handler to reserve it, it won't work across a reboot, right?

Although I see no reason for custom delete policies, I'm not really against adding support
for that in the replication handler as long as people are confident the changes don't introduce
any new bugs.
Regardless, I think the separate count for optimized commit points that I added to SolrDeletionPolicy
should remain (esp since it fixed other bugs too).

> Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly
> ------------------------------------------------------------------------------------------
>
>                 Key: SOLR-1458
>                 URL: https://issues.apache.org/jira/browse/SOLR-1458
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (java)
>    Affects Versions: 1.4
>         Environment: CentOS x64
> 8GB RAM
> Tomcat, running with 7G max memory; memory usage is <2GB, so it's not the problem
> Host a: master
> Host b: slave
> Multiple single core Solr instances, using JNDI.
> Java replication
>            Reporter: Artem Russakovskii
>            Assignee: Yonik Seeley
>             Fix For: 1.4
>
>         Attachments: SOLR-1458.patch, SOLR-1458.patch, SOLR-1458.patch, SOLR-1458.patch,
SOLR-1458.patch, SolrDeletionPolicy.patch, SolrDeletionPolicy.patch
>
>
> After finally figuring out the new Java based replication, we have started both the slave
and the master and issued optimize to all master Solr instances. This triggered some replication
to go through just fine, but it looks like some of it is failing.
> Here's what I'm getting in the slave logs, repeatedly for each shard:
> {code} 
> SEVERE: SnapPull failed 
> java.lang.NullPointerException
>         at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:271)
>         at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:258)
>         at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> {code} 
> If I issue an optimize again on the master to one of the shards, it then triggers a replication
and replicates OK. I have a feeling that these SnapPull failures appear later on but right
now I don't have enough to form a pattern.
> Here's replication.properties on one of the failed slave instances.
> {code}
> cat data/replication.properties 
> #Replication details
> #Wed Sep 23 19:35:30 PDT 2009
> replicationFailedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
> previousCycleTimeInSeconds=0
> timesFailed=113
> indexReplicatedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
> indexReplicatedAt=1253759730020
> replicationFailedAt=1253759730020
> lastCycleBytesDownloaded=0
> timesIndexReplicated=113
> {code}
> and another
> {code}
> cat data/replication.properties 
> #Replication details
> #Wed Sep 23 18:42:01 PDT 2009
> replicationFailedAtList=1253756490034,1253756460169
> previousCycleTimeInSeconds=1
> timesFailed=2
> indexReplicatedAtList=1253756521284,1253756490034,1253756460169
> indexReplicatedAt=1253756521284
> replicationFailedAt=1253756490034
> lastCycleBytesDownloaded=22932293
> timesIndexReplicated=3
> {code}
> Some relevant configs:
> In solrconfig.xml:
> {code}
> <!-- For docs see http://wiki.apache.org/solr/SolrReplication -->
>   <requestHandler name="/replication" class="solr.ReplicationHandler" >
>     <lst name="master">
>         <str name="enable">${enable.master:false}</str>
>         <str name="replicateAfter">optimize</str>
>         <str name="backupAfter">optimize</str>
>         <str name="commitReserveDuration">00:00:20</str>
>     </lst>
>     <lst name="slave">
>         <str name="enable">${enable.slave:false}</str>
>         <!-- url of master, from properties file -->
>         <str name="masterUrl">${master.url}</str>
>         <!-- how often to check master -->
>         <str name="pollInterval">00:00:30</str>
>     </lst>
>   </requestHandler>
> {code}
> The slave then has this in solrcore.properties:
> {code}
> enable.slave=true
> master.url=URLOFMASTER/replication
> {code}
> and the master has
> {code}
> enable.master=true
> {code}
> I'd be glad to provide more details but I'm not sure what else I can do.  SOLR-926 may
be relevant.
> Thanks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message