lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mark Miller (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica
Date Fri, 15 Nov 2013 19:09:23 GMT

    [ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823959#comment-13823959
] 

Mark Miller commented on SOLR-4260:
-----------------------------------

bq. My problem is that I don't know how deal with the situation. I'm using a non java client
and I don't know how I can do debug or the tools that I can use to give some information to
help debug this issue.

I'm hoping to put together a guide on debugging some of the tests for this sort of thing soon.
Perhaps some of that will also be useful for debugging a live installation. I sure could use
the help - I have a lot on my plate at the moment. 

> Inconsistent numDocs between leader and replica
> -----------------------------------------------
>
>                 Key: SOLR-4260
>                 URL: https://issues.apache.org/jira/browse/SOLR-4260
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 5.0
>         Environment: 5.0.0.2013.01.04.15.31.51
>            Reporter: Markus Jelsma
>            Priority: Critical
>             Fix For: 5.0
>
>         Attachments: 192.168.20.102-replica1.png, 192.168.20.104-replica2.png, clusterstate.png
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using CloudSolrServer
we see inconsistencies between the leader and replica for some shards.
> Each core hold about 3.3k documents. For some reason 5 out of 10 shards have a small
deviation in then number of documents. The leader and slave deviate for roughly 10-20 documents,
not more.
> Results hopping ranks in the result set for identical queries got my attention, there
were small IDF differences for exactly the same record causing a record to shift positions
in the result set. During those tests no records were indexed. Consecutive catch all queries
also return different number of numDocs.
> We're running a 10 node test cluster with 10 shards and a replication factor of two and
frequently reindex using a fresh build from trunk. I've not seen this issue for quite some
time until a few days ago.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message