Message-ID: <543ADDA4.6050201@elyograg.org>
Date: Sun, 12 Oct 2014 13:59:32 -0600
From: Shawn Heisey
To: solr-user@lucene.apache.org
Subject: Re: Mismatch in numFound in q=*:* query
In-Reply-To: <1413138365776-4163911.post@n3.nabble.com>

On 10/12/2014 12:26 PM, vidit.asthana wrote:
> I have a strange problem where select q=*:* is returning different number of
> documents. Sometime its returning numFound = 5866712 and sometimes it
> returns numFound = 5852274. *numFound is always one of these 2 values.*
>
> Here is the query:
>
> *http://localhost:5011/solr/mycollection/select?q=*:*&rows=0*
>
>
> I am running Solr in cloud mode and this problem is occurring with both
> solr-4.5.1 and solr-4.10.0. I have exactly same data indexed in both
> versions. 4.5.1 is running on a 8 nodes cluster (4x2 shards) and solr-4.10.0
> is running on a 4 node (2x2 shards)cluster.

I really need to make a wiki page for this. It would save so much typing!
I also need to boil it down to a small-scale real-world example that shows
how the numbers get calculated and what goes wrong. That requires a complete
understanding of the problem, and at this moment, I don't have that.

This is a problem that's unique to distributed indexes. It is caused by
documents with the same value in the uniqueKey field being indexed in more
than one shard. It is not a bug; it's a result of the way that results from
multiple shards are combined into one result. The only way to "fix" this
problem would involve so much additional processing that it would make all
queries extremely slow.

If you're using automatic document routing, then your routing algorithm may
have changed at some point, and you didn't re-index. If you're using manual
document routing, then some documents were indexed on the wrong shard, and
later indexed on another shard as well.

Preventing the problem is easy -- always index documents onto the correct
shard. Fixing the problem at this point might involve clearing your index
and re-indexing from scratch, unless you can figure out which documents have
been indexed on more than one shard and delete them from the incorrect
shard(s).

Thanks,
Shawn
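
A rough sketch of that last step (finding documents that live on more than one shard), not a definitive procedure. It assumes you have already exported the uniqueKey values from each shard separately, for example by querying each core directly with distrib=false. The function name, the shard names, and the sample ids are all hypothetical.

```python
from collections import defaultdict


def find_cross_shard_duplicates(ids_by_shard):
    """Map each uniqueKey value found on more than one shard to its shards.

    ids_by_shard: dict of shard name -> iterable of uniqueKey values
    exported from that shard (hypothetical input format).
    """
    shards_by_id = defaultdict(set)
    for shard, ids in ids_by_shard.items():
        for doc_id in ids:
            shards_by_id[doc_id].add(shard)
    # Keep only the keys that appear on two or more shards.
    return {
        doc_id: sorted(shards)
        for doc_id, shards in shards_by_id.items()
        if len(shards) > 1
    }


# Made-up sample data: "c" has been indexed on both shards.
ids_by_shard = {
    "shard1": ["a", "b", "c"],
    "shard2": ["c", "d"],
}
duplicates = find_cross_shard_duplicates(ids_by_shard)
print(duplicates)
```

Each key in the result is a document to delete from whichever shard is the wrong one for it. Working out which shard is the correct one depends on your routing setup and is not covered here.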