From: Alok Dhir
To: solr-user@lucene.apache.org
Subject: SOLR Performance
Date: Mon, 3 Nov 2008 16:47:27 -0500

We've moved past this issue by reducing date precision -- thanks to all for the help. Now we're at another problem.

There is relatively constant updating of the index -- new log entries are pumped in from several applications continuously. Obviously, new entries do not appear in searches until after a commit occurs.

The problem is, issuing a commit causes searches to come to a screeching halt for up to 2 minutes. We're up to around 80M docs. Index size is 27G. The number of docs will soon be 800M, which doesn't bode well for these "pauses" in search performance.

I'd appreciate any suggestions.

---
Alok K. Dhir
Symplicity Corporation
www.symplicity.com
(703) 351-0200 x 8080
adhir@symplicity.com

On Oct 29, 2008, at 4:30 PM, Alok Dhir wrote:

> Hi -- using solr 1.3 -- roughly 11M docs on a 64 gig 8 core machine.
>
> Fairly simple schema -- no large text fields, standard request
> handler. 4 small facet fields.
>
> The index is an event log -- a primary search/retrieval requirement
> is date range queries.
>
> A simple query without a date range subquery is ridiculously fast -
> 2ms. The same query with a date range takes up to 30s (30,000ms).
>
> Concrete example, this query just took 18s:
>
> instance:client\-csm.symplicity.com AND dt:[2008-10-01T04:00:00Z TO
> 2008-10-30T03:59:59Z] AND label_facet:"Added to Position"
>
> The exact same query without the date range took 2ms.
>
> I saw a thread from Apr 2008 which explains the problem as being due
> to too much precision on the DateField type, with the range expansion
> leading to far too many elements being checked. The proposed solution
> appears to be a hack where you index date fields as strings and
> hack together date functions to generate proper queries/format
> results.
>
> Does this remain the recommended solution to this issue?
>
> Thanks
>
> ---
> Alok K. Dhir
> Symplicity Corporation
> www.symplicity.com
> (703) 351-0200 x 8080
> adhir@symplicity.com
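
For anyone reading this thread later, here is a minimal sketch of the "reduce date precision" approach mentioned at the top of this message. It assumes hour granularity is acceptable for the dt field; the thread does not say which granularity was actually chosen. The function name truncate_to_hour is hypothetical and not part of Solr. It only rounds a timestamp down on the client side before the document is sent for indexing, so the date field ends up with far fewer distinct values for a range query to expand.

```python
from datetime import datetime, timedelta, timezone


def truncate_to_hour(ts: datetime) -> str:
    """Round a timestamp down to the hour and format it as a UTC
    string of the form used in the queries above (YYYY-MM-DDThh:mm:ssZ).

    Assumption: hour precision is coarse enough; swap in day or minute
    precision by changing the fields reset in replace().
    """
    # Treat naive datetimes as already being in UTC (an assumption).
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    utc = ts.astimezone(timezone.utc)
    coarse = utc.replace(minute=0, second=0, microsecond=0)
    return coarse.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # The send time of this message: 16:47:27 at UTC-5.
    est = timezone(timedelta(hours=-5))
    print(truncate_to_hour(datetime(2008, 11, 3, 16, 47, 27, tzinfo=est)))
    # prints 2008-11-03T21:00:00Z
```

The bounds of a range query would need the same rounding, otherwise documents at the edges of the range may be missed or unexpectedly included.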