From: Alok Dhir
To: solr-user@lucene.apache.org
Subject: Re: SOLR Performance
Date: Mon, 3 Nov 2008 16:58:24 -0500

I was afraid of that.
Was hoping not to need another big fat box like this one...

---
Alok K. Dhir
Symplicity Corporation
www.symplicity.com
(703) 351-0200 x 8080
adhir@symplicity.com

On Nov 3, 2008, at 4:53 PM, Feak, Todd wrote:

> I believe this is one of the reasons that a master/slave configuration
> comes in handy. Commits to the Master don't slow down queries on the
> Slave.
>
> -Todd
>
> -----Original Message-----
> From: Alok Dhir [mailto:adhir@symplicity.com]
> Sent: Monday, November 03, 2008 1:47 PM
> To: solr-user@lucene.apache.org
> Subject: SOLR Performance
>
> We've moved past this issue by reducing date precision -- thanks to
> all for the help. Now we're at another problem.
>
> There is relatively constant updating of the index -- new log entries
> are pumped in from several applications continuously. Obviously, new
> entries do not appear in searches until after a commit occurs.
>
> The problem is, issuing a commit causes searches to come to a
> screeching halt for up to 2 minutes. We're up to around 80M docs.
> Index size is 27G. The number of docs will soon be 800M, which
> doesn't bode well for these "pauses" in search performance.
>
> I'd appreciate any suggestions.
>
> ---
> Alok K. Dhir
> Symplicity Corporation
> www.symplicity.com
> (703) 351-0200 x 8080
> adhir@symplicity.com
>
> On Oct 29, 2008, at 4:30 PM, Alok Dhir wrote:
>
>> Hi -- using solr 1.3 -- roughly 11M docs on a 64 gig 8 core machine.
>>
>> Fairly simple schema -- no large text fields, standard request
>> handler. 4 small facet fields.
>>
>> The index is an event log -- a primary search/retrieval requirement
>> is date range queries.
>>
>> A simple query without a date range subquery is ridiculously fast -
>> 2ms. The same query with a date range takes up to 30s (30,000ms).
>>
>> Concrete example, this query just took 18s:
>>
>> instance:client\-csm.symplicity.com AND dt:[2008-10-01T04:00:00Z TO
>> 2008-10-30T03:59:59Z] AND label_facet:"Added to Position"
>>
>> The exact same query without the date range took 2ms.
>>
>> I saw a thread from Apr 2008 which explains the problem being due to
>> too much precision on the DateField type, and the range expansion
>> leading to far too many elements being checked. Proposed solution
>> appears to be a hack where you index date fields as strings and
>> hacking together date functions to generate proper queries/format
>> results.
>>
>> Does this remain the recommended solution to this issue?
>>
>> Thanks
>>
>> ---
>> Alok K. Dhir
>> Symplicity Corporation
>> www.symplicity.com
>> (703) 351-0200 x 8080
>> adhir@symplicity.com
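
For reference, the "reducing date precision" fix mentioned in this thread could look roughly like the sketch below. The thread does not say which granularity was chosen or how it was implemented, so the hour granularity and the helper names (truncate_to_hour, solr_date, date_range_clause) are assumptions for illustration only. The idea is to truncate timestamps the same way at index time and at query time, so a range covers far fewer distinct values.

```python
from datetime import datetime, timezone


def truncate_to_hour(ts: datetime) -> datetime:
    """Drop minutes, seconds, and microseconds (assumed granularity: one hour).

    Expects a timezone-aware datetime; the result is in UTC.
    """
    return ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)


def solr_date(ts: datetime) -> str:
    """Format a timezone-aware datetime in the UTC form used in the thread's query."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def date_range_clause(field: str, start: datetime, end: datetime) -> str:
    """Build an inclusive range clause whose bounds are truncated to the hour.

    This only gives the intended results if the indexed values were
    truncated the same way before they were sent to the index.
    """
    lo = solr_date(truncate_to_hour(start))
    hi = solr_date(truncate_to_hour(end))
    return f"{field}:[{lo} TO {hi}]"


if __name__ == "__main__":
    start = datetime(2008, 10, 1, 4, 0, 0, tzinfo=timezone.utc)
    end = datetime(2008, 10, 30, 3, 59, 59, tzinfo=timezone.utc)
    print(date_range_clause("dt", start, end))
```

Because the upper bound is inclusive and indexed values are truncated to the start of their hour, an end bound of 03:00:00Z still matches entries logged up to 03:59:59Z. The trade-off is that ranges can no longer be narrower than the chosen granularity.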