lucene-solr-user mailing list archives

From Alok Dhir <ad...@symplicity.com>
Subject Re: SOLR Performance
Date Mon, 03 Nov 2008 21:58:24 GMT
I was afraid of that.  Was hoping not to need another big fat box like  
this one...

---
Alok K. Dhir
Symplicity Corporation
www.symplicity.com
(703) 351-0200 x 8080
adhir@symplicity.com

On Nov 3, 2008, at 4:53 PM, Feak, Todd wrote:

> I believe this is one of the reasons that a master/slave configuration
> comes in handy. Commits to the Master don't slow down queries on the
> Slave.
>
> -Todd
>
> -----Original Message-----
> From: Alok Dhir [mailto:adhir@symplicity.com]
> Sent: Monday, November 03, 2008 1:47 PM
> To: solr-user@lucene.apache.org
> Subject: SOLR Performance
>
> We've moved past this issue by reducing date precision -- thanks to
> all for the help.  Now we're at another problem.
>
> There is relatively constant updating of the index -- new log entries
> are pumped in from several applications continuously.  Obviously, new
> entries do not appear in searches until after a commit occurs.
>
> The problem is, issuing a commit causes searches to come to a
> screeching halt for up to 2 minutes.  We're up to around 80M docs.
> Index size is 27G.  The number of docs will soon be 800M, which
> doesn't bode well for these "pauses" in search performance.
>
> I'd appreciate any suggestions.
>
> ---
> Alok K. Dhir
> Symplicity Corporation
> www.symplicity.com
> (703) 351-0200 x 8080
> adhir@symplicity.com
>
> On Oct 29, 2008, at 4:30 PM, Alok Dhir wrote:
>
>> Hi -- using solr 1.3 -- roughly 11M docs on a 64 gig 8 core machine.
>>
>> Fairly simple schema -- no large text fields, standard request
>> handler.  4 small facet fields.
>>
>> The index is an event log -- a primary search/retrieval requirement
>> is date range queries.
>>
>> A simple query without a date range subquery is ridiculously fast -
>> 2ms.  The same query with a date range takes up to 30s (30,000ms).
>>
>> Concrete example, this query just took 18s:
>>
>> 	instance:client\-csm.symplicity.com AND dt:[2008-10-01T04:00:00Z TO
>> 2008-10-30T03:59:59Z] AND label_facet:"Added to Position"
>>
>> The exact same query without the date range took 2ms.
>>
>> I saw a thread from Apr 2008 which explains that the problem is due to
>> too much precision on the DateField type, with the range expansion
>> leading to far too many elements being checked.  The proposed solution
>> appears to be a hack where you index date fields as strings and
>> hack together date functions to generate proper queries and format
>> results.
>>
>> Does this remain the recommended solution to this issue?
>>
>> Thanks
>>
>> ---
>> Alok K. Dhir
>> Symplicity Corporation
>> www.symplicity.com
>> (703) 351-0200 x 8080
>> adhir@symplicity.com
>>
>
>
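
For readers following along: the "reducing date precision" fix mentioned
above can be sketched roughly as below. The thread does not say which
granularity was chosen or how it was done, so hour granularity and the
helper names (truncate_to_hour, solr_date, date_range_clause) are
assumptions made purely for illustration. The idea is that truncating
timestamps before indexing leaves far fewer distinct values in the date
field, so a range query has fewer elements to check. The same truncation
is applied to the query bounds so they line up with the indexed values.

```python
from datetime import datetime, timezone


def truncate_to_hour(ts: datetime) -> datetime:
    """Drop minutes, seconds and microseconds (hypothetical granularity).

    Expects a timezone-aware datetime; the result is in UTC.
    """
    return ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)


def solr_date(ts: datetime) -> str:
    """Format a timezone-aware datetime in the UTC form used in the thread."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def date_range_clause(field: str, start: datetime, end: datetime) -> str:
    """Build a range clause whose bounds are truncated like the indexed values."""
    lo = solr_date(truncate_to_hour(start))
    hi = solr_date(truncate_to_hour(end))
    return f"{field}:[{lo} TO {hi}]"


if __name__ == "__main__":
    start = datetime(2008, 10, 1, 4, 0, 0, tzinfo=timezone.utc)
    end = datetime(2008, 10, 30, 3, 59, 59, tzinfo=timezone.utc)
    print(date_range_clause("dt", start, end))
```

Because the range is inclusive and every indexed value sits on an hour
boundary, truncating the upper bound still matches all documents whose
original timestamp fell within that final hour. The trade-off is that
results can no longer be filtered or sorted at finer than hour precision
unless the full timestamp is also kept in a separate field.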

