lucene-solr-user mailing list archives

From Jonathan Rochkind <rochk...@jhu.edu>
Subject Re: faceting over ngrams
Date Wed, 16 Mar 2011 16:23:45 GMT
Ah, wait, you're doing sharding?  Yeah, I am NOT doing sharding, so that 
could explain our different experiences.  It seems like sharding 
definitely has trade-offs, makes some things faster and other things 
slower. So far I've managed to avoid it, in the interest of keeping 
things simpler and easier to understand (for me, the developer/Solr 
manager), thinking that sharding is also a somewhat less mature feature.

With only 1M documents... are you sure you need sharding at all?  You 
could still use replication to "scale out" for query volume; sharding 
seems more about scaling for number of documents (or total bytes) in 
your index.  1M documents is not very large, for Solr, in general.

Jonathan

On 3/16/2011 11:51 AM, Toke Eskildsen wrote:
> On Wed, 2011-03-16 at 13:05 +0100, Dmitry Kan wrote:
>> Hello guys. We are using shard'ed solr 1.4 for heavy faceted search over the
>> trigrams field with about 1 million of entries in the result set and more
>> than 100 million of entries to facet on in the index. Currently the faceted
>> search is very slow, taking about 5 minutes per query.
> I tried creating an index with 1M documents, each with 100 unique terms
> in a field. A search for "*:*" with a facet request for the first 1M
> entries in the field took about 20 seconds for the first call and about
> 1-1½ seconds for each subsequent call. This was with Solr trunk. The
> complexity of my setup is no doubt a lot simpler and lighter than yours,
> but 5 minutes sounds excessive.
>
> My guess is that your performance problem is due to the merging process.
> Could you try measuring the performance of a direct request to a single
> shard? If that is satisfactory, going to the cloud would not solve your
> problem. If you really need 1M entries in your result set, you would be
> better off investigating whether your index can be in a single instance.
>
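
One way to run the single-shard measurement suggested above is a small script that times the same facet request twice: once sent directly to one shard, and once sent with the `shards` parameter so that Solr merges the results. This is only a rough sketch, not anyone's actual setup. The host names, the `trigrams` field name, and the facet limit are hypothetical placeholders; `q`, `rows`, `facet`, `facet.field`, `facet.limit`, `wt` and `shards` are standard Solr request parameters.

```python
import time
import urllib.parse
import urllib.request


def build_facet_url(base_url, field, limit, shards=None):
    """Build a Solr /select URL for a match-all query faceted on one field.

    If `shards` is given (a list of "host:port/path" strings), the request
    becomes a distributed one and Solr merges the per-shard facet counts.
    """
    params = {
        "q": "*:*",
        "rows": 0,              # only the facet counts are of interest
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
        "wt": "json",
    }
    if shards:
        params["shards"] = ",".join(shards)
    return base_url.rstrip("/") + "/select?" + urllib.parse.urlencode(params)


def time_request(url, fetch=None):
    """Return the wall-clock seconds taken to fetch `url`.

    `fetch` can be replaced for testing; by default the URL is fetched
    over HTTP and the response body is read in full.
    """
    if fetch is None:
        fetch = lambda u: urllib.request.urlopen(u).read()
    start = time.perf_counter()
    fetch(url)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical hosts and field name; adjust to the real installation.
    shard_hosts = ["shard1.example.org:8983/solr", "shard2.example.org:8983/solr"]
    single = build_facet_url("http://shard1.example.org:8983/solr", "trigrams", 1000000)
    merged = build_facet_url("http://shard1.example.org:8983/solr", "trigrams", 1000000,
                             shards=shard_hosts)
    print("single shard: %.1f s" % time_request(single))
    print("distributed:  %.1f s" % time_request(merged))
```

Each request should be run more than once, since the first call also pays for warming the caches. If the single-shard time is acceptable and the distributed time is not, that points at the merging step rather than at the faceting itself.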
