lucene-solr-user mailing list archives

From Alexandre Rafalovitch <arafa...@gmail.com>
Subject Re: Lowering query time
Date Wed, 05 Feb 2014 01:11:40 GMT
I suspect faceting is the issue here. The actual query you have shown
seems to bring back a single document (or a single set of documents for
a product):
fq=id:(320403401)

On the other hand, you are asking for 4 field facets:
facet.field=q_virtualCategory_ss
facet.field=q_brand_s
facet.field=q_color_s
facet.field=q_category_ss
AND 2 range facets, both clustered/grouped:
facet.range=daysSinceStart_i
facet.range=activePrice_l (e.g. f.activePrice_l.facet.range.gap=5000)

And for all facets you have asked to bring back ALL of the facet values:
facet.limit=-1

Plus, you are doing a complex sort:
sort=popularity_i desc,popularity_i desc

So, you are probably spending quite a bit of time counting (especially
in a shared setup) and then quite a bit more sending the response
back.
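
A rough sketch of what a trimmed-down version of the request could
look like, built with Python's standard library. The field names and
the fq come from the query above; everything else is an assumption:
q=*:* (the original q was not shown), the facet.limit=20 and
facet.mincount=1 values are only illustrative, and build_facet_query
is a hypothetical helper name. The range facets are left out because
their start/end values were not shown.

```python
from urllib.parse import urlencode, parse_qsl


def build_facet_query(facet_limit=20, facet_mincount=1):
    """Build a Solr query string with capped field facets (illustrative values)."""
    params = [
        ("q", "*:*"),                      # assumption: the original q was not shown
        ("fq", "id:(320403401)"),          # filter taken from the original query
        ("facet", "true"),
        ("facet.field", "q_virtualCategory_ss"),
        ("facet.field", "q_brand_s"),
        ("facet.field", "q_color_s"),
        ("facet.field", "q_category_ss"),
        ("facet.limit", str(facet_limit)),        # instead of -1 (return everything)
        ("facet.mincount", str(facet_mincount)),  # skip zero-count facet values
        ("sort", "popularity_i desc"),     # the duplicated sort key is listed once
    ]
    # A list of tuples keeps the repeated facet.field keys intact.
    return urlencode(params)


if __name__ == "__main__":
    query_string = build_facet_query()
    print(query_string)
    # Decode again to show which parameters would be sent.
    for key, value in parse_qsl(query_string):
        print(key, "=", value)
```

Whether a cap of 20 is acceptable depends on what the front end
actually displays; the point is only that facet.limit=-1 makes Solr
return every value of every faceted field.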

I would check the size of the result document (HTTP result) and see
how large it is. Maybe you don't need all of the stuff that's coming
back. I assume you are not actually querying Solr from the client's
machine (that is, I hope it is inside your data centre, close to your
web server); otherwise I would say to look at automatic content
compression as well, to minimize the on-wire document size.
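
One way to check this, as a minimal sketch: take the raw response body
and compare its size with its gzip-compressed size. The response_sizes
helper is a hypothetical name, and the payload in the demo is made-up
data standing in for a facet-heavy Solr response, not real output.

```python
import gzip
import json


def response_sizes(body):
    """Return (raw_bytes, gzip_bytes) for an HTTP response body given as bytes."""
    return len(body), len(gzip.compress(body))


if __name__ == "__main__":
    # Synthetic stand-in for a large facet response (made-up values).
    fake_response = json.dumps(
        {"facet_counts": {"facet_fields": {
            "q_brand_s": ["brand%d" % i for i in range(5000)]}}}
    ).encode("utf-8")
    raw, compressed = response_sizes(fake_response)
    print("raw bytes:", raw)
    print("gzip bytes:", compressed)
```

In practice the body would be the bytes read from the real Solr HTTP
response; a large raw size points at the facet lists or stored fields
as candidates for trimming.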

Finally, if your documents have many stored fields (stored=true in
schema.xml) but you only return small subsets of them during search,
you could look into using the enableLazyFieldLoading flag in
solrconfig.xml.

Regards,
   Alex.
P.s. As others said, you don't seem to have too many documents.
Perhaps you want replication instead of sharding for improved
performance.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Wed, Feb 5, 2014 at 6:31 AM, Alexey Kozhemiakin
<Alexey_Kozhemiakin@epam.com> wrote:
> Btw "timing" for distributed requests are broken at this moment, it doesn't combine values
> from requests to shards.  I'm working on a patch.
>
> https://issues.apache.org/jira/browse/SOLR-3644
>
> -----Original Message-----
> From: Jack Krupansky [mailto:jack@basetechnology.com]
> Sent: Tuesday, February 04, 2014 22:00
> To: solr-user@lucene.apache.org
> Subject: Re: Lowering query time
>
> Add the debug=true parameter to some test queries and look at the "timing"
> section to see which search components are taking the time. Traditionally, highlighting
> for large documents was a top culprit.
>
> Are you returning a lot of data or field values? Sometimes reducing the amount of data
> processed can help. Any multivalued fields with lots of values?
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Joel Cohen
> Sent: Tuesday, February 4, 2014 1:43 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Lowering query time
>
> 1. We are faceting. I'm not a developer so I'm not quite sure how we're doing it. How
> can I measure?
> 2. I'm not sure how we'd force this kind of document partitioning. I can see how my shards
> are partitioned by looking at the clusterstate.json from Zookeeper, but I don't have a clue
> on how to get documents into specific shards.
>
> Would I be better off with fewer shards given the small size of my indexes?
>
>
> On Tue, Feb 4, 2014 at 12:32 PM, Yonik Seeley <yonik@heliosearch.com> wrote:
>
>> On Tue, Feb 4, 2014 at 12:12 PM, Joel Cohen <joel.cohen@bluefly.com>
>> wrote:
>> > I'm trying to get the query time down to ~15 msec. Anyone have any
>> > tuning recommendations?
>>
>> I guess it depends on what the slowest part of the query currently is.
>>  If you are faceting, it's often that.
>> Also, it's often a big win if you can somehow partition documents such
>> that requests can normally be serviced from a single shard.
>>
>> -Yonik
>> http://heliosearch.org - native off-heap filters and fieldcache for
>> solr
>>
>
>
>
> --
>
> joel cohen, senior system engineer
>
> e joel.cohen@bluefly.com p 212.944.8000 x276 bluefly, inc. 42 w. 39th st. new york, ny
> 10018 www.bluefly.com <http://www.bluefly.com/?referer=autosig> | *fly since
> 2013...*
>
