lucene-java-user mailing list archives

From Jamie <ja...@mailarchiva.com>
Subject Re: search performance
Date Mon, 02 Jun 2014 10:27:00 GMT
Tom

Thanks for the offer of assistance.

On 2014/06/02, 12:02 PM, Tincu Gabriel wrote:
> What kind of queries are you pushing into the index.
We are indexing regular emails + attachments.

Typical query is something like:
filter: to:mbox000008 from:mbox000008 cc:mbox000008 bcc:mbox000008 
deliveredto:mbox000008 sender:mbox000008 recipient:mbox000008
combined with filter query "cat:email"
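
As a rough illustration only, a filter string of this shape could be assembled as below. The class and method names are hypothetical (they are not Lucene API or our actual code), and the sketch assumes the mailbox id needs no query-syntax escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds the per-mailbox filter string shown above.
final class MailboxQuery {
    // Address fields taken from the example query in this message.
    static final List<String> FIELDS = Arrays.asList(
            "to", "from", "cc", "bcc", "deliveredto", "sender", "recipient");

    // Returns "to:<id> from:<id> ..." with one clause per address field.
    // Assumption: mailboxId contains no characters that need escaping.
    static String build(String mailboxId) {
        return FIELDS.stream()
                .map(field -> field + ":" + mailboxId)
                .collect(Collectors.joining(" "));
    }
}
```

Calling MailboxQuery.build("mbox000008") yields the seven clauses listed above, separated by single spaces.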

We also use range queries based on date.
> Do they match a lot of documents ?
Yes, although we are using a collector:

TopFieldCollector fieldCollector = TopFieldCollector.create(sort, max, true, false, false, true);

We use pagination, so we only return 1000 documents or so at a time.
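
To make the paging cost concrete, here is a small sketch of the arithmetic, assuming each page is served by collecting the top (page + 1) * pageSize hits and then slicing off the last pageSize of them (for example with TopDocsCollector.topDocs(start, howMany)). That approach is an assumption for illustration, and PageWindow is a hypothetical name.

```java
// Hypothetical helper: computes how many hits a collector must keep
// in order to serve one page, and where that page starts.
final class PageWindow {
    final int start;    // index of the first hit on the requested page
    final int numHits;  // total hits the collector has to retain

    PageWindow(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex >= 0 and pageSize > 0 required");
        }
        // multiplyExact/addExact fail loudly instead of overflowing silently.
        this.start = Math.multiplyExact(pageIndex, pageSize);
        this.numHits = Math.addExact(this.start, pageSize);
    }
}
```

Under that assumption, the third page of 1000 (pageIndex 2) starts at hit 2000 and requires the collector to keep 3000 hits, so deeper pages get progressively more expensive to sort.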

>   Do you do any sorting on the result set?
Yes
>   What is the average
> document size ?
Approx. 100 KB. We are indexing email body + attachment content.
> Do you have a lot of update traffic ?
Yes, we have a lot of update traffic, particularly in the environment I 
referred to. Is there a way to prioritize searching as opposed to updating?

I suppose we could block all indexing while a search is in progress? Is 
there such an option in Lucene, or should we implement this ourselves?
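
If we did implement this ourselves at the application level, a minimal sketch might look like the gate below. SearchPriorityGate and its methods are hypothetical names, not Lucene API. The sketch only delays the start of new indexing batches while searches are in flight; it does not interrupt indexing that is already running.

```java
// Hypothetical application-level gate: searches register themselves,
// and the indexing thread waits until no search is in flight.
final class SearchPriorityGate {
    private int activeSearches = 0;

    // Call before running a search.
    synchronized void beginSearch() {
        activeSearches++;
    }

    // Call in a finally block after the search completes.
    synchronized void endSearch() {
        if (activeSearches <= 0) {
            throw new IllegalStateException("endSearch without beginSearch");
        }
        activeSearches--;
        if (activeSearches == 0) {
            notifyAll(); // wake any indexing thread waiting in awaitIdle()
        }
    }

    // True when no search is currently registered.
    synchronized boolean isIdle() {
        return activeSearches == 0;
    }

    // Call from the indexing thread before starting a batch.
    synchronized void awaitIdle() throws InterruptedException {
        while (activeSearches > 0) {
            wait();
        }
    }
}
```

A search thread would wrap its work in beginSearch() and endSearch(), and the indexer would call awaitIdle() before each batch. Under sustained search load this could starve indexing, so a timeout or a maximum wait would probably be needed in practice.
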
> What kind of schema
> does your index use ?
Not sure exactly what you are referring to here. We do have a lot of 
stored fields (to, from, bcc, cc, etc.). The body and attachments are 
analyzed.

Regards

Jamie


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

