lucene-java-user mailing list archives

From Igor Shalyminov <ishalymi...@yandex-team.ru>
Subject Superslow search on a single 600MB index segment
Date Mon, 14 Oct 2013 16:15:39 GMT
Hello!

I'm trying to figure out how I can improve search performance for my task.

The index is as follows:
- 29 segments, each of about 600 MB;
- in the complete setup, there's a thread for each segment searcher;
- the index contains TermVectors with positions and payloads for word-level fields, and SortedDocValues
for document-level fields.

I run a SpanNearQuery that I customized to check payloads (each payload is just a single
int).
I've reduced the whole search logic to iterating through spanQuery.getSpans() and counting
the exact numbers of matched documents and spans, all on a single 645 MB segment (I launched
java with -Xmx4G for this particular task).
It takes 25 seconds to complete!
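
To make the measured loop concrete, here is a minimal sketch of the kind of counting described above. It is not the actual code: SimpleSpans is a hypothetical stand-in that mirrors only the next()/doc() part of Lucene 4.x's Spans iterator, and ArraySpans is a made-up in-memory implementation so the sketch runs without Lucene. It assumes matches arrive in document order, as they do when iterating Spans.

```java
public class SpanCountSketch {

    /** Hypothetical stand-in for the next()/doc() part of Lucene 4.x's Spans. */
    public interface SimpleSpans {
        boolean next(); // advance to the next match; false when exhausted
        int doc();      // document id of the current match
    }

    /** Made-up array-backed implementation so the sketch runs without Lucene. */
    public static final class ArraySpans implements SimpleSpans {
        private final int[] docs; // one doc id per match, in non-decreasing order
        private int pos = -1;

        public ArraySpans(int... docs) {
            this.docs = docs;
        }

        @Override
        public boolean next() {
            return ++pos < docs.length;
        }

        @Override
        public int doc() {
            return docs[pos];
        }
    }

    /** Returns {matched documents, matched spans} for one pass over the iterator. */
    public static long[] count(SimpleSpans spans) {
        long docCount = 0;
        long spanCount = 0;
        int lastDoc = -1;
        while (spans.next()) {
            spanCount++;
            // Matches come in doc order, so a new doc id means a new document.
            if (spans.doc() != lastDoc) {
                lastDoc = spans.doc();
                docCount++;
            }
        }
        return new long[] {docCount, spanCount};
    }

    public static void main(String[] args) {
        long[] r = count(new ArraySpans(0, 0, 3, 7, 7, 7));
        System.out.println(r[0] + " docs, " + r[1] + " spans");
    }
}
```

In the real setup the iterator would come from spanQuery.getSpans() on the segment's reader, and the payload check would happen inside the customized query before a match is reported.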

I tried using RAMDirectory on this index for testing purposes, but the results are the same
(I haven't tried the tmpfs approach yet, though).

Does anyone have ideas on how to speed this up (possibly by delving into Lucene's internals)
while keeping the query results complete?

-- 
Best Regards,
Igor Shalyminov
