lucene-solr-user mailing list archives

From Dmitry Kan <solrexp...@gmail.com>
Subject Re: Profiling Solr Lucene for query
Date Mon, 09 Sep 2013 11:12:08 GMT
Are you querying your shards via a frontend Solr? We have noticed that
querying becomes much faster if result merging can be avoided.

Dmitry
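
A minimal sketch of the comparison suggested above: the same ids-only query sent once through a frontend and once straight to a single core, plus an upper bound on how many entries the frontend has to merge. The host names, core name and helper functions are hypothetical. `distrib=false` is the Solr request parameter that asks a core to answer from its own index only; whether it is needed depends on how the shards are addressed in a given setup.

```python
from urllib.parse import urlencode

# Hypothetical host and core names, for illustration only.
FRONTEND = "http://frontend.example.com:8983/solr/collection1/select"
SHARD = "http://shard07.example.com:8983/solr/collection1/select"


def build_query_url(base_url, query, rows=1000, distributed=True):
    """Return a Solr select URL that fetches only document ids."""
    params = {"q": query, "fl": "id", "rows": rows, "wt": "json"}
    if not distributed:
        # Ask this core to answer from its own index, with no fan-out or merge.
        params["distrib"] = "false"
    return base_url + "?" + urlencode(params)


def merge_candidates(num_shards, rows, start=0):
    """Upper bound on the entries a frontend collects before merging.

    Each shard returns up to start + rows entries to the frontend.
    """
    return num_shards * (start + rows)


if __name__ == "__main__":
    print(build_query_url(FRONTEND, "text:foo"))
    print(build_query_url(SHARD, "text:foo", distributed=False))
    print(merge_candidates(num_shards=36, rows=1000))
```

Timing both URLs for the same slow query gives a rough idea of how much of the ~60 sec is spent in merging rather than on the shards themselves.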


On Sun, Sep 8, 2013 at 6:56 PM, Manuel Le Normand <
manuel.lenormand@gmail.com> wrote:

> Hello all,
> Looking at the slowest 10% of queries, I get very bad performance (~60 sec
> per query).
> These queries have lots of conditions on my main field (more than a
> hundred), including phrase queries, and rows=1000. I return only ids,
> though.
> I can say quite firmly that this bad performance is due to slow storage
> (an issue beyond my control for now). Despite this, I want to improve
> my performance.
>
> As taught in school, I started profiling these queries, and the data from a
> ~1 minute profile is located here:
> http://picpaste.com/pics/IMG_20130908_132441-ZyrfXeTY.1378637843.jpg
>
> Main observation: most of the time I am waiting in readVInt, whose stack
> trace (2 out of 2 thread dumps) is:
>
> catalina-exec-3870 - Thread t@6615
>  java.lang.Thread.State: RUNNABLE
>  at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
>  at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2357)
>  at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekExact(BlockTreeTermsReader.java:1745)
>  at org.apache.lucene.index.TermContext.build(TermContext.java:95)
>  at org.apache.lucene.search.PhraseQuery$PhraseWeight.<init>(PhraseQuery.java:221)
>  at org.apache.lucene.search.PhraseQuery.createWeight(PhraseQuery.java:326)
>  at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:183)
>  at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:384)
>  at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:183)
>  at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:384)
>  at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:183)
>  at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:384)
>  at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:675)
>  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
>
>
> So I am indeed waiting for IO as expected, but I may be page faulting too
> many times while looking up the term blocks (tim file), i.e. locating the
> term. As I am reindexing now, would it be useful to lower the termInterval
> (default 128)? As the FSTs (tip files) are small (a few 10-100 MB), so
> there is no memory contention, could I lower this param to 8, for
> example? The benefit of lowering the term interval would be to
> force the FST into memory (JVM - thanks to the NRTCachingDirectory),
> as I do not control the term dictionary file (OS caching loads an average
> of 6% of it).
>
>
> General configs:
> Solr 4.3
> 36 shards, each with a few million docs
> These 36 servers (each server has 2 replicas) are virtual, with 16GB of
> memory each (4GB for the JVM, the remaining 12GB for OS caching), consuming
> 260GB of the disk mounted for the index files.
>
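
The quoted conclusion rests on 2 thread dumps. A minimal sketch of how the same check could be repeated over many dumps: it tallies the top frames of every RUNNABLE thread in a dump's text. The function name `top_frames` and the sample dump are hypothetical, and the parser assumes the usual layout where thread entries are separated by blank lines and frames start with `at `.

```python
import re
from collections import Counter


def top_frames(dump_text, depth=1):
    """Count the top `depth` frames of each RUNNABLE thread in a dump."""
    counts = Counter()
    # Assumption: thread entries in the dump are separated by blank lines.
    for block in re.split(r"\n\s*\n", dump_text):
        if "java.lang.Thread.State: RUNNABLE" not in block:
            continue
        # Keep "package.Class.method" and drop the "(File.java:123)" suffix.
        frames = [
            line.strip()[3:].split("(")[0]
            for line in block.splitlines()
            if line.strip().startswith("at ")
        ]
        if frames:
            counts[" <- ".join(frames[:depth])] += 1
    return counts


if __name__ == "__main__":
    # Made-up two-thread dump, for illustration only.
    sample = (
        "t1 - Thread t@1\n"
        "   java.lang.Thread.State: RUNNABLE\n"
        "   at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)\n"
        "   at org.apache.lucene.index.TermContext.build(TermContext.java:95)\n"
        "\n"
        "t2 - Thread t@2\n"
        "   java.lang.Thread.State: WAITING\n"
        "   at java.lang.Object.wait(Native Method)\n"
    )
    for frame, n in top_frames(sample, depth=2).most_common():
        print(n, frame)
```

Feeding it the concatenated text of dumps taken a few seconds apart during a slow query would show whether readVInt under TermContext.build dominates consistently or only happened to be on top in those 2 samples.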
