lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Filters Vs queries - for terms more than 1024
Date Mon, 17 Jul 2017 20:29:01 GMT
Could you use TermInSetQuery (TermsQuery in older Lucene versions)? It is
worse at skipping over matches than a BooleanQuery, but unlike large boolean
queries it keeps memory usage low and disk access sequential.
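
To illustrate the memory difference, here is a minimal sketch in plain Java.
It is not Lucene code: the toy index, the Cursor class and both method names
are hypothetical stand-ins. It assumes that a boolean-style query opens one
postings cursor (with its own buffer) per clause before matching starts,
whereas a term-set style query sorts the terms, visits them one at a time
with a single reused cursor, and ORs the matching documents into a bit set.

```java
import java.util.*;

// Toy model only; none of these names exist in Lucene.
final class TermSetSketch {
    // Toy inverted index: term -> sorted doc IDs.
    static final TreeMap<String, int[]> INDEX = new TreeMap<>();

    // Stand-in for a postings iterator; every instance owns a decode buffer.
    static final class Cursor {
        static int allocated = 0;          // counts cursors created so far
        final int[] buffer = new int[128]; // per-cursor memory cost
        int[] docs = new int[0];
        Cursor() { allocated++; }
        void reset(int[] postings) { docs = postings; }
    }

    // Boolean-style: all cursors are opened up front, one per clause.
    static BitSet booleanStyle(Collection<String> terms, int maxDoc) {
        List<Cursor> cursors = new ArrayList<>();
        for (String term : terms) {
            int[] postings = INDEX.get(term);
            if (postings == null) continue; // term not in the index
            Cursor c = new Cursor();
            c.reset(postings);
            cursors.add(c);
        }
        BitSet result = new BitSet(maxDoc);
        for (Cursor c : cursors) {
            for (int doc : c.docs) result.set(doc);
        }
        return result;
    }

    // Term-set style: terms visited in sorted order, one cursor reused.
    static BitSet termSetStyle(Collection<String> terms, int maxDoc) {
        BitSet result = new BitSet(maxDoc);
        Cursor reused = new Cursor();
        for (String term : new TreeSet<>(terms)) {
            int[] postings = INDEX.get(term);
            if (postings == null) continue;
            reused.reset(postings);
            for (int doc : reused.docs) result.set(doc);
        }
        return result;
    }

    public static void main(String[] args) {
        INDEX.put("a", new int[] {0, 2});
        INDEX.put("b", new int[] {1, 2});
        List<String> terms = Arrays.asList("b", "a", "missing");
        System.out.println(booleanStyle(terms, 3).equals(termSetStyle(terms, 3)));
        System.out.println("cursors allocated: " + Cursor.allocated);
    }
}
```

Both methods return the same set of documents. In this model the number of
live cursors grows with the number of clauses in the first method and stays
at one in the second, which is the effect that matters once the term count
reaches the tens of thousands.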

Otherwise you would probably need to rethink how you design your documents
in order to be able to run simpler queries.

On Mon, 17 Jul 2017 at 16:28, Kumaran Ramasubramanian <kums.134@gmail.com>
wrote:

> Hi All,
>
> I am using Lucene 4.10.4.
>
> I know that Lucene limits the number of boolean query clauses to 1024, and
> that this limit can be increased, but I want to understand queries vs.
> filters in Lucene 4.10.4.
>
> I want to run queries with more than 1024 clauses. Relevance is not needed
> in my case. What are the best possible options?
>
> 1. A BooleanFilter works even with 1 lakh (100,000) filter clauses. Are
> there any consequences of using filters in this case? Shall I proceed with
> this?
>
> 2. If I give the filters very little memory, the search still manages to
> complete after many GC cycles. Why can't the same be done for query
> clauses? What is the actual technical reason for the 1024 limit on
> boolean queries?
>
> 3. If I disable scoring by using ConstantScoreQuery, is it possible to
> use more than 1024 query clauses?
>        I tried this, but I still get java.lang.OutOfMemoryError. Why?
>
> java.lang.OutOfMemoryError: Java heap space
>   at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.<init>(Lucene41PostingsReader.java:345)
>   at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.docs(Lucene41PostingsReader.java:254)
>   at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.docs(SegmentTermsEnum.java:999)
>   at org.apache.lucene.index.TermsEnum.docs(TermsEnum.java:149)
>   at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:84)
>   at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:356)
>   at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:164)
>   at org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:542)
>   at org.apache.lucene.search.FilteredQuery$FilterStrategy.filteredBulkScorer(FilteredQuery.java:504)
>   at org.apache.lucene.search.FilteredQuery$1.bulkScorer(FilteredQuery.java:150)
>
> Any pointers are much appreciated. Thank you.
>
>
>
> --
> Kumaran R
>
