lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Migration to Lucene 6.5
Date Thu, 27 Jul 2017 09:40:01 GMT
DocValuesTermsQuery should not perform differently from
DocValuesTermsFilter. Maybe try to run things under a profiler and see what
it says?
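
Before attaching a profiler, it can help to confirm the slowdown with a repeatable measurement that excludes JIT warm-up. Below is a minimal sketch of such a timing harness, using only the JDK. The class name `QueryTimer`, the method `medianNanos`, and the warm-up and run counts are hypothetical and chosen for illustration; the placeholder workload stands in for the actual search call, which is not shown in this thread.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Lucene: times a task after a warm-up phase.
final class QueryTimer {

    /**
     * Runs the task `warmup` times without timing it, then `runs` times
     * with timing, and returns the median elapsed time in nanoseconds.
     */
    static long medianNanos(Callable<?> task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            invoke(task); // untimed: lets the JIT compile the hot paths
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            invoke(task);
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2]; // median is less sensitive to GC pauses than the mean
    }

    // Wraps checked exceptions so callers can pass tasks that throw, e.g. IOException.
    private static void invoke(Callable<?> task) {
        try {
            task.call();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption); substitute the real search call here.
        Callable<Long> placeholder = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            return sum;
        };
        long median = medianNanos(placeholder, 20, 50);
        System.out.println("median ms: " + median / 1_000_000.0);
    }
}
```

Running the same harness against the 5.5 and the 6.5 code paths, on the same index and with the same terms, gives two comparable medians. If the gap persists after warm-up, a profiler should then show where the extra time is spent.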

On Tue, Jul 18, 2017 at 22:48, Rilpa Jain <Rilpa.Jain@tradeweb.com> wrote:

> Hi,
>
> We plan to migrate from Lucene 5.5 to 6.5. We have been using
> DocValuesTermsFilter extensively, which was deprecated in Lucene 5.5 and
> removed in Lucene 6.0.
> The Javadoc says to use DocValuesTermsQuery with
> BooleanClause.Occur.FILTER instead. However, in our local tests, search
> time has increased with this change.
>
> Below is one of the scenarios in our application -
> We do a search within a search.
>
> (Before migration, on Lucene 5.5)
> 1.      The first search is on a text field with discrete values. (There
> is no pattern to the values of this text field. Here terms[] ranges from
> 1 to 200k in size.)  - We use DocValuesTermsFilter and pass it as the
> Filter parameter to the search method.
> 2.      The second search is on the result of step 1. This could be either
> a TermQuery or a NumericRangeQuery, passed as the query parameter to the
> search method.
>
> (After migration to Lucene 6.5)
> 1.      The first search is on a text field with discrete values. (There
> is no pattern to the values of this text field. Here terms[] ranges from
> 1 to 200k in size.)  - We use DocValuesTermsQuery and add it to a
> BooleanQuery with Occur.FILTER.
> 2.      The second search is on the result of step 1. This could be either
> a TermQuery or a NumericRangeQuery, added to the BooleanQuery with
> Occur.MUST.
> 3.      The BooleanQuery is built and passed to the search method.
>
> After the migration, this query takes 5x-10x longer to execute than it
> did with DocValuesTermsFilter.
>
> Is there a better class for building the query in our scenario than the
> ones used above? Or is there anything that I am missing?
> Any insights would help! Thanks.
