lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Replacement for Filter-as-abstract-class in Lucene 5.4?
Date Tue, 17 Jan 2017 19:07:11 GMT
On Thu, 12 Jan 2017 at 00:31, Trejkaz <trejkaz@trypticon.org> wrote:

> In the future now, looking at Lucene 6.3 Javadocs, where Filter is now
> gone, and it seems that ConstantScoreWeight is still @lucene.internal
> (and awfully hard to understand how it can do much at all...). Did we
> ever get a replacement class for this use case for Filter? I read
> something about solr taking a copy of the class over in its code,
> which might be what we have to do here, but I wanted to check first.
>

We are open to feedback: what issues are you having with
ConstantScoreWeight? It is true that it does not bring much compared to
Weight anymore now that we removed query normalization. The only useful
thing it has is the default explain() implementation.

Regarding Filter, it was moved to Solr because Solr has strong interactions
between the Filter class and its filter cache, which were not easy to fix
at the same time as we wanted to remove Filter from Lucene. Hopefully it
will go away eventually. One thing that made filters nice in the past is
that you could decide to optionally implement some Bits, but this is now
supported in queries too in a more flexible way (but harder to implement I
agree) through two-phase iteration: you can provide an approximation in
addition to the ability to check whether a document matches. Also, the
match cost API helps execute the costly bits last, which used to be an
issue with filters. For instance, a bitset-based filter should be
intersected in a leap-frog fashion with the query, while a doc-values-based
filter should be executed last in a random-access fashion, but the filter
API had no way to propagate that information. Solr initially added ways to
optimize this through the ExtendedQuery/PostFilter classes, which allow
filtering at the collector level, but in my opinion this should now be
replaced with two-phase iteration too.
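
To make the idea concrete, here is a minimal, library-free sketch of
two-phase iteration with match costs. It does not use any Lucene classes:
TwoPhaseSketch, Clause and search are hypothetical names made up for
illustration, and the merge loop is a simplified stand-in for a real
leap-frog intersection. In Lucene itself the corresponding abstraction is
TwoPhaseIterator, with its approximation, matches() and matchCost().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

// Illustrative only: none of these types are Lucene classes.
class TwoPhaseSketch {

    /** Hypothetical clause: a cheap approximation plus a costlier exact check. */
    static final class Clause {
        final int[] approximation;  // sorted candidate doc IDs (superset of real matches)
        final IntPredicate matches; // exact per-document verification
        final float matchCost;      // estimated cost of one verification call

        Clause(int[] approximation, IntPredicate matches, float matchCost) {
            this.approximation = approximation;
            this.matches = matches;
            this.matchCost = matchCost;
        }
    }

    /** Intersects the approximations first, then verifies, cheapest check first. */
    static List<Integer> search(List<Clause> clauses) {
        List<Integer> hits = new ArrayList<>();
        if (clauses.isEmpty()) {
            return hits;
        }
        // Verification order: ascending match cost, so costly checks run last.
        List<Clause> byCost = new ArrayList<>(clauses);
        byCost.sort(Comparator.comparingDouble((Clause c) -> c.matchCost));

        int[] pos = new int[clauses.size()]; // cursor into each approximation
        int[] lead = clauses.get(0).approximation;
        outer:
        for (int doc : lead) {
            // Phase 1: every other approximation must also contain doc.
            for (int i = 1; i < clauses.size(); i++) {
                int[] other = clauses.get(i).approximation;
                while (pos[i] < other.length && other[pos[i]] < doc) {
                    pos[i]++;
                }
                if (pos[i] == other.length) {
                    break outer;    // one clause is exhausted: no more hits
                }
                if (other[pos[i]] != doc) {
                    continue outer; // not a shared candidate
                }
            }
            // Phase 2: confirm the candidate, stopping at the first failure.
            boolean ok = true;
            for (Clause c : byCost) {
                if (!c.matches.test(doc)) {
                    ok = false;
                    break;
                }
            }
            if (ok) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Bitset-like clause: sparse approximation, cheap check.
        Clause cheap = new Clause(new int[] {1, 3, 5, 7, 9}, d -> d != 5, 1f);
        // Doc-values-like clause: matches-all approximation, costly check.
        Clause costly = new Clause(
            new int[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, d -> d % 3 == 0, 100f);
        System.out.println(search(Arrays.asList(costly, cheap))); // prints [3, 9]
    }
}
```

In this sketch the costly check only runs on documents that survive both
the intersection and the cheaper check, which is the information the old
filter API could not express.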
