lucene-java-user mailing list archives

From Carsten Schnober <schno...@ids-mannheim.de>
Subject Re: Rewrite for RegexpQuery
Date Tue, 12 Mar 2013 10:25:50 GMT
On 12.03.2013 10:39, Uwe Schindler wrote:

> I would suggest to use my example code with the fake query and custom
> rewrite. This does not have the overhead of BooleanQuery and more
> important: You don't need to change the *global* and *static* default
> in BooleanQuery. Otherwise you could introduce a denial of service case
> into your application, if you at some other place execute a wildcard
> using Boolean rewrite with unlimited number of terms.

Hi Uwe,
many thanks for your code sample! I've made tiny adaptations in
GetTermsRewrite to make the overridden methods match their counterparts
in the superclass (ScoringRewrite). I suppose that your version was not
written for Lucene 4.0, right? It looks like this now:

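import java.io.IOException;

import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermContext;
import org.apache.lucene.search.ScoringRewrite;

// Collects the terms matched by a MultiTermQuery instead of building a
// scoring query. TermHolderQuery is the fake query class from Uwe's example.
final class GetTermsRewrite extends ScoringRewrite<TermHolderQuery> {
    @Override
    protected void addClause(TermHolderQuery topLevel, Term term, int docCount,
            float boost, TermContext states) {
        // Record the term; docCount, boost and states are not needed here.
        topLevel.add(term);
    }

    @Override
    protected TermHolderQuery getTopLevelQuery() {
        return new TermHolderQuery();
    }

    @Override
    protected void checkMaxClauseCount(int count) throws IOException {
        // Left empty for now: no limit on the number of collected terms.
    }
}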


I'm not sure what checkMaxClauseCount() is supposed to do, so I have
left it empty. Apart from that, everything works great. Thanks!


The code I use for calling this:

IndexSearcher searcher = ...;
String regexp = ...;

// Note: the String and the query need distinct variable names.
MultiTermQuery query = new RegexpQuery(new Term("text", regexp));
query.setRewriteMethod(new GetTermsRewrite());
TermHolderQuery queryRewritten = (TermHolderQuery) searcher.rewrite(query);
Set<Term> terms = queryRewritten.getTerms();


There's another thing that is not entirely clear to me: when I call
query.setRewriteMethod(new GetTermsRewrite()), does IndexSearcher.rewrite()
really use the rewrite method set on the query? It seems to work fine,
but I am not sure why it does or whether it always will.
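
To make the question concrete, here is how I currently picture the
mechanism, as a toy model in plain Java without any Lucene dependency.
All class names (ToyQuery, ToyRegexpQuery, ToySearcher, ...) are made up
for illustration. The assumption behind it, from my reading of the 4.0
sources, is that MultiTermQuery.rewrite(IndexReader) hands off to whatever
rewrite method was set on the query, and that IndexSearcher.rewrite()
simply keeps calling Query.rewrite() until the query stops changing:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Toy stand-ins, NOT Lucene classes: they only mirror the delegation pattern.
final class RewriteDelegationSketch {

    /** Hypothetical strategy interface, playing the role of a rewrite method. */
    interface ToyRewriteMethod {
        ToyQuery rewrite(List<String> indexTerms, ToyRegexpQuery query);
    }

    /** Base query: by default a query rewrites to itself. */
    static class ToyQuery {
        ToyQuery rewrite(List<String> indexTerms) {
            return this;
        }
    }

    /** Result "query" that only holds the collected terms. */
    static class ToyTermHolder extends ToyQuery {
        final List<String> terms = new ArrayList<>();
    }

    /** Multi-term query that carries its own rewrite strategy. */
    static class ToyRegexpQuery extends ToyQuery {
        final Pattern pattern;
        ToyRewriteMethod rewriteMethod;

        ToyRegexpQuery(String regexp) {
            this.pattern = Pattern.compile(regexp);
        }

        @Override
        ToyQuery rewrite(List<String> indexTerms) {
            // The query itself delegates to the strategy set on it.
            return rewriteMethod.rewrite(indexTerms, this);
        }
    }

    /** The searcher knows nothing about strategies; it only calls rewrite(). */
    static class ToySearcher {
        final List<String> indexTerms;

        ToySearcher(List<String> indexTerms) {
            this.indexTerms = indexTerms;
        }

        ToyQuery rewrite(ToyQuery original) {
            ToyQuery query = original;
            // Rewrite repeatedly until the query rewrites to itself.
            for (ToyQuery r = query.rewrite(indexTerms); r != query;
                    r = query.rewrite(indexTerms)) {
                query = r;
            }
            return query;
        }
    }

    /** Collects all index terms that fully match the given regular expression. */
    static List<String> collect(String regexp, List<String> indexTerms) {
        ToyRegexpQuery query = new ToyRegexpQuery(regexp);
        query.rewriteMethod = (terms, q) -> {
            ToyTermHolder holder = new ToyTermHolder();
            for (String t : terms) {
                if (q.pattern.matcher(t).matches()) {
                    holder.terms.add(t);
                }
            }
            return holder;
        };
        ToyQuery rewritten = new ToySearcher(indexTerms).rewrite(query);
        return ((ToyTermHolder) rewritten).terms;
    }

    public static void main(String[] args) {
        System.out.println(collect("ab.*", Arrays.asList("abc", "abd", "xyz")));
    }
}
```

If that picture is right, the searcher never needs to know about the
rewrite method at all, which would explain why setting it on the query is
enough. I'd be glad to have this confirmed or corrected.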

Best,
Carsten


-- 
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP                 | http://korap.ids-mannheim.de
Tel. +49-(0)621-43740789      | schnober@ids-mannheim.de
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform
