lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Scott Smith" <ssm...@mainstreamdata.com>
Subject Queries and Filters
Date Wed, 17 Jun 2009 00:14:36 GMT
The last few versions of lucene have deprecated several of the
interfaces we were using and this is necessitating a fairly major
upgrade of our code (which hasn't had much done to it for several
years).  I'm not complaining; the changes are probably necessary.

 

In reading LIA2, I've learned about filters and realized that several of
the things we do with queries should probably be done with filters
(queries where the boost was set to 0.0 was the clue).  But I'm having
trouble making them work.

 

Currently, the code that builds the user's query calls several other
classes and "asks" if they want to add something to the query.  These
classes return a BooleanClause if they do, null if they don't.  The main
query code will add the BooleanClause to a BooleanQuery object (ignoring
null returns).  The BooleanQuery object gets passed to the
Searcher.search() method.  This works fine

 

So, I did a similar thing for filters.  That is, I created a routine
that called the classes and asks if they have a FilterClause they want
to contribute.  If I get one back, then I add it to a BooleanFilter
object.  The BooleanFilter gets passed to the Searcher.search() method
(unless no one contributes a BooleanFilter object; then the filter
passed is null).

 

The first class I tried to migrate from providing a query to providing a
filter was the class that defined the categories of information the user
was interested in (it's called "sources" in the code).  Unfortunately,
it doesn't work.  Here's a snippet from the class:

 

    public BooleanClause getQuery()

    {

        if (!_doQuery) return null;

        

        ArrayList<String> sources = _optimizedSources;

        

        BooleanQuery bquery = new BooleanQuery(); 

        for (int i = 0 ; i < sources.size() ; i++)

        {

            TermQuery tq = new TermQuery(

                new Term( LuceneField.SOURCES, 

                        sources.get(i).toLowerCase()));

            bquery.add(tq, BooleanClause.Occur.SHOULD);

        }

        bquery.setBoost(0.0f);

        

        return new BooleanClause(bquery, BooleanClause.Occur.MUST);

    }

 

    public FilterClause getFilter()

    {

        if (_doQuery) return null;

        

        ArrayList<String> sources = _optimizedSources;

        

        TermsFilter termFilter = new TermsFilter(); 

        for (int i = 0 ; i < sources.size() ; i++)

        {

            Term t = new Term(LuceneField.SOURCES, 

                        sources.get(i).toLowerCase());

            termFilter.addTerm(t);

        }

 

        return new FilterClause(termFilter, BooleanClause.Occur.MUST);

    }

 

The _doQuery simply defines whether the sources are added to the query
or to the filter object(I added this simply for debugging).  If _doQuery
is true, then everything works fine.  If it is false (meaning the
getFilters() routine return a FilterClause), then the unit tests fail
(no hits are returned).  The test case has a single source and in the
case where the filter is invoked, this FilterClause will be the only
thing added to the BooleanFilter.

 

I've been staring at this code for two days and don't see what the issue
is.  Based on my understanding, I thought these would provide the same
result.  I'm probably missing something obvious or I don't understand
filters as well as I think I do.

 

Can anyone suggest what the problem is?

 

 

 

 

 


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message