lucene-java-user mailing list archives

From Chris Tomlinson <chris.j.tomlin...@gmail.com>
Subject How to ignore stop word gaps in queries? Lucene 4.4+
Date Thu, 10 Apr 2014 14:26:16 GMT
Hello,

We're using the Lucene 4.4 build embedded in eXist-db (exist-db.org), and, as the subject indicates,
we want to ignore stop word gaps in queries without the user having to indicate where such
gaps might occur at query time.

Since Lucene 4.4, FilteringTokenFilter.setEnablePositionIncrements(false) is no longer available.

Prior to Lucene 4.4 it was possible to call setEnablePositionIncrements(false) so that, during
both indexing and querying, the number and position of stop word gaps would be ignored.

This meant that a phrase such as:

    blue is the sky

with stop words "is" and "the" would be matched by the query:

    blue sky

We are working with Tibetan, where elisions are not uncommon, so that, e.g.:

    rin po che

on some occasions might be shortened to

    rin che

and we would like to have a query of

    rin po che

or

    rin che

find all occurrences of

    rin po che

and

    rin che

without requiring the user to mark where elisions might occur.
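
To make the desired behavior concrete, here is a minimal sketch in plain Java. It does not use the Lucene API; the class StopGapDemo and its methods analyze and phraseMatches are hypothetical names chosen for illustration only. It assumes whitespace tokenization and models the two behaviors described above: keeping a position gap where a stop word was removed, versus ignoring the gap so the remaining tokens become adjacent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not Lucene code.
class StopGapDemo {
    static final class Token {
        final String term;
        final int position;
        Token(String term, int position) { this.term = term; this.position = position; }
    }

    // Drops stop words. If keepGaps is true, each removed stop word still
    // advances the position counter; if false, kept tokens are numbered
    // consecutively, as if the stop words had never been there.
    static List<Token> analyze(String text, Set<String> stopWords, boolean keepGaps) {
        List<Token> tokens = new ArrayList<>();
        int position = -1;
        int pendingGap = 0;
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) continue;
            if (stopWords.contains(word)) {
                if (keepGaps) pendingGap++;
                continue;
            }
            position += 1 + pendingGap;
            pendingGap = 0;
            tokens.add(new Token(word, position));
        }
        return tokens;
    }

    // Exact phrase match: every query token must occur in the document at
    // the same relative position it has in the query.
    static boolean phraseMatches(List<Token> doc, List<Token> query) {
        if (query.isEmpty()) return false;
        Token first = query.get(0);
        for (Token start : doc) {
            if (!start.term.equals(first.term)) continue;
            boolean all = true;
            for (Token q : query) {
                int wanted = start.position + (q.position - first.position);
                boolean found = false;
                for (Token d : doc) {
                    if (d.position == wanted && d.term.equals(q.term)) { found = true; break; }
                }
                if (!found) { all = false; break; }
            }
            if (all) return true;
        }
        return false;
    }
}
```

In this model, with "is" and "the" as stop words and gaps kept, "blue is the sky" yields blue at position 0 and sky at position 3, so the phrase "blue sky" does not match. With gaps ignored, sky moves to position 1 and the phrase matches. Treating "po" as a stop word in the same way lets "rin po che" and "rin che" match each other in both directions, which is the behavior we are after.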

The org.apache.lucene.queryparser.flexible.standard.CommonQueryParserConfiguration provides
a setEnablePositionIncrements method, but calling it does not seem to restore the query
behavior described above, which was possible prior to Lucene 4.4.

What is the proper way to ignore stop word gaps?

Thank you,
Chris