lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: [Lucene] custom Query, and Stop Words
Date Wed, 09 Feb 2011 09:58:46 GMT
Have you considered using stemming instead?  Sounds like that might
make most of your problems go away and achieve the same result.

I'm not aware of a utility method to remove stop words from a string
but there are ways of passing data through analyzers/tokenizers and
grabbing the output. StandardAnalyzer stores the stop words in
STOP_WORDS_SET and you might be able to use that directly, or pass in
your own set of stop words, which you would obviously have access to.

If you stick with your split/word-by-word approach, watch out for
punctuation and Mixed Case and other complications.
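
As a rough, untested-against-your-index sketch of that word-by-word
cleanup in plain Java: the class name, the method names and the short
stop list below are made up for illustration and are not part of Lucene.
The stop list in particular is only a placeholder; it should be whatever
set your index-time analyzer actually uses. Splitting on non-letter,
non-digit characters only approximates what StandardAnalyzer does (for
example, it breaks "don't" into two tokens).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not a Lucene class.
class StopWordQuerySketch {

    // Placeholder stop list (an assumption). Use the same set your
    // index-time analyzer was configured with.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "and", "of", "the", "to"));

    // Lower-cases the input, splits on any run of characters that are not
    // letters or digits, and drops empty tokens and stop words.
    static List<String> cleanTerms(String userInput, Set<String> stopWords) {
        List<String> terms = new ArrayList<String>();
        String lowered = userInput.toLowerCase(Locale.ROOT);
        for (String token : lowered.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Renders the cleaned terms as a required-prefix query string,
    // handy for logging or eyeballing what will be searched.
    static String toQueryString(List<String> terms) {
        StringBuilder sb = new StringBuilder();
        for (String term : terms) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('+').append(term).append('*');
        }
        return sb.toString();
    }
}
```

Each string returned by cleanTerms() would then go into your existing
loop in place of the raw split() output, i.e. one PrefixQuery per term
added with Occur.MUST.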


--
Ian.

On Wed, Feb 9, 2011 at 8:43 AM, sol myr <solmyr72@yahoo.com> wrote:
> Hi,
>
> I'm building my own BooleanQuery (rather than using Query Parser). That's
> because I need different defaults from my users:
> If a user types:  java program
> I need to run the query: +java* +program* (namely AND search, with Prefix
> so as to hit "programS", "programMER").
>
> So naively I split the user's input into words, and build my query term-by-term:
>    String[] words= userInput.split(" ");
>    BooleanQuery query=new BooleanQuery();
>    for(String word: words)
>        query.add(new PrefixQuery(new Term("topic", word)), Occur.MUST);
>
> But since my index is built with StandardAnalyzer, I'm having trouble with Stop Words.
> If the user types: program of java
> Then of course my query (+program* +of* +java*) returns zero results.
>
> Is there an easy solution?
> I'd like to keep using StandardAnalyzer (I don't want to index by stupid
> keywords such as "of"). I would just like stop-words to be removed from my
> query (as QueryParser does).
> Is there some utility method for this? Direct access to the list of stop-words?
>
> Thanks