lucene-java-user mailing list archives

From "Peter Posselt Vestergaard" <ppv_milest...@hotmail.com>
Subject analyzer effecting phrases?
Date Mon, 20 Dec 2004 14:24:07 GMT
Hi
I am building an index of texts, each associated with a unique id. The ids
may contain underscores, and the StandardAnalyzer shortens them once it sees
a second underscore in a row. Furthermore, many of the texts I am indexing
are in Italian, so the removal of 'trivial' (stop) words done by the
StandardAnalyzer is not necessarily meaningful for these texts. Therefore I
am instead using an analyzer made from the WhitespaceTokenizer and the
LowerCaseFilter.
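
To make it concrete, here is a rough sketch of what such a chain does to the text. It is plain Java rather than the Lucene API, and the class and method names (SimpleAnalyzerSketch, analyze) are made up for illustration only:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a WhitespaceTokenizer + LowerCaseFilter chain.
// This is NOT Lucene code; it only mimics the two steps described above.
class SimpleAnalyzerSketch {

    // Split on runs of whitespace, then lowercase each token.
    // Nothing else is touched: underscores and punctuation stay in the token.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The id keeps both underscores; "Bella." keeps its trailing period.
        System.out.println(analyze("Story_ID__42 La casa Bella."));
    }
}
```

Under this simplification an id such as Story_ID__42 survives as one token, but punctuation that touches a word also stays attached to that word.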
This works fine for me until I try searching for a phrase. I am searching
for a simple phrase of two words enclosed in double quotes. I have found the
phrase in one of the texts, so I know the search should return at least one
result, but none is found. If I remove the double quotes and search for the
two words with AND between them, I do find the story.
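
The difference between the two searches can be sketched as follows. This is a simplified model in plain Java, not Lucene's query code, and the names (PhraseVsAndSketch, matchesAll, matchesPhrase) are hypothetical; it assumes a phrase search requires the terms to be adjacent and in order, while AND only requires both terms to be present:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the two query types over one document's token list.
class PhraseVsAndSketch {

    // AND semantics: every query term occurs somewhere in the document.
    static boolean matchesAll(List<String> doc, List<String> terms) {
        return doc.containsAll(terms);
    }

    // Phrase semantics: the terms occur consecutively and in the given order.
    static boolean matchesPhrase(List<String> doc, List<String> phrase) {
        return !phrase.isEmpty() && Collections.indexOfSubList(doc, phrase) >= 0;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("la", "casa", "bella");
        System.out.println(matchesAll(doc, Arrays.asList("la", "bella")));    // both present
        System.out.println(matchesPhrase(doc, Arrays.asList("la", "bella"))); // not adjacent
    }
}
```

So a document can satisfy the AND search and still fail the phrase search whenever the recorded tokens are not adjacent in exactly the order given in the query.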
Can anyone tell me if this is an obvious (side-)effect of not using the
standard analyzer? And is there a better solution to my problem than using
the very simple analyzer?
Best regards
Peter Vestergaard
PS: I use the same analyzer for both searching and indexing (of course).
