lucene-java-user mailing list archives

From "Jan Agermose" <>
Subject keyword indexing
Date Wed, 16 Jul 2003 17:03:51 GMT
I'm having some problems with characters in keywords that are not in a-z0-9...

If I have a keyword like "Det Naturvidenskabelige Fakultet" or a name like "Jan Agermose", then,
apart from having to lowercase the keyword because Lucene lowercases the query string,
I still cannot get any hits on it.

"Det Naturvidenskabelige Fakultet" - hits = 0
Det* - hits!
Det Naturvidenskabelige Fakultet - hits = 0

I can understand the last one, but shouldn't the first one return hits? If not, keywords
seem to be limited to values composed of [a-z0-9]+ ???

For now I do a string replace of [^a-z0-9]+ with "" (removing all such chars) on the keyword, but
I would think this gives the query parser some problems. It works in my special case, where the
user is not really free to compose queries on their own, so I can do the same string replace on
the input :-D But I would like the power user to be able to input real queries, and that leaves me
with the problem of parsing queries: I need to do the string replace only within double quotes...
This should be Lucene's problem, not mine :-D
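
To make the workaround concrete, here is a rough sketch of what "string replace only within
double quotes" could look like in plain Java. It uses no Lucene classes. The class and method
names (QuotedKeywordNormalizer, normalizeKeyword, normalizeQuoted) are made up for illustration,
and it assumes the query contains no escaped quotes and that dropping the quotes around the
rewritten phrase is acceptable, since the result is a single term.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Lucene.
class QuotedKeywordNormalizer {

    // Matches one double-quoted phrase; group 1 is the text between the quotes.
    // Assumption: the query never contains escaped quotes such as \".
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Lowercases the keyword and removes every char outside [a-z0-9].
    // Note that this also strips non-ASCII letters such as æ, ø and å.
    static String normalizeKeyword(String keyword) {
        return keyword.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]+", "");
    }

    // Rewrites only the quoted phrases of a query; text outside quotes is kept as typed.
    static String normalizeQuoted(String query) {
        Matcher m = QUOTED.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = normalizeKeyword(m.group(1));
            // quoteReplacement guards against '$' and '\' in the replacement text.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The same normalizeKeyword would be applied to the keyword value before indexing, so that the
indexed keyword and the rewritten quoted phrase end up as the same single term. With the sketch
above, normalizeQuoted("title:foo AND \"Det Naturvidenskabelige Fakultet\"") gives
title:foo AND detnaturvidenskabeligefakultet.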

Am I missing something?

Jan Agermose