lucene-java-user mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: Underscore character and case issue
Date Mon, 05 Jul 2004 16:34:53 GMT
Robert Brown wrote:
> I traverse a series of files under a parent directory (similar to the 
> demo sample) and store the filename in a Document Keyword field called 
> 'Filename'.  I am using the StandardAnalyzer for both building the index 
> and searching the index.

... and here lies your problem. StandardAnalyzer lowercases tokens and
breaks text apart at, or discards, most non-letter characters. I suggest
using Luke (http://www.getopt.org/luke) to look inside your index and see
the terms as they actually ended up there, and to try out some other
analyzers to see which is most appropriate in your case.
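
To make the mismatch concrete, here is a minimal plain-Java sketch. It does not use Lucene at all: roughAnalyze is a hypothetical stand-in that only lowercases and splits on non-alphanumeric characters, which approximates the behaviour described above but is not StandardAnalyzer's real tokenization. The filename is invented, and the sketch assumes it was stored verbatim as a single term (as an untokenized field would do) while the query text was run through the analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class AnalyzerMismatchSketch {

    // Hypothetical approximation of an analyzer: lowercase the input and
    // split on anything that is not a letter or digit. This is NOT Lucene's
    // StandardAnalyzer, only an illustration of the general effect.
    static List<String> roughAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Invented filename, assumed to be stored in the index verbatim.
        String storedTerm = "My_Report.TXT";

        // The same text after passing through the analyzer-like function.
        List<String> queryTerms = roughAnalyze(storedTerm);

        System.out.println("stored term : " + storedTerm);
        System.out.println("query terms : " + queryTerms);
        // None of the analyzed terms equals the stored term exactly.
        System.out.println("exact match : " + queryTerms.contains(storedTerm));
    }
}
```

If this is what is happening in your index, the usual ways out are to build the query for that field without analysis (for example, a TermQuery on the exact stored term) or to pick an analyzer for that field that leaves the text alone. Luke will show which terms are really in the index.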

-- 
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)

