lucene-java-user mailing list archives

From Robert Brown <rbr...@redbaritone.com>
Subject Underscore character and case issue
Date Mon, 05 Jul 2004 16:21:58 GMT
I traverse a series of files under a parent directory (similar to the 
demo sample) and store the filename in a Document Keyword field called 
'Filename'.  I am using the StandardAnalyzer for both building the index 
and searching the index.

I have two things I am trying to understand:

1. A search does not find files whose names contain capital letters.  If I 
search for a known file in the index (N3151.txt) with a search string of 
'N3151.txt' it is not found.  As a workaround, I am storing the filename 
in a different "Unstored" field and converting the filename to lowercase 
for the Filename field.  It behaves as if the search string is converted to 
lowercase but the indexed value is not.  Do I have to explicitly use 
.toLowerCase() on a field during indexing or am I building my index 
incorrectly?

2. A few of my filenames have underscores in them (e.g. readme_v32.txt) 
and I am having a hard time finding API documentation that relates to 
this character.  I cannot find a filename when typing the name exactly 
but am able to see it with a wildcard search string (e.g. readme*.txt or 
r*v32.txt).  What do I do to handle the underscore?  Is this a weight 
problem, something to do with the QueryParserConstants, or something 
entirely different?
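
To make the suspicion in item 1 concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class `KeywordCaseSketch` and its methods are hypothetical names made up for illustration. It assumes that the Keyword field value is indexed exactly as given while the analyzer lowercases the query text, and it shows why lowercasing the filename at index time would then make the lookup succeed.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model only: a set of exact terms stands in for the index.
// None of these names come from the Lucene API.
public class KeywordCaseSketch {
    private final Set<String> terms = new HashSet<>();

    // Assumption: a Keyword field is stored as one term, unchanged.
    public void addVerbatim(String filename) {
        terms.add(filename);
    }

    // The workaround from item 1: lowercase before indexing.
    public void addLowercased(String filename) {
        terms.add(filename.toLowerCase(Locale.ROOT));
    }

    // Assumption: the query side lowercases the search string,
    // then looks for an exact term match.
    public boolean search(String query) {
        return terms.contains(query.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        KeywordCaseSketch verbatim = new KeywordCaseSketch();
        verbatim.addVerbatim("N3151.txt");
        System.out.println(verbatim.search("N3151.txt"));   // false

        KeywordCaseSketch lowercased = new KeywordCaseSketch();
        lowercased.addLowercased("N3151.txt");
        System.out.println(lowercased.search("N3151.txt")); // true
    }
}
```

If that assumption holds, the alternatives would be to lowercase the value at index time (as above) or to build the query for that field without passing it through the analyzer.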

Thanks for your help out there!

				R

