lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Oskar Berger <oskar.ber...@agent25.se>
Subject Re: StandardAnalyzer question ...
Date Mon, 20 Feb 2006 16:21:00 GMT
Hello,

Not yet an expert in the field, but as I've understood the thing the
terms are indexed as you specify them (through the filters) but the
contents are stored depending on whether you want it or not
(Filed.UnStored(), which happens to be on its way to get deprecated).

So maybe you search the lower cased but indeed get the cased as the
result in this very CASE.

/oskar 

On Mon, 2006-02-20 at 09:05 -0700, Mufaddal Khumri wrote:
> Hi,
> 
> When StandardAnalyzer is used to index documents, arent the terms, 
> amongst other things, lower cased and stored that ways in the index?
> 
> I have a index field that I index like this:
> 
> ....
> ramWriter = new IndexWriter(ramDir, standardAnalyzer, true);
> ....
> ...
> ...
> doc.add(Field.Text("categoryNames", categoryNames));
> ...
> ...
> 
> (I periodically write contents from the ram directory to the file system 
> directory.)
> 
> When I search this field via luke using the standard analyzer I find 
> words like this:
> ....
> Digital Cameras
> Digital Camera Batteries
> ....
> 
> Shouldn't the words indexed look like:
> 
> ....
> digital cameras
> digital camera batteries
> ....
> 
> If I understand this right, when using standard analyzer, shouldn't the 
> terms be indexed  in lower case?
> 
> Thanks,
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message