lucene-java-user mailing list archives

From Jan Peter Stotz <jpst...@gmx.de>
Subject Re: Using a QueryParser with an untokenized field?
Date Fri, 01 Feb 2008 08:03:46 GMT
Hi Eleanor.

> In my Lucene index there's a field that contains the local names of XML 
> elements, one name per document.  Users can enter arbitrary queries for 
> this field, so I'm using a QueryParser.

> From reading around it looks as if the field needs to be tokenized, but 
> since the field's content is always a single term, is this really 
> necessary?  

You are right that your field effectively consists of a single token 
already, but as far as I know the main difference is that untokenized 
fields do not pass through your selected analyzer when they are added 
to the index. If your analyzer incorporates, for example, the 
LowerCaseFilter, a tokenized field will be converted to lower case 
before it is indexed. Using the same analyzer for your QueryParser will 
then allow you to perform case-insensitive queries.

If you add the field untokenized and your Analyzer (at query time) 
incorporates the LowerCaseFilter, you will be unable to find elements 
whose names contain uppercase characters: the query term is lowercased, 
but the indexed term is not, so the two never match.
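
To make the mismatch concrete, here is a rough sketch in plain Java. It 
is only a toy model and does not use the Lucene API: the class 
CaseMismatchSketch and its methods analyze, index and matches are 
hypothetical names made up for this illustration, with analyze standing 
in for an analyzer that contains a LowerCaseFilter.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model of the terms stored for a single field; NOT the Lucene API.
class CaseMismatchSketch {

    // Stand-in for an analyzer that includes a LowerCaseFilter.
    static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    // Index time: only tokenized fields pass through the analyzer;
    // untokenized fields are stored exactly as given.
    static Set<String> index(boolean tokenized, String... values) {
        Set<String> terms = new HashSet<>();
        for (String value : values) {
            terms.add(tokenized ? analyze(value) : value);
        }
        return terms;
    }

    // Query time: the user's input is assumed to always pass through the analyzer.
    static boolean matches(Set<String> terms, String userQuery) {
        return terms.contains(analyze(userQuery));
    }

    public static void main(String[] args) {
        Set<String> analyzed = index(true, "TableRow", "body");
        Set<String> untokenized = index(false, "TableRow", "body");

        System.out.println(matches(analyzed, "TableRow"));    // true
        System.out.println(matches(untokenized, "TableRow")); // false
        System.out.println(matches(untokenized, "body"));     // true
    }
}
```

The untokenized index still finds "body" because that name was 
lower case to begin with; only names containing uppercase characters 
become unreachable.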

Jan
