lucene-java-user mailing list archives

From Eleanor Joslin
Subject Re: Using a QueryParser with an untokenized field?
Date Fri, 01 Feb 2008 12:46:29 GMT
Thank you, this was exactly what I needed.  So "tokenizing" really 
denotes a more general process that can involve normalizing the case or 
whatever else can be done with a filter.  This is where I was confused.
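
For anyone finding this in the archive later, here is a rough sketch of the effect as I now understand it. It is plain Java, not the Lucene API: the class ToyFieldIndex, its methods, and the element name "chapterTitle" are all made up for illustration, and the "analyzer" is assumed to do nothing except lower-casing.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model of the indexed terms of a single field. NOT the Lucene API;
// every name here is hypothetical and exists only for illustration.
class ToyFieldIndex {
    private final Set<String> terms = new HashSet<>();

    // Stand-in for an analyzer whose only step is a lower-casing filter.
    static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    // "Tokenized" field: the value passes through the analyzer before indexing.
    void addAnalyzed(String value) {
        terms.add(analyze(value));
    }

    // "Untokenized" field: the value is indexed verbatim as a single term.
    void addVerbatim(String value) {
        terms.add(value);
    }

    // Query side: assume the parser runs the user's text through the same analyzer.
    boolean matchesParsedQuery(String userInput) {
        return terms.contains(analyze(userInput));
    }

    public static void main(String[] args) {
        ToyFieldIndex analyzed = new ToyFieldIndex();
        analyzed.addAnalyzed("chapterTitle");

        ToyFieldIndex verbatim = new ToyFieldIndex();
        verbatim.addVerbatim("chapterTitle");

        // Both sides are lower-cased, so the match is case-insensitive.
        System.out.println(analyzed.matchesParsedQuery("ChapterTitle")); // true
        // The index holds "chapterTitle" but the query becomes "chaptertitle".
        System.out.println(verbatim.matchesParsedQuery("chapterTitle")); // false
    }
}
```

The second lookup fails even though the user typed the name exactly as it was indexed, because only the query side was lower-cased.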


Jan Peter Stotz wrote:
> Hi Eleanor.
>> In my Lucene index there's a field that contains the local names of 
>> XML elements, one name per document.  Users can enter arbitrary 
>> queries for this field, so I'm using a QueryParser.
>> From reading around it looks as if the field needs to be tokenized, 
>> but since the field's content is always a single term, is this really 
>> necessary?  
> You are right that your field's content is already a single token, but
> from what I know the main difference is that untokenized fields do not
> pass through your selected analyzer when they are added to the index.
> If your analyzer, for example, incorporates the LowerCaseFilter, the
> field will be converted to lower case before it is indexed. Using the
> same analyzer for your QueryParser then allows you to perform
> case-insensitive queries.
> If you add the field untokenized and your Analyzer (at query time)
> incorporates the LowerCaseFilter, you will be unable to find elements
> that contain upper-case characters.
> Jan

Eleanor Joslin, Software Development   DecisionSoft Ltd.
Telephone: +44-1865-203192   
