lucene-java-user mailing list archives

From: Ulrich Mayring
Subject: Analyzers, Queries: three questions
Date: Wed, 11 Jun 2003 09:50:07 GMT

Hi folks,

I'm using the Snowball analyzer to index my documents. As an example I 
took the Tomcat documentation, which includes a document with the title 
"Workers HowTo". I put this string in a field called "title", within 
which I later do my query (of course again with the same SnowballAnalyzer).

At first I indexed the field as a Keyword (i.e. not tokenized), and 
Lucene could not find it when I searched for "Workers HowTo". I found out 
that tokenization apparently includes applying the Analyzer, so if I put 
my query through an Analyzer, the field being searched must be tokenized 
as well. Hence my first question:

1) How can I search untokenized fields? Do I have to pass my query 
through a "NullAnalyzer"?
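
The mismatch behind question 1 can be sketched in plain Java, without any 
Lucene classes. This is only an illustration under stated assumptions: the 
class KeywordMismatch and its methods are hypothetical, and analyze() is a 
crude stand-in for a stemming analyzer, not the SnowballAnalyzer itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; none of these names are Lucene API.
public class KeywordMismatch {

    // Crude stand-in for an analyzer: split on whitespace, lowercase,
    // and strip one trailing "s" to imitate stemming.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            String term = token.toLowerCase(Locale.ROOT);
            if (term.length() > 1 && term.endsWith("s")) {
                term = term.substring(0, term.length() - 1);
            }
            terms.add(term);
        }
        return terms;
    }

    // An untokenized field contributes its whole value as one single term.
    static List<String> indexAsKeyword(String value) {
        return Collections.singletonList(value);
    }

    // True if at least one query term equals one indexed term exactly.
    static boolean anyTermMatches(List<String> indexed, List<String> queryTerms) {
        for (String q : queryTerms) {
            if (indexed.contains(q)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> indexed = indexAsKeyword("Workers HowTo");
        // Analyzed query terms never equal the verbatim keyword term.
        System.out.println(anyTermMatches(indexed, analyze("Workers HowTo")));
        // The unanalyzed query string, used as one term, does.
        System.out.println(anyTermMatches(indexed, indexAsKeyword("Workers HowTo")));
    }
}
```

In this sketch the analyzed query yields the terms "worker" and "howto", 
neither of which equals the single indexed term "Workers HowTo", whereas 
the query string taken verbatim as one term does match.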

Next I made the title field a Text field, so that it is tokenized. Now 
Lucene finds the document, but with a low score of 0.27. Sure enough, 
browsing the index showed me that the value of the title field is stored 
"unanalyzed", i.e. as "Workers HowTo", exactly as retrieved from the 
document. The parsed query, on the other hand, is transformed to 
"(title:worker title:howto)". This of course does not give an exact 
match, which I guess explains the low score. Hence my next questions:

2) How can I pass the value of a field through an Analyzer before 
storing it?

3) How can I fine-tune my query, e.g. by saying that when searching 
within the contents fields I want to pass the query through an Analyzer, 
but when searching within the title field I do not? And I want a hit if 
either field matches.

Currently I'm using the MultiFieldQueryParser, but that only allows one 
Analyzer for all the fields.
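
What question 3 asks for can be sketched as follows, again in plain Java 
and without Lucene classes. Everything here is an assumption made for 
illustration: the class PerFieldQuerySketch, the two analysis functions 
and the map-based "document" are hypothetical and do not describe how 
MultiFieldQueryParser works.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; none of these names are Lucene API.
public class PerFieldQuerySketch {

    // Stand-in for an analyzing field: split on whitespace and lowercase.
    static List<String> lowercaseTokens(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            terms.add(token.toLowerCase(Locale.ROOT));
        }
        return terms;
    }

    // Stand-in for "no analyzer": the whole string is one term.
    static List<String> verbatim(String text) {
        return Collections.singletonList(text);
    }

    // Run the same query string through each field's own analysis function.
    static Map<String, List<String>> termsPerField(
            String query, Map<String, Function<String, List<String>>> analyzers) {
        Map<String, List<String>> result = new HashMap<>();
        for (Map.Entry<String, Function<String, List<String>>> e : analyzers.entrySet()) {
            result.put(e.getKey(), e.getValue().apply(query));
        }
        return result;
    }

    // A hit if any field has at least one query term among its indexed terms.
    static boolean isHit(Map<String, List<String>> indexedDoc,
                         Map<String, List<String>> queryTerms) {
        for (Map.Entry<String, List<String>> e : queryTerms.entrySet()) {
            List<String> indexed = indexedDoc.get(e.getKey());
            if (indexed == null) {
                continue;
            }
            for (String q : e.getValue()) {
                if (indexed.contains(q)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

The point of the sketch is the dispatch: one query string, a different 
treatment per field, and the per-field results combined with OR.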

Thank you very much in advance for any pointers,

