lucene-java-user mailing list archives

From "Karsten Konrad" <>
Subject AW: AW: Analyzers, Queries: three questions
Date Wed, 11 Jun 2003 12:00:36 GMT


Field contents indexed with Field.Text are stored
verbatim in the index - thus, you can get back the
original text when you access it using stringValue().

This has nothing to do with how the text is
indexed, i.e., how it is tokenized and added to
the index. You probably have a token "workers" and 
one "howto", both pointing to this text (that's why 
it is called an inverted index, the words point to
the text). Your analyzer does this tokenization
for you.
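
To make the difference between the stored value and
the indexed tokens concrete, here is a minimal toy
model in plain Java. It is not Lucene's code: the
class name ToyIndex and the lowercase/whitespace
tokenization are assumptions standing in for a real
analyzer. The stored text is kept verbatim, while the
inverted index maps each token to the documents that
contain it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only (hypothetical names, not Lucene's implementation).
class ToyIndex {
    // Stored values: document id -> original text, kept verbatim.
    private final List<String> stored = new ArrayList<>();
    // Inverted index: token -> ids of the documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Stand-in for an analyzer: lowercase, then split on whitespace.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Stores the text verbatim and indexes its tokens; returns the document id.
    int add(String text) {
        int id = stored.size();
        stored.add(text);
        for (String token : analyze(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    // Looks up a single token; no analysis is applied to the argument.
    Set<Integer> search(String token) {
        return postings.getOrDefault(token, new TreeSet<>());
    }

    // Returns the original, unanalyzed text of a document.
    String storedValue(int id) {
        return stored.get(id);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        int id = index.add("Workers HowTo");
        System.out.println(index.search("workers")); // ids of documents containing the token
        System.out.println(index.storedValue(id));   // the original text, untouched
    }
}
```

In this sketch the tokens "workers" and "howto" both
point to the document, yet reading the stored value
still returns the text exactly as it was added.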

If you search using the query parser, this only
works on fields that were tokenized by the
analyzer, e.g., those indexed with Field.Text or
Field.UnStored. If you store a text with
Field.Keyword, it is indexed as a single,
unanalyzed term, so you must construct a TermQuery
yourself and search with it. In your case the
term would be ("title", "Workers HowTo").
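
The following toy sketch in plain Java illustrates
why an analyzed query does not find a keyword value.
It is not Lucene's API: ToyKeywordDemo and its methods
are hypothetical, and the lowercase/whitespace
analyzer is an assumption. A keyword is modeled as one
unanalyzed term, while the query text is split into
tokens the way a query parser with an analyzer would
split it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Toy model only (hypothetical names, not Lucene's API).
class ToyKeywordDemo {
    // Stand-in for an analyzer: lowercase, then split on whitespace.
    static Set<String> analyze(String text) {
        return new LinkedHashSet<>(Arrays.asList(
                text.toLowerCase(Locale.ROOT).trim().split("\\s+")));
    }

    // A keyword value is indexed as one term, exactly as given.
    static Set<String> indexAsKeyword(String value) {
        return Collections.singleton(value);
    }

    // True if any query term occurs among the indexed terms.
    static boolean matches(Set<String> indexedTerms, Set<String> queryTerms) {
        for (String q : queryTerms) {
            if (indexedTerms.contains(q)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> keywordTerms = indexAsKeyword("Workers HowTo");
        // Analyzed query text, as a query parser would produce it.
        System.out.println(matches(keywordTerms, analyze("Workers HowTo")));
        // Exact, unanalyzed term, as a hand-built term query would use it.
        System.out.println(matches(keywordTerms, Collections.singleton("Workers HowTo")));
    }
}
```

Only the exact, unanalyzed term matches in this model,
which is the reason for building the term query by
hand instead of going through the query parser.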



-----Original Message-----
From: Ulrich Mayring []
Sent: Wednesday, 11 June 2003 13:36
Subject: Re: AW: Analyzers, Queries: three questions

Karsten Konrad wrote:
> 2) How can I pass the value of a field through an Analyzer before 
> storing it?
> A text field is automatically analyzed and tokenized by the given
> analyzer, you do not have to do it "manually".

Well, but if I browse my index I see all the terms stored in the 
original form. I use this code:

doc.add(Field.Text("title", "Workers HowTo"));
// Build and execute Query, so that only the above document is found
Document d = hits.doc(0);
Field field = d.getField("title");
System.out.println(field.name() + "," + field.stringValue());

This outputs "title,Workers HowTo" - the untokenized, unanalyzed form.

So, what's wrong here?



To unsubscribe, e-mail:
For additional commands, e-mail:
