lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Did you mean...
Date Mon, 16 Feb 2004 19:56:56 GMT
On Feb 16, 2004, at 9:50 AM, lucene@nitwit.de wrote:
> Can somebody explain tokenStream() to me?

You are now venturing under the covers of Lucene's API.  This is where 
I give the sage advice to get the Lucene source code and surf around it 
a bit.  (It helps to have a nice IDE where you can click around classes 
and see the object hierarchy easily)

TokenStream is used by the Analyzer to split text into terms.

> TokenStream in = new WhitespaceAnalyzer().tokenStream("contents", new
> StringReader(doc.getField("contents").stringValue()));
>
> But what is the first argument (field) for tokenStream() good for? 
> Actually I
> can type whatever I want...? Don't understand the short description in 
> the
> API docs...

The field is the field name.  No built-in analyzers use it, but custom 
analyzers could key off of it to do field-specific analysis.  Look at 
the PerFieldAnalyzerWrapper to make per-field analysis easier than 
writing a custom one that keys off the field name.

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message