lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: How do TeeTokenizer and SinkTokenizer work?
Date Sat, 23 Aug 2008 08:45:32 GMT
Hi Kurosaka-san,

I'd written an article on my blog several month ago about SinkTokenizer
and TeeTokenFilter.

See:
http://lucene.jugem.jp/?eid=172

Sorry, but all written in Japanese...

Koji


Teruhiko Kurosaka wrote:
> Hello,
> I'm interested in knowing how these tokenizers work together.
> The API doc for TeeTokenizer
> http://lucene.apache.org/java/2_3_1/api/org/apache/lucene/analysis/TeeTokenFilter.html
>
> has this sample code:
> SinkTokenizer sink1 = new SinkTokenizer(null);
> SinkTokenizer sink2 = new SinkTokenizer(null);
>
> TokenStream source1 = new TeeTokenFilter(new TeeTokenFilter(new WhitespaceTokenizer(reader1),
sink1), sink2);
> TokenStream source2 = new TeeTokenFilter(new TeeTokenFilter(new WhitespaceTokenizer(reader2),
sink1), sink2);
>
> TokenStream final3 = new EntityDetect(sink1);
> TokenStream final4 = new URLDetect(sink2);
>
> with an explanation that reads "sink1 and sink2 will both get tokens from both reader1
and reader2 after whitespace tokenizer",
> but I don't understand how the input from reader1 and reader2 are mixed together.
> Will sink1 first reaturn the reader1 text, and reader2?
> Or are they mixed randomly?
>
> -Kuro
>  
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>   


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message