lucene-java-user mailing list archives

From "John Paul Sondag" <jsond...@uiuc.edu>
Subject Tokenizer
Date Mon, 30 Jul 2007 16:05:15 GMT
I have two questions.

First, is there a tokenizer that simply makes a token out of every word?
That is, one that takes the characters between two whitespace boundaries
and turns them into a single token?

If such a tokenizer exists, is there a difference between using it and
simply storing the field in the document with Field.Index = UN_TOKENIZED?
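For reference, the split-on-whitespace behavior described above is what
Lucene's WhitespaceTokenizer implements, and it differs from UN_TOKENIZED:
a whitespace tokenizer emits one term per word, while UN_TOKENIZED indexes
the entire field value as a single term. A minimal standalone sketch of
that distinction (plain String.split stand-in, not the actual Lucene
classes):

```java
import java.util.Arrays;
import java.util.List;

public class WhitespaceDemo {

    // Stand-in for a whitespace tokenizer: split the field value on runs
    // of whitespace, so every word becomes its own token.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Stand-in for UN_TOKENIZED: the whole field value is one term.
    static List<String> untokenized(String text) {
        return Arrays.asList(text);
    }

    public static void main(String[] args) {
        String field = "quick brown fox";
        // Tokenized: three separate terms, each individually searchable.
        System.out.println(tokenize(field));     // [quick, brown, fox]
        // UN_TOKENIZED: one term; only an exact match on the full
        // string "quick brown fox" will find it.
        System.out.println(untokenized(field));  // [quick brown fox]
    }
}
```

So the practical difference is searchability: with the tokenizer, a query
for "brown" matches; with UN_TOKENIZED, only the exact whole value does.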

--JP
