lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <>
Subject Re: search by field, not field value
Date Mon, 13 Nov 2006 12:46:00 GMT
Glad it helped. About multiple hits, I assume you're storing values in
specific_field as UN_TOKENIZED then? or it's some unique ID? The hidden
thing here is that if your analyzer may break up your token stream for you.
It depends upon the stream and the analyzer you use. For instance, an e-mail
address (or file path or...) would be tokenized as multiple tokens (or not),
depending upon the analyzer. WhitespaceAnalyzer is probably the most
intuitive from this perspective, but UN_TOKENIZED overrides the analyzer and
doesn't break up the input stream at index time at least.

apologies if this is old hat to you,,

And if it's not, I'll recommend Luke for examining the effects of various
analyzers on input streams when putting them in your index (google lucene
and luke)....


On 11/12/06, tony yin <> wrote:
> What a valuable answer. thanks.
> don't worry about multiple hits. I would never store multiple values in
> specific_field.
> On 11/13/06, Erick Erickson <> wrote:
> ...
> --
> Kindly Regards
> Tony
> ===============================

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message