lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kauler, Leto S" <leto.kau...@education.tas.gov.au>
Subject Tokenised and non-tokenised terms in one field
Date Thu, 10 Feb 2005 01:07:39 GMT
Hi all,

Seeking some "best practice" advice, or even if there is an alternative
solution.  Sorry for the email length, just trying to explain
succinctly.

Currently we add fields to our index like this (for reference, Field
booleans are STORE, INDEX, TOKENISE):

doc.add(new Field(field, value, true, true, false));
doc.add(new Field(field, value, false, true, true));

This creates two fields in a document with same name. One is stored but
not tokenised, the other which is not stored but tokenised, and both are
indexed for searchability.  The non-tokenised term is so we can do
exact-match searches.

In my mind, the terms of a title field might look like:

title
 - A Guide to Lucene (PDF)  [stored flag?]
title
 - guide
 - lucene
 - pdf

Can these be merged together in some way, and would it even make sense
to do so?  I am thinking in terms of creating a more lightweight index.

Thanks, --Leto

CONFIDENTIALITY NOTICE AND DISCLAIMER

Information in this transmission is intended only for the person(s) to whom it is addressed
and may contain privileged and/or confidential information. If you are not the intended recipient,
any disclosure, copying or dissemination of the information is unauthorised and you should
delete/destroy all copies and notify the sender. No liability is accepted for any unauthorised
use of the information contained in this transmission.

This disclaimer has been automatically added.

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message