lucene-dev mailing list archives

From Peter Carlson <carl...@bookandhammer.com>
Subject Re: Adding a TermExpansionQuery
Date Wed, 15 May 2002 14:06:57 GMT
Hi Eric,

Thanks for the feedback. My intention was to abstract the source, but one of
my questions was: does Lucene have a configuration file that would select this
"Thesaurus" query, or will that have to be set up manually by the developer?

Currently, Lucene does not provide a configuration file.

As for whether the information belongs in the index directory: I was thinking
this might be a nice place for it, since it adds no other overhead to the
system (i.e. no configuration file) and might make it easier to support
multiple sources, since the index has already been abstracted. If you wanted
to share the "Thesaurus" across many different indices, you could "copy" or
"merge" that index component into the data source. This could even be part
of the build process for a file system.
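
A rough sketch of what "abstracting the source" could look like, so that the
lookup can live in an index component, an XML file, or an RDBMS behind the
same interface. None of these names (Thesaurus, MapThesaurus, TermExpander)
exist in Lucene; they are hypothetical and only illustrate the idea, with an
in-memory map standing in for whatever storage is finally chosen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical lookup abstraction; not part of Lucene. */
interface Thesaurus {
    /** Returns the related terms for a term, or an empty set if none are known. */
    Set<String> related(String term);
}

/** In-memory stand-in; an XML-, RDBMS-, or index-backed store could implement the same interface. */
final class MapThesaurus implements Thesaurus {
    private final Map<String, Set<String>> entries = new HashMap<>();

    /** Registers one-to-many relations for a term, keeping insertion order. */
    void add(String term, String... synonyms) {
        entries.computeIfAbsent(term, k -> new LinkedHashSet<>())
               .addAll(Arrays.asList(synonyms));
    }

    @Override
    public Set<String> related(String term) {
        return entries.getOrDefault(term, Collections.emptySet());
    }
}

/** Expands a single term into the list of terms a query would search for. */
final class TermExpander {
    private final Thesaurus thesaurus;

    TermExpander(Thesaurus thesaurus) {
        this.thesaurus = thesaurus;
    }

    /** Original term first, then its related terms, without duplicates. */
    List<String> expand(String term) {
        Set<String> terms = new LinkedHashSet<>();
        terms.add(term);
        terms.addAll(thesaurus.related(term));
        return new ArrayList<>(terms);
    }
}
```

Because the expander only depends on the interface, one Thesaurus instance
could be handed to the query code for several indices, whether the data was
copied into each index or kept in a shared external source.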

--Peter

On 5/15/02 6:45 AM, "Eric D. Friedman" <eric@conveysoftware.com> wrote:

> Whichever storage mechanism you choose, you should be sure to abstract its
> interface so that people can make other choices.  With that out of the way,
> it doesn't matter too much whether you pick a properties file or an XML
> file.
> 
> That said, I wouldn't expect to find this data stored in the index
> directory, since it's not part of the index and since users may want to
> share the data across several indices.  I would also lean toward the
> XML file (for a file solution, that is -- an RDBMS should be supported
> too), since that lends itself more naturally to describing one-to-many
> relations than a properties file does.
> 
> Personal opinion: "Thesaurus" is a more descriptive term than
> "TermExpansion." To me, term expansion suggests some kind of text
> globbing, whereas a thesaurus is a reference (a "lookup table") that
> provides *semantic* expansions of the kind you describe.  Oracle's
> intermedia indexing engine has thesaurus features similar to what you
> describe and calls them by that name.



