lucene-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Adding a TermExpansionQuery
Date Wed, 15 May 2002 20:58:41 GMT
This sounds like something I could use :)
I'd say keep it out of the index, for the various reasons that a few
people have already mentioned. I also think "Thesaurus" is an easier word
for non-tech, non-IR people to understand.

Otis

--- Peter Carlson <carlson@bookandhammer.com> wrote:
> Hi Eric,
> 
> Thanks for the feedback. My intention was to abstract the source, but
> one of my questions was: would a Lucene configuration file set up this
> "Thesaurus" query, or would the developer have to set it up manually?
> 
> Currently, Lucene does not provide a configuration file.
> 
> As for whether the information belongs in the index directory: I was
> thinking this might be a nice place for it, since it then adds no
> other overhead to the system (i.e., no configuration file) and might
> make it easier to support multiple sources, since the index has
> already been abstracted. If you wanted to share the "Thesaurus" across
> many different indices, you could "copy" or "merge" that index
> component into the data source. This could even be part of the build
> process for a file system.
> 
> --Peter
> 
> On 5/15/02 6:45 AM, "Eric D. Friedman" <eric@conveysoftware.com> wrote:
> 
> > Whichever storage mechanism you choose, you should be sure to
> > abstract its interface so that people can make other choices.  With
> > that out of the way, it doesn't matter too much whether you pick a
> > properties file or an XML file.
> > 
> > That said, I wouldn't expect to find this data stored in the index
> > directory, since it's not part of the index and since users may want
> > to share the data across several indices.  I would also lean toward
> > the XML file (for a file solution, that is -- an RDBMS should be
> > supported too), since that lends itself more naturally to describing
> > one-to-many relations than a properties file does.
> > 
> > Personal opinion: "Thesaurus" is a more descriptive term than
> > "TermExpansion." To me, term expansion suggests some kind of text
> > globbing, whereas a thesaurus is a reference (a "lookup table") that
> > provides *semantic* expansions of the kind you describe.  Oracle's
> > intermedia indexing engine has thesaurus features similar to what you
> > describe and calls them by that name.
> 
> 
> --
> To unsubscribe, e-mail:  
> <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail:
> <mailto:lucene-dev-help@jakarta.apache.org>
> 
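
For readers following the thread, here is a minimal sketch of the
abstraction Eric suggests above: a lookup interface that hides the storage
mechanism, plus one simple backend. The names Thesaurus, MapThesaurus, and
TermExpander are hypothetical and are not part of Lucene; the in-memory map
only stands in for whatever XML file, properties file, or RDBMS a real
implementation might read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lookup interface; callers never see where the data lives. */
interface Thesaurus {
    /** Returns the related terms for a term, or an empty list if none are known. */
    List<String> synonymsFor(String term);
}

/**
 * In-memory backend (an assumption for illustration). An XML-, properties-,
 * or RDBMS-backed class could implement the same interface.
 */
final class MapThesaurus implements Thesaurus {
    private final Map<String, List<String>> entries = new HashMap<>();

    /** Registers a one-to-many relation: one term, many related terms. */
    void add(String term, String... related) {
        entries.computeIfAbsent(term, k -> new ArrayList<>())
               .addAll(Arrays.asList(related));
    }

    @Override
    public List<String> synonymsFor(String term) {
        List<String> found = entries.get(term);
        return found == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(found);
    }
}

/** Expands a single query term using whichever Thesaurus it is given. */
final class TermExpander {
    private TermExpander() {}

    /** Returns the original term followed by its related terms, without duplicates. */
    static List<String> expand(Thesaurus thesaurus, String term) {
        List<String> expanded = new ArrayList<>();
        expanded.add(term);
        for (String related : thesaurus.synonymsFor(term)) {
            if (!expanded.contains(related)) {
                expanded.add(related);
            }
        }
        return expanded;
    }
}
```

Because the expander depends only on the interface, the same thesaurus
could be shared across several indices, and each expanded term could then
become an optional clause of the search query.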



