lucene-solr-user mailing list archives

From "Peter A. Kirk" ...@alpha-solutions.dk>
Subject RE: Synonyms from Database
Date Mon, 11 Jan 2010 09:51:19 GMT
Hi - I don't think you'll see a "performance hit" using a DB for your synonym configuration
as opposed to a text file. 

The configuration is only read once (at startup), or again when you "reload". You won't be reloading
every minute, will you? After the configuration has been read, the synonyms are available to Solr
via the SynonymFilter object (at least as I understand it from looking at the code).

The reload feature actually sounds quite neat - it reloads "in the background" and "switches
in" the newly read configuration when it's ready - so hopefully there is no downtime while the configuration loads.
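
The "load in the background, then switch in" idea can be sketched as below. This is only an illustration of the pattern in Python, not Solr's code: the class SynonymHolder, its loader callable and the dict-based synonym table are all hypothetical names made up for this example.

```python
import threading


class SynonymHolder:
    """Holds the active synonym table and lets a reload replace it in one step."""

    def __init__(self, loader):
        # loader is a hypothetical callable that returns a dict of
        # token -> list of replacement tokens (from a file, a DB, anything).
        self._loader = loader
        self._active = loader()  # read once at startup

    def lookup(self, token):
        # Lookups only ever see a fully built table, old or new.
        return self._active.get(token, [token])

    def reload_in_background(self):
        def work():
            fresh = self._loader()  # the slow read happens off the lookup path
            self._active = fresh    # one reference assignment is the "switch in"

        thread = threading.Thread(target=work)
        thread.start()
        return thread


# Usage: lookups keep using the old table until the reload has finished.
source = {"tv": ["tv", "television"]}
holder = SynonymHolder(lambda: dict(source))
source["math"] = ["math", "maths"]
holder.reload_in_background().join()
print(holder.lookup("math"))
```

The point of the design is that the cost of reading the synonyms, whether from a file or a DB, is paid only during a reload and never per lookup.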

Med venlig hilsen / Best regards

Peter Kirk
E-mail: pk@alpha-solutions.dk


-----Original Message-----
From: Ravi Gidwani [mailto:ravi.gidwani@gmail.com] 
Sent: 11 January 2010 22:43
To: solr-user@lucene.apache.org; noble.paul@gmail.com
Subject: Re: Synonyms from Database

Thanks all for your replies.

What I meant by query time is this: as I understand Solr (and I may be
wrong here), I can add synonyms.txt to the query analyzer as follows:

      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        ....
      </analyzer>

My understanding is that even if the document (at index time) contains the
word "mathematics", with a synonyms.txt entry such as:

math,maths=>mathematics

a query for "math" will match "mathematics", because the query analyzer
rewrites "math" using synonyms.txt. I was curious whether the database
approach could work along similar lines.
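
A rough model of how synonym rules rewrite query tokens may help here. It is a simplified sketch, not Solr's implementation: it handles single-word terms only, and the helper names parse_rules and rewrite are made up for this example. An explicit mapping only rewrites the terms on the left of "=>", so the direction of the rule decides whether a query for "math" can reach an indexed "mathematics".

```python
def parse_rules(lines):
    """Build a token -> replacement-list table from synonyms.txt-style lines."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if "=>" in line:
            # Explicit mapping: terms on the left are replaced by those on the right.
            left, right = line.split("=>", 1)
            sources = [t.strip() for t in left.split(",") if t.strip()]
            targets = [t.strip() for t in right.split(",") if t.strip()]
        else:
            # Comma list, modelled as expand="true": each term maps to the whole group.
            sources = targets = [t.strip() for t in line.split(",") if t.strip()]
        for s in sources:
            bucket = table.setdefault(s.lower(), [])  # models ignoreCase="true"
            bucket.extend(t for t in targets if t not in bucket)
    return table


def rewrite(tokens, table):
    """Replace each query token by its mapped terms; unmapped tokens pass through."""
    out = []
    for tok in tokens:
        out.extend(table.get(tok.lower(), [tok]))
    return out


# "mathematics=>math,maths" rewrites only "mathematics", so "math" is untouched:
print(rewrite(["math"], parse_rules(["mathematics=>math,maths"])))  # ['math']
# The reverse rule rewrites "math" into the indexed form:
print(rewrite(["math"], parse_rules(["math,maths=>mathematics"])))  # ['mathematics']
# An equivalence list expands to every term in the group:
print(rewrite(["math"], parse_rules(["math,maths,mathematics"])))   # ['math', 'maths', 'mathematics']
```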

I get the point about performance, and I agree that is a big no-no for this
approach. But the idea was to allow changing the synonyms on the fly (more
like adaptive synonyms) to improve the hits.

I guess the only way is to rewrite the file (as Otis suggested) and reload
the configuration (as Peter suggested). Rewriting the file and reloading
might be a performance hit, but I guess it is still much better than reading
from the DB at query time?
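
The "dump the DB into the text file" step could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the sqlite3 database, the synonyms(source, target) table layout and the function name dump_synonyms are hypothetical, and triggering the Solr reload afterwards is left out.

```python
import os
import sqlite3
import tempfile


def dump_synonyms(conn, path):
    """Write one 'source=>target1,target2' line per source term to path."""
    # Assumed schema: a table synonyms(source TEXT, target TEXT).
    rows = conn.execute(
        "SELECT source, target FROM synonyms ORDER BY source, target"
    ).fetchall()

    grouped = {}
    for source, target in rows:
        grouped.setdefault(source, []).append(target)

    # Write to a temporary file in the same directory, then rename it over
    # the old file, so a reader never sees a half-written synonyms.txt.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as out:
        for source, targets in grouped.items():
            out.write(f"{source}=>{','.join(targets)}\n")
    os.replace(tmp, path)
    return len(grouped)


# Usage with an in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE synonyms (source TEXT, target TEXT)")
conn.executemany(
    "INSERT INTO synonyms VALUES (?, ?)",
    [("math", "mathematics"), ("maths", "mathematics")],
)
out_path = os.path.join(tempfile.mkdtemp(), "synonyms.txt")
dump_synonyms(conn, out_path)
```

Run from a scheduled job, this keeps the DB as the place where synonyms are edited while Solr still reads a plain file, and the DB is only queried once per dump.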

Thanks again for your comments.

~Ravi.


2010/1/10 Noble Paul നോബിള്‍ नोब्ळ् <noble.paul@corp.aol.com>

> On Sun, Jan 10, 2010 at 1:04 PM, Otis Gospodnetic
> <otis_gospodnetic@yahoo.com> wrote:
> > Ravi,
> >
> > I think if your synonyms were in a DB, it would be trivial to
> > periodically dump them into a text file Solr expects.  You wouldn't want to
> > hit the DB to look up synonyms at query time...
> Why query time. Can it not be done at startup time ?
> >
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
> >
> >
> >
> > ----- Original Message ----
> >> From: Ravi Gidwani <ravi.gidwani@gmail.com>
> >> To: solr-user@lucene.apache.org
> >> Sent: Sat, January 9, 2010 10:20:18 PM
> >> Subject: Synonyms from Database
> >>
> >> Hi :
> >>      Is there any work done in providing synonyms from a database instead of
> >> synonyms.txt file ? Idea is to have a dictionary in DB that can be enhanced
> >> on the fly in the application. This can then be used at query time to check
> >> for synonyms.
> >>
> >> I know I am not putting thoughts to the performance implications of this
> >> approach, but will love to hear about others thoughts.
> >>
> >> ~Ravi.
> >
> >
>
>
>
> --
> -----------------------------------------------------
> Noble Paul | Systems Architect| AOL | http://aol.com
>
