lucene-java-user mailing list archives

From Doron Cohen
Subject Re: Integrate Lucene search facilities with existing databases
Date Thu, 24 May 2007 06:18:23 GMT

Huajing Li wrote:

> I am working on an application that must deal with ranking on highly
> dynamic metadata. For example, suppose I want to provide ranking based on the
> number of downloads of hit documents. A user may log in to the system and send a
> query, which will be answered by Lucene in the traditional way. The
> documents will be scored and ranked based on the default Lucene scoring
> functions. In addition, the system wants to offer users a
> "popularity" ranking facility. The number of downloads for a document may
> continue to increase, even during a query. It would incur much overhead if we
> put the "popularity" as a field in the Lucene index (delete and re-insert the
> document whenever an update happens). Instead, we chose to store such
> information in a database, with document identifiers linking database
> records back to the index.
> This setting, however, creates a ranking problem. It is not efficient to
> send each hit document identifier to the database as a SQL query to retrieve
> the download popularity information. It would be good to have the ability
> to link database records directly with Lucene indices, so that a query can
> retrieve the corresponding records from the database at query time. We
> would be very interested to know whether there are open source toolkits or
> libraries that do the dirty work. Of course, we would also like to hear about
> alternative solutions that meet our needs.

The problem seems to map to "function query" - LUCENE-446.
With that patch, you could create a ValueSourceQuery based on these
values and then combine it into the score of any other query using
CustomScoreQuery. You can override the customScore() method to combine the
two scores as fits your needs.
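
To make the idea of combining the two scores concrete, here is a rough
plain-Java sketch. It does not use the LUCENE-446 classes at all: the Hit
class, the rerank() helper, and the log-damped formula are all made up for
illustration, and the static customScore() here only plays the role that an
overridden customScore() would play in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration only; none of these types are Lucene classes.
final class PopularityScoring {

    /** A search hit: internal docid plus its text-relevance score. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /**
     * Stand-in for an overridden customScore(): dampen the raw download
     * count with log10 so popular documents do not swamp text relevance.
     * The formula is an assumption, pick whatever fits your needs.
     */
    static float customScore(float subQueryScore, float downloads) {
        return subQueryScore * (1.0f + (float) Math.log10(1.0 + downloads));
    }

    /** Re-scores each hit with downloadsByDoc[doc] and sorts best-first. */
    static List<Hit> rerank(List<Hit> hits, int[] downloadsByDoc) {
        List<Hit> out = new ArrayList<>();
        for (Hit h : hits) {
            out.add(new Hit(h.doc, customScore(h.score, downloadsByDoc[h.doc])));
        }
        out.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return out;
    }

    public static void main(String[] args) {
        int[] downloads = {0, 999, 9};   // indexed by docid (hypothetical data)
        List<Hit> hits = Arrays.asList(
                new Hit(0, 2.0f), new Hit(1, 1.0f), new Hit(2, 1.5f));
        for (Hit h : rerank(hits, downloads)) {
            System.out.println("doc=" + h.doc + " score=" + h.score);
        }
    }
}
```

With this made-up data, doc 1 moves ahead of doc 0 even though its text
score is lower, because its download count is much higher.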

There is one catch though - that approach assumes that values obtained from
the ValueSource match Lucene's internal docids. I think this aspect is likely
to be an issue in any solution that combines external values into the scoring
process. If your system does not delete documents (or if you are using the
patch from LUCENE-879, which I haven't tried yet), and you maintain the order
by which docs are added, this may work for you.
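
As a sketch of what "maintain the order" could look like: if you keep the
list of external identifiers in the order the documents were added, and
nothing is ever deleted, then position i in that list may be used as the
docid when laying out the database values. The class and method names below
are hypothetical, and the in-memory map merely stands in for one bulk read
from your database.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; not part of Lucene or of either patch.
final class DocidAlignment {

    /**
     * idsInAddOrder.get(i) is the external id of the i-th document added.
     * Assumption: no deletions, so position i is also the internal docid.
     * Ids missing from the map get a download count of 0.
     */
    static int[] alignByAddOrder(List<String> idsInAddOrder,
                                 Map<String, Integer> downloadsById) {
        int[] byDoc = new int[idsInAddOrder.size()];
        for (int doc = 0; doc < byDoc.length; doc++) {
            byDoc[doc] = downloadsById.getOrDefault(idsInAddOrder.get(doc), 0);
        }
        return byDoc;
    }

    public static void main(String[] args) {
        // Stand-in for one bulk "SELECT id, downloads FROM ..." result.
        Map<String, Integer> fromDb = new HashMap<>();
        fromDb.put("paper-a", 5);
        fromDb.put("paper-c", 7);

        List<String> addOrder = Arrays.asList("paper-c", "paper-b", "paper-a");
        System.out.println(Arrays.toString(alignByAddOrder(addOrder, fromDb)));
    }
}
```

The point of building the array once is that the database is read in bulk
rather than once per hit; the array would need rebuilding whenever the
download counts should be refreshed.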

