lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Raghu Ram" <raghuram.nadimi...@gmail.com>
Subject Re: Multiple Documents sharing a common boost
Date Tue, 21 Aug 2007 15:15:19 GMT
do you mean to say that we generate a compound query by AND ing the original
query with a query like

( (cluster_id=0)^boost_cluster0 OR (cluster_id=1)^boost_cluster1...) )

But is this not inefficient considering that the number of clusters is in
hundreds ??????





On 8/21/07, Erick Erickson <erickerickson@gmail.com> wrote:
>
> One solution is to keep meta-data in your index. Remember that
> documents do not all have to have the same field. So you could
> index a document with a single field
> "metadatanotafieldinanyotherdoc" that contains, say, a list of
> all of your clusters and their boosts. Read this document in at
> startup time and cache it away in your server. Thereafter, you have
> a set of boosts that can be applied at query time.
>
> Of course this useless if you wanted to boost at index time.
> But I know of no way to change the boost of a document
> without deleting and readding it with the new boost.
>
> Best
> Erick
>
> On 8/21/07, Raghu Ram <raghuram.nadiminti@gmail.com> wrote:
> >
> > Is it possible to have multiple documents share a common boost?
> >
> > An example scenario is as follows. The set of documents are clustered
> into
> > some set of clusters. Each cluster has a unique clusterId. So each
> > document
> > has a cluster Id field that associates each document with its cluster.
> > Each
> > cluster has a property called cluster score. Each document has to be
> > boosted
> > by its cluster score. The number of clusters is very small in comparison
> > to
> > the number of documents (around 100 clusters).The cluster score is
> updated
> > on a continual basis. So the cluster score cant be stored as the
> document
> > boost for each individual document as we end up updating all the
> documents
> > boost daily which seems infeasible. We are trying to find out a solution
> > that is more efficient.
> >
> > Thank you.
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message