lucene-solr-user mailing list archives

From Upayavira ...@odoko.co.uk>
Subject Re: The best way to exclude "seen" results from search queries
Date Fri, 12 Jun 2015 06:44:36 GMT
It is the number of recommendations for a single user that matters. The
more there are, the worse the performance. Trying it and seeing is the
best way to find out, though.

I personally would have one doc per recommendation. It will reduce the
amount of churn in your index as updating a multivalued field will
involve deleting the entire document that preceded it, which will then
need merging, etc. One doc per recommendation effectively makes your
index write-only, which is much cleaner.
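
A rough sketch of what that could look like, in Python. The field names
(user_id, rec_id, id) and the collection name user_recs are made up for
illustration and are not from this thread; the filter uses Solr's join
query parser wrapped in the _query_ nested-query syntax so it can be
negated. Check the exact syntax against your Solr version.

```python
from urllib.parse import urlencode


def recommendation_doc(user_id, rec_id):
    """One small document per (user, recommendation) pair; added once, never updated."""
    # Hypothetical schema: id, user_id, rec_id.
    return {"id": f"{user_id}_{rec_id}", "user_id": user_id, "rec_id": rec_id}


def unseen_filter(user_id, from_index="user_recs"):
    """Negated join: exclude main-index docs whose id equals a rec_id seen by this user."""
    join = f"{{!join from=rec_id to=id fromIndex={from_index}}}user_id:{user_id}"
    return f'-_query_:"{join}"'


def search_params(query, user_id):
    """URL-encoded parameters for a select request with the 'seen' filter applied."""
    return urlencode({"q": query, "fq": unseen_filter(user_id), "wt": "json"})
```

Marking something as seen is then a plain add of recommendation_doc(...),
with no existing document rewritten.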

Regarding sharding, you can shard your original index, but a replica of
your user recommendations collection must exist alongside every shard
and replica of that original index. The recommendations collection
itself cannot be sharded.
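
As a sketch only, assuming SolrCloud and the Collections API: the
recommendations collection would be created with a single shard and
enough replicas to cover every node that hosts the main index. The
collection name, base URL and node count below are placeholders, and
these parameters alone do not guarantee where the replicas land, so
verify placement afterwards.

```python
from urllib.parse import urlencode


def create_recs_collection_url(base_url, name, node_count):
    """Build a Collections API CREATE request: one shard, one replica per node."""
    params = {
        "action": "CREATE",
        "name": name,                     # placeholder collection name
        "numShards": 1,                   # the join "from" side stays unsharded
        "replicationFactor": node_count,  # assumes one replica per main-index node
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


url = create_recs_collection_url("http://localhost:8983/solr", "user_recs", 4)
```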

HTH

Upayavira

On Thu, Jun 11, 2015, at 06:06 PM, Reitzel, Charles wrote:
> So long as the fields are indexed, I think performance should be ok.
> 
> Personally, I would also look at using a single document per user with a
> multi-valued field for recommendation ID.   Assuming only a small
> fraction of all recommendation IDs are ever presented to any single user,
> this schema would be physically much smaller and require only a single
> document per user.
> 
> I don't know the answer to your sharding question.   The join query is
> available out of the box, so it should be quick work to set up a
> two-shard sample and test the distributed sub-query.
> 
> That said, with the scales you are talking about, I question if sharding
> is necessary.   You can still use replication for load balancing without
> sharding.
> 
> -----Original Message-----
> From: amid [mailto:amid@donanza.com] 
> Sent: Thursday, June 11, 2015 12:36 PM
> To: solr-user@lucene.apache.org
> Subject: RE: The best way to exclude "seen" results from search queries
> 
> Thanks a lot, Charles,
> 
> This seems to be what I'm looking for.
> Do you know if a join over this number of documents and users will
> still give good query performance? Also, are there any limitations on
> the Solr architecture when using the "join" method (e.g. sharding)?
> 
> Many thanks,
> Ami
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/The-best-way-to-exclude-seen-results-from-search-queries-tp4211022p4211223.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
