lucene-solr-user mailing list archives

From Doss <itsmed...@gmail.com>
Subject Re: Suggestion Needed: Exclude documents that are already served / viewed by a customer
Date Fri, 06 Sep 2019 08:44:10 GMT
Jörn, thanks for the input, I learned something new today!
The component described at
https://cwiki.apache.org/confluence/display/solr/BloomIndexComponent
works at the segment level, but our requirement is at the document level.

Thanks,
Mohandoss.
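To make the document-level idea concrete, here is a minimal Python sketch of a Bloom filter kept per searching customer, holding the profile IDs that customer has already viewed. It is only an illustration under assumptions: the class `BloomFilter`, the helper `unseen_profiles`, the sizes `num_bits` and `num_hashes`, and the idea of filtering the result list on the client side are all hypothetical and not something Solr provides out of the box.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=8192, num_hashes=5):
        # Hypothetical sizes; tune them to the expected number of viewed profiles.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive num_hashes bit positions from one SHA-256 digest.
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        # Set every bit position derived for this item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely never added"; True means "probably added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def unseen_profiles(candidate_ids, viewed_filter):
    """Keep only the profile IDs the filter is certain were never viewed."""
    return [pid for pid in candidate_ids if not viewed_filter.might_contain(pid)]
```

With one such filter per customer, a profile is shown only when `might_contain` returns False, which is the direction in which a Bloom filter is exact. The cost is that a false positive occasionally hides a profile the customer never viewed, and a plain Bloom filter does not support removing an ID once it has been added.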

On Fri, Sep 6, 2019 at 11:41 AM Jörn Franke <jornfranke@gmail.com> wrote:

> I am not 100% sure whether Solr has something out of the box, but you could
> implement a Bloom filter (https://en.wikipedia.org/wiki/Bloom_filter) and
> store it in Solr. It is a probabilistic data structure whose size stays
> fixed as entries are added, and it can cover your use case.
> However, it has a caveat: in your case it can only say for sure that
> person A has NOT visited person B. If it reports that person A has visited
> person B, that may be a false positive (with a known probability).
>
> Nevertheless, it still seems to address your use case, as you want to show
> only profiles that have not been visited.
>
> > On 06.09.2019 at 07:43, Doss <itsmedoss@gmail.com> wrote:
> >
> > Dear Experts,
> >
> > For a matchmaking portal, we have a requirement: once a customer has
> > viewed the complete details of a bride or groom, we have to exclude that
> > profile ID from further search results. Currently, along with other
> > details, we store the IDs of the customers who viewed a profile in a
> > multivalued field on that bride's or groom's document.
> >
> > E.g., if A viewed B, then in B's document we add A's ID under the field
> > saw_me.
> >
> > While searching, let's say the ID of the member currently searching is
> > 123456; we then fire a query like
> >
> > fq=-saw_me:(123456)
> >
> > Problem #1: The saw_me field value keeps growing very large.
> > Problem #2: Removing IDs that have been deleted from the base. Right now
> > we do this as follows:
> >           Query #1: fq=saw_me:(123456)&fl=DocId //Get all document IDs
> > that have the deleted ID in the saw_me field.
> >           Query #2: {"DocId":"234567","saw_me":{"remove":"123456"}} //Loop
> > through the results of the 1st query and fire the update query one by
> > one.
> >
> > We feel that this way of handling it is not optimal, so we need expert
> > advice. Please guide.
>
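As a side note on Problem #2, the one-by-one update loop could be reduced to fewer requests by grouping the atomic "remove" updates into JSON arrays. The sketch below only builds the request bodies and does not contact Solr. It assumes that DocId is the collection's unique key (as in Query #1) and that each body would be posted to the collection's update handler; the function name `build_remove_updates` and the `batch_size` value are hypothetical.

```python
import json


def build_remove_updates(doc_ids, deleted_id, batch_size=500):
    """Yield JSON bodies, each holding up to batch_size atomic updates."""
    for start in range(0, len(doc_ids), batch_size):
        batch = doc_ids[start:start + batch_size]
        # One atomic update per document: drop deleted_id from saw_me.
        docs = [
            {"DocId": doc_id, "saw_me": {"remove": deleted_id}}
            for doc_id in batch
        ]
        yield json.dumps(docs)
```

Here `doc_ids` would be the DocId values returned by Query #1 and `deleted_id` the member ID being purged, so one request body replaces up to `batch_size` single-document updates.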
