lucene-solr-user mailing list archives

From Mikhail Khludnev <>
Subject Re: Large Data set relationships handling
Date Thu, 20 Jun 2019 20:05:49 GMT
On Thu, Jun 20, 2019 at 5:47 PM Lucky Sharma <> wrote:

> Hi all,
> I need help with one use case:
> Suppose you have 2 sets of data, A and B, which are
> linked to each other. For example, each entity of set X can have 1 to
> many relationships to the set B, and as a result, I need the
> sorted/faceted values from Set B.
> For example, entity x(i) from Set A can have a relation with all the
> values in Set B, and another entity x(j) from Set A can have
> [y(i)... y(j)] values from Set B.
> * Both data sets are very large.
> One idea was to just have the data of Set B, and put an fq for all
> the values which Set X can have, and then do sort and
> faceting on them.
> But since the data size is +1000, it will never be a good approach.
1. This is what "lucene join" does underneath. It's enabled by score=none.
2. This requires proper sharding: linked data should reside on the same shard,
otherwise there is no way to make it work.
3. Note, when you say fq with all values, hopefully it might be achieved
with the {!terms} qp, which is way more powerful than bare {!lucene}'s bq.
4. The set notation above confuses me a little; it might seem to be many-to-many.
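
To make points 1 and 3 concrete, here is a minimal sketch of how the two filter strings might be built on the client side. It only assembles request parameters and does not talk to Solr. The field names (id, b_ref, category, price) and the sample ids are hypothetical, chosen for illustration; the sketch also assumes the values contain no commas, since {!terms} splits on commas by default.

```python
from urllib.parse import urlencode, parse_qs


def terms_filter(field, values):
    """Build a {!terms} filter matching any of the given values in `field`."""
    # Assumption: values contain no commas (the default {!terms} separator).
    return "{!terms f=%s}%s" % (field, ",".join(str(v) for v in values))


def join_filter(from_field, to_field, inner_query):
    """Build a {!join} filter; score=none is the option mentioned in point 1."""
    return "{!join from=%s to=%s score=none}%s" % (from_field, to_field, inner_query)


def build_params(fq, sort, facet_field):
    """Encode a select request that filters, sorts and facets in one call."""
    return urlencode([
        ("q", "*:*"),
        ("fq", fq),
        ("sort", sort),
        ("facet", "true"),
        ("facet.field", facet_field),
    ])


# Option A: pass the linked B ids explicitly (hypothetical ids).
terms_fq = terms_filter("id", ["y1", "y2", "y3"])

# Option B: let Solr resolve the links from the A document (hypothetical
# multi-valued field b_ref on A documents holding the ids of linked B documents).
join_fq = join_filter("b_ref", "id", "id:x1")

query_string = build_params(join_fq, "price asc", "category")
```

Either string would go into fq. With the join variant the list of linked values never leaves the server, which is why point 2 matters: the A and B documents involved have to sit on the same shard.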

> Another idea is to create a parent-child data relationship as 2
> different collections and then perform a join over them,

Query-time join can't handle two sharded collections, although there are some
plugins and patches claiming to.
Index-time join, aka block join or {!parent}, requires docs to be

> Please review and suggest if there could be any other way possible of
> solving this problem.
> --
> Warm Regards,
> Lucky Sharma
> Contact No: +91 9821559918

Sincerely yours
Mikhail Khludnev
