MIME-Version: 1.0
References: <CAFQSS+MCL9dN_XrZsxoLwVS8Q=F0UW0ANSS3Y-dOdSXzBxX7dQ@mail.gmail.com>
In-Reply-To: <CAFQSS+MCL9dN_XrZsxoLwVS8Q=F0UW0ANSS3Y-dOdSXzBxX7dQ@mail.gmail.com>
From: Mikhail Khludnev <mkhl@apache.org>
Date: Thu, 20 Jun 2019 23:05:49 +0300
Message-ID: <CAF8TkC45Unn+gu1Dzc-mqWxs+L7NX=uXTMJd5A7sENz9exAJjg@mail.gmail.com>
Subject: Re: Large Data set relationships handling
To: solr-user <solr-user@lucene.apache.org>
Content-Type: multipart/alternative; boundary="000000000000ecaf2f058bc6df37"

--000000000000ecaf2f058bc6df37
Content-Type: text/plain; charset="UTF-8"

On Thu, Jun 20, 2019 at 5:47 PM Lucky Sharma <goku0910@gmail.com> wrote:

> Hi all,
> I need help with one use case:
> Suppose you have 2 sets of data, A and B, which are
> linked to each other. For example, each entity of set X can have 1 to
> many relationships to set B, and as a result, I need the
> sorted/faceted values from Set B.
> For example, entity x(i) from Set A can have a relation with all the
> values in Set B, and another entity x(j) from Set A can have
> [y(i)... y(j)] values from Set B.
>
>
> * Both data sets are very large.
>
> One idea was to just have the data of Set B, and put an fq for all
> the values which Set X can have, and then do sorting and
> faceting on them.
> But since the data size is +1000, it will never be a good approach.
>
1. This is what the "lucene join" does underneath. It's enabled by score=none;
see
https://lucene.apache.org/solr/guide/7_2/other-parsers.html#OtherParsers-JoinQueryParser
2. This requires proper sharding: linked data must reside on the same shard,
otherwise there is no way to do it.
3. Note that when you say fq with all values, this can hopefully be achieved
with the {!terms} qp, which is way more powerful than bare {!lucene}'s bq.
4. The set notation above confuses me a little; it seems it might actually be
many-to-many.
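
A minimal sketch of points 1 and 3, only to show the shape of the request
parameters. The field names (b_ids_ss, category_s, price_f) and the ids
(x42, y1, ...) are hypothetical and not taken from the original question;
the sketch also assumes both sets live in one collection on the same shard.

```python
from urllib.parse import urlencode


def join_filter(from_field, to_field, inner_query):
    """Build a {!join} filter query with score=none, as mentioned in point 1."""
    return "{!join from=%s to=%s score=none}%s" % (from_field, to_field, inner_query)


def terms_filter(field, values):
    """Build a {!terms} filter query that matches any of the given values (point 3)."""
    return "{!terms f=%s}%s" % (field, ",".join(values))


# Hypothetical schema: each Set A doc lists its linked Set B ids in b_ids_ss.
# The join starts from the A doc with id x42 and returns the B docs it points to.
join_fq = join_filter("b_ids_ss", "id", "id:x42")

# Alternative: pass the known B ids explicitly instead of joining.
terms_fq = terms_filter("id", ["y1", "y2", "y3"])

# Request parameters for sorting and faceting over the resulting B docs.
params = [
    ("q", "*:*"),
    ("fq", join_fq),
    ("facet", "true"),
    ("facet.field", "category_s"),
    ("sort", "price_f asc"),
]
query_string = urlencode(params)

print(join_fq)
print(terms_fq)
print(query_string)
```

Swapping join_fq for terms_fq in params gives the "fq with all values"
variant from the original question.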


>
> Another idea is to create a parent-child data relationship as 2
> different collections and then perform a join over them,
>

Query-time join can't handle two sharded collections, although there are some
plugins and patches claiming to do so.
Index-time join, aka block join or {!parent}, requires docs to be
collocated.
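
A rough sketch of what the index-time (block join) variant could look like,
assuming the nested JSON form with the _childDocuments_ key. The type_s
marker field and the ids are hypothetical. Parent and children are sent as
one block, which is what keeps them collocated.

```python
import json


def make_block(parent_id, child_ids):
    """Build one parent doc with its children nested, so they are indexed as a block."""
    return {
        "id": parent_id,
        "type_s": "parent",  # hypothetical marker field identifying parent docs
        "_childDocuments_": [
            {"id": child_id, "type_s": "child"} for child_id in child_ids
        ],
    }


def parent_query(parent_filter, child_query):
    """Build a {!parent} query: return parents whose children match child_query."""
    return '{!parent which="%s"}%s' % (parent_filter, child_query)


# One entity from Set A with two linked values from Set B (ids are made up).
block = make_block("x1", ["y1", "y2"])
payload = json.dumps([block])

# Find the parents that have the child y1.
q = parent_query("type_s:parent", "id:y1")

print(payload)
print(q)
```

Note that a block holds each child under exactly one parent, so this only
fits if the relationship is really one-to-many rather than many-to-many.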


>
> Please review and suggest if there could be any other way possible of
> solving this problem.
>
>
>
> --
> Warm Regards,
>
> Lucky Sharma
> Contact No: +91 9821559918
>


-- 
Sincerely yours
Mikhail Khludnev

--000000000000ecaf2f058bc6df37--