From: Erick Erickson
Date: Fri, 16 Mar 2018 09:24:15 -0700
Subject: Re: Solr document routing using composite key
To: solr-user

What Shawn said. 117 shards and 116 docs tells you absolutely nothing
useful. I've never seen the number of docs on various shards be off by
more than 2-3% when enough docs are indexed to be statistically valid.

Best,
Erick

On Fri, Mar 16, 2018 at 5:34 AM, Shawn Heisey wrote:
> On 3/6/2018 11:53 AM, Nawab Zada Asad Iqbal wrote:
>>
>> I have 117 shards and i tried to use document ids from zero to 116. I find
>> that the distribution is very uneven, e.g., the largest bucket receives
>> total 5 documents; and around 38 shards will be empty. Is it expected?
>
>
> With such a small data set, this fits what I would expect.
>
> Choosing buckets by hashing (which is what compositeId does) is not perfect,
> but if you send it thousands or millions of documents, it will be
> *generally* balanced.
>
> Thanks,
> Shawn
>
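
For anyone who wants to see the effect described in this thread, here is a
minimal sketch that hashes document ids into 117 equal slices of a 32-bit
range and reports how uneven the result is. It is only an illustration: the
MD5-based hash is a stand-in chosen for convenience and is not the hash Solr
uses, and the names bucket_for, distribution, summarize and expected_empty
are hypothetical helpers, not part of any Solr API. Under a uniform-hash
assumption, about 117 * (1 - 1/117)^117, roughly 43, of the 117 shards are
expected to be empty when only 117 docs are indexed, which is in the same
range as the 38 empty shards reported above.

```python
import hashlib
from collections import Counter


def bucket_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard index via a 32-bit hash.

    Assumption: MD5 is a stand-in hash for illustration only; it is NOT
    the hash function Solr's compositeId router uses.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    h = int.from_bytes(digest[:4], "big")  # 32-bit value in [0, 2**32)
    # Split the 32-bit range into num_shards equal slices.
    return h * num_shards // 2**32


def distribution(num_docs: int, num_shards: int) -> list:
    """Return the number of docs landing on each shard for ids 0..num_docs-1."""
    counts = Counter(bucket_for(str(i), num_shards) for i in range(num_docs))
    return [counts.get(s, 0) for s in range(num_shards)]


def expected_empty(num_docs: int, num_shards: int) -> float:
    """Expected number of empty shards if the hash were perfectly uniform."""
    return num_shards * (1 - 1 / num_shards) ** num_docs


def summarize(num_docs: int, num_shards: int = 117) -> dict:
    """Collect a few imbalance statistics for one simulated indexing run."""
    per_shard = distribution(num_docs, num_shards)
    mean = num_docs / num_shards
    return {
        "docs": num_docs,
        "empty_shards": per_shard.count(0),
        "max_docs": max(per_shard),
        "min_docs": min(per_shard),
        "max_over_mean": max(per_shard) / mean,
    }


if __name__ == "__main__":
    # Compare a tiny data set with progressively larger ones.
    for n in (117, 10_000, 1_000_000):
        print(summarize(n))
```

With only 117 ids, the run should show many empty shards and a few shards
holding several docs each. As the number of docs grows, the ratio of the
largest shard to the mean should move toward 1, which is the "generally
balanced" behaviour Shawn describes.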