From: Jörn Franke
Date: Tue, 2 Jul 2019 08:13:51 +0200
Subject: Re: Configuration recommendation for SolrCloud
To: solr-user@lucene.apache.org

As someone else wrote, there are a lot of uncertainties, and I recommend that you test yourself to find the optimal configuration. Some food for thought:

How many clients do you have and what is their concurrency? What operations will they do? Do they access Solr directly? You can use JMeter to simulate the querying part (and also the indexing). Depending on the concurrency of users, you may need to think about the number of CPUs.

What does moderate indexing mean? How much does the collection grow per day? Have you thought about putting the ZooKeeper ensemble on dedicated nodes?

Why do you want to use an older Solr version? Why not the newest + JDK 11?

In what format are the documents? Will you convert them beforehand? What analysis will you do on the documents (this may have an impact on index size etc.)?

Also important: how do you plan to reindex the full collection in case a schema field changes (hint: have the user queries go through aliases so this can be done without interruption)?

Normally I would expect a web app in between, also for security reasons. You may need to scale this one as well.

You don't have to answer those questions here, but I recommend answering them yourself during a proof of concept on your premises.

I don't see a point in creating more than one cluster (except for disaster recovery and cross data center replication, if this is needed).
Maybe I am overlooking the reason why you thought of multiple clusters.

> On 25.06.2019 at 22:53, Rahul Goswami wrote:
>
> Hello,
> We are running Solr 7.2.1 and planning for a deployment which will grow to
> 4 billion documents over time. We have 16 nodes at disposal. I am thinking
> between 3 configurations:
>
> 1 cluster - 16 nodes
> vs
> 2 clusters - 8 nodes each
> vs
> 4 clusters - 4 nodes each
>
> Irrespective of the configuration, each node would host 8 shards (eg: a
> cluster with 16 nodes would have 16*8=128 shards; similarly, 32 shards in a
> 4 node cluster). These 16 nodes will be hosted across 4 beefy servers each
> with 128 GB RAM. So we can allocate 32 GB RAM (not heap space) to each
> node. What configuration would be most efficient for our use case
> considering moderate-heavy indexing and search load? Would also like to
> know the tradeoffs involved if any. Thanks in advance!
>
> Regards,
> Rahul
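
The arithmetic in the quoted question can be checked with a few lines of Python. This is only a rough sketch: the node, shard, document and RAM figures come from the question, while the function name layout, the constant names, and the assumption that documents are spread evenly over all shards (with replicas ignored) are mine for illustration.

```python
# Figures from the quoted question. Even distribution of documents across
# shards and ignoring replicas are assumptions made for this sketch.
TOTAL_NODES = 16
SHARDS_PER_NODE = 8
TOTAL_DOCS = 4_000_000_000
SERVERS = 4
RAM_PER_SERVER_GB = 128


def layout(num_clusters):
    """Return shard and document counts when the nodes are split into equal clusters."""
    nodes_per_cluster = TOTAL_NODES // num_clusters
    shards_per_cluster = nodes_per_cluster * SHARDS_PER_NODE
    total_shards = shards_per_cluster * num_clusters
    return {
        "nodes_per_cluster": nodes_per_cluster,
        "shards_per_cluster": shards_per_cluster,
        "total_shards": total_shards,
        "docs_per_shard": TOTAL_DOCS // total_shards,
    }


# RAM available to each node when 16 nodes share 4 servers of 128 GB each.
ram_per_node_gb = RAM_PER_SERVER_GB * SERVERS // TOTAL_NODES

for clusters in (1, 2, 4):
    print(clusters, "cluster(s):", layout(clusters))
print("RAM per node (GB):", ram_per_node_gb)
```

Under these assumptions the total number of shards, and therefore the number of documents per shard, is the same in all three layouts. Splitting into several clusters changes how many shards a single query fans out to, not how much data each shard holds.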