From: RAUNAK AGRAWAL
Date: Fri, 28 Sep 2018 11:52:26 -0700
Subject: Re: Solr Streaming Queries Performance Issues [v7.2.1]
To: toes@kb.dk
Cc: solr-user@lucene.apache.org

Thanks a lot, Toke.

I will get back to you soon about the patch update, after discussing it
with the team.

Thanks & Regards

On Fri, Sep 28, 2018 at 11:30 AM Toke Eskildsen wrote:

> RAUNAK AGRAWAL wrote:
>
> > curl http://localhost:8983/solr/collection_name/stream -d
> > 'expr=facet(collection_name,q="id:953",bucketSorts="week
> > desc",buckets="week",bucketSizeLimit=200,sum(sales),
> > sum(amount),sum(days))'
>
> Stats on numeric fields then.
>
> > Also in my collection, I have almost 10 Billion documents
> > with many deletions (close to 40%).
>
> Quite a lot of documents, and in this case deletions count, as the
> internal structures for the deleted documents still need to be iterated.
> In scale this looks somewhat like our 18 billion document setup, with the
> addendum that we use quite large segments (900GB).
>
> The performance regressions we encountered with Solr 7 led to
> https://issues.apache.org/jira/browse/LUCENE-8374 which helped a lot
> (performance testing has not finished). If you have or can easily create a
> test server where your shard(s) is the same size as your production shards,
> I'd be happy to port the patch to Solr 7.2.1 to see if it helps. I am
> looking for independent verification, so it is no bother.
>
> > I was planning to run optimise to merge the segments but
> > spoke to admin team and lucidworks guys and they were
> > against it saying that it will make very large segment file.
>
> If your bottleneck is the same as ours, the large segment would mean worse
> performance (with Solr 7).
>
> > Is it true that optimise in solr should not be used, as it comes with
> > other issues?
>
> No simple answer there. If you have an index that you update very rarely,
> it can save memory and processing power. If you have a live index where you
> add and delete documents, it will probably be a bad idea. One strategy used
> with time series data is to have old and immutable data in dedicated
> collections, which can then be optimized.
>
> - Toke Eskildsen
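
For anyone reproducing the request quoted above from a script rather than curl, here is a minimal Python sketch of the same call. It only builds the facet() expression and prepares the POST to the /stream endpoint; nothing is sent. The helper names facet_expr and stream_request are hypothetical and not part of Solr or any client library, and the collection name, query and fields are simply the ones from the quoted curl command.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def facet_expr(collection, query, bucket, sums, limit=200):
    """Build the facet() streaming expression quoted in the thread.

    Hypothetical helper: it only formats a string, it does not validate
    the expression against Solr.
    """
    metrics = ",".join(f"sum({field})" for field in sums)
    return (
        f'facet({collection},q="{query}",bucketSorts="{bucket} desc",'
        f'buckets="{bucket}",bucketSizeLimit={limit},{metrics})'
    )


def stream_request(base_url, collection, expr):
    """Prepare (but do not send) the POST that the curl command performs."""
    body = urlencode({"expr": expr}).encode("utf-8")
    return Request(f"{base_url}/{collection}/stream", data=body)


expr = facet_expr("collection_name", "id:953", "week", ["sales", "amount", "days"])
req = stream_request("http://localhost:8983/solr", "collection_name", expr)

# Against a live Solr instance the request could then be sent with:
#     print(urlopen(req).read().decode("utf-8"))
```

Timing this call before and after a change (a patched build, or a collection split into old immutable data and live data) is one way to get the kind of independent comparison asked for in the thread.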