From: Luís Filipe Nassif
Date: Wed, 14 Jun 2017 20:39:04 -0300
Subject: RE: Optimizing number of segments in lucene index (no writes/deletes, only reads)
To: java-user@lucene.apache.org
Archived-At: Wed, 14 Jun 2017 23:39:16 -0000

In the past I have tried IndexSearcher with an ExecutorService to
parallelize searches on multiple segments on an SSD. That was with
Lucene 4.9. Unfortunately, the searches became slower with various
numbers of threads in the pool, and much slower with 1 thread.

Was there some overhead with that approach? Has it been improved in
later versions?

Thanks,
Luis

On Jun 14, 2017 2:21 PM, "Uwe Schindler" wrote:

Hi,

This article is still very correct! Use the defaults of
TieredMergePolicy, nothing more to say.

The problems only start once you optimize/forceMerge for the first time
and still update the index afterwards. Because then your index is no
longer structured in an optimal way, and the huge segment will "collect"
deletes and never get merged away. So once you have manually called
forceMerge, the index will behave badly and you are forced to force
merge over and over.

So: Never ever call forceMerge on an index that is still updated,
otherwise you break its structure.

If you have an unmodifiable/read-only index that never ever changes and
will be completely rebuilt from scratch on updates, forceMerge brings
some speed improvement, but don't expect too much. BUT: You also lose
the ability to parallelize searches with an Executor on IndexSearcher!

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Riccardo Tasso [mailto:riccardo.tasso@gmail.com]
> Sent: Wednesday, June 14, 2017 8:34 AM
> To: Lucene Users
> Subject: Re: Optimizing number of segments in lucene index (no
> writes/deletes, only reads)
>
> Hi,
> I have recently read this post, I think it will give you some hints:
>
> http://blog.trifork.com/2011/11/21/simon-says-optimize-is-bad-for-you/
>
> Probably the only advantage of having one huge segment is that it uses
> less disk space.
>
> Riccardo
>
> 2017-06-14 5:23 GMT+02:00 Tom Hirschfeld :
>
> > Hello Fellow Lucene-eers,
> >
> > I have a Lucene 6.5.1 app primarily indexed/searched via the
> > LatLonDocValuesField. The index is built once, and has no
> > writes/deletes in production. At indexing time, we need to select
> > the number of segments we want to generate, and it is unclear to us
> > how many segments we should generate if we are optimizing for query
> > speed. My intuition says that we should only generate 1 segment as
> > we will have no writes/deletes, but I cannot find any hard evidence
> > online to support or refute that hypothesis. Does anyone here know
> > how many segments we should use? 1 segment? 1 segment per cpu in
> > prod? 1 segment per core in prod? Something else?
> >
> > Best,
> > Tom Hirschfeld
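
To illustrate the point above about losing the ability to parallelize
searches with an Executor after merging down to one segment, here is a
toy sketch in plain Java. It deliberately uses no Lucene classes: the
"segments" are just int arrays, and the names SegmentParallelism,
countInSegment and parallelCount are hypothetical, made up for this
illustration. It assumes the simplest scheme, one task per segment,
and only shows the shape of the idea, not how IndexSearcher is
actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Toy model only: each "segment" is an int array, not a Lucene segment. */
public class SegmentParallelism {

    /** Counts matching values in one segment; stands in for searching one segment. */
    static long countInSegment(int[] segment, int threshold) {
        long hits = 0;
        for (int value : segment) {
            if (value >= threshold) {
                hits++;
            }
        }
        return hits;
    }

    /** Submits one task per segment, then sums the partial counts. */
    static long parallelCount(List<int[]> segments, int threshold, ExecutorService pool)
            throws InterruptedException, ExecutionException {
        List<Future<Long>> partials = new ArrayList<>();
        for (int[] segment : segments) {
            partials.add(pool.submit(() -> countInSegment(segment, threshold)));
        }
        long total = 0;
        for (Future<Long> partial : partials) {
            total += partial.get(); // blocks until that segment's task is done
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Three segments: three tasks that may run concurrently.
            List<int[]> manySegments = Arrays.asList(
                    new int[] {1, 5, 9}, new int[] {2, 6}, new int[] {7, 8, 3});
            // Same documents merged into one segment: a single task, so the
            // other pool threads have nothing to do.
            List<int[]> oneSegment =
                    Collections.singletonList(new int[] {1, 5, 9, 2, 6, 7, 8, 3});

            System.out.println(parallelCount(manySegments, 5, pool)); // prints 5
            System.out.println(parallelCount(oneSegment, 5, pool));   // prints 5
        } finally {
            pool.shutdown();
        }
    }
}
```

Both calls return the same count; the difference is only in how many
tasks exist to spread across threads. The sketch also hints at where
overhead can come from in such a scheme: every segment, however small,
costs a task submission and a blocking wait on its result.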