Date: Fri, 9 Jan 2015 16:15:07 -0500
Subject: Details on setting block parameters for Lucene41PostingsFormat
From: Tom Burton-West
To: java-user@lucene.apache.org

Hello all,

We have over 3 billion unique terms in our indexes, and with Solr 3.x we set the TermIndexInterval to about 8 times its default value in order to index without OOMs (http://www.hathitrust.org/blogs/large-scale-search/too-many-words-again).

We are now working with Solr 4 and running into memory issues, and we are wondering whether we need to do something analogous for Solr 4.

The javadoc for IndexWriterConfig (http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/index/IndexWriterConfig.html#setTermIndexInterval%28int%29) indicates that the Lucene 4.1 postings format has some parameters which may be set:

"..To configure its parameters (the minimum and maximum size for a block), you would instead use Lucene41PostingsFormat.Lucene41PostingsFormat(int, int)"

Is there documentation or discussion somewhere about how to determine appropriate parameters, or some detail about what setting the maxBlockSize and minBlockSize does?

Tom Burton-West
http://www.hathitrust.org/blogs/large-scale-search
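
For context on the Solr 3.x workaround mentioned above, here is a rough back-of-the-envelope sketch of why raising the term index interval reduces memory. It assumes the Lucene 3.x default interval of 128 and that roughly every interval-th term is held in the in-memory terms index; the class and method names are hypothetical and not part of Lucene, and the figure of exactly 3 billion terms is only a stand-in for "over 3 billion".

```java
// Hypothetical helper, not a Lucene API: estimates how many terms end up
// in the in-memory terms index for a given term index interval.
final class TermIndexEstimate {

    // Assumption: the Lucene 3.x default term index interval.
    static final int DEFAULT_TERM_INDEX_INTERVAL = 128;

    // Roughly every interval-th term is indexed in memory (rounded up).
    static long indexedTerms(long uniqueTerms, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        return (uniqueTerms + interval - 1) / interval;
    }

    public static void main(String[] args) {
        long uniqueTerms = 3_000_000_000L;            // stand-in for "over 3 billion"
        int raised = 8 * DEFAULT_TERM_INDEX_INTERVAL; // "about 8 times its default"

        System.out.println("default interval: "
                + indexedTerms(uniqueTerms, DEFAULT_TERM_INDEX_INTERVAL));
        System.out.println("8x interval:      "
                + indexedTerms(uniqueTerms, raised));
    }
}
```

Under these assumptions the in-memory entry count drops by about a factor of 8, from roughly 23 million to roughly 3 million. This says nothing about how the Lucene 4.1 block sizes behave, which is the open question in this message.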