From: Benson Margulies
To: java-user@lucene.apache.org
Date: Wed, 9 Oct 2013 19:13:55 -0400
Subject: Re: Exploiting a whole lot of memory

On Tue, Oct 8, 2013 at 5:50 PM, Michael McCandless <lucene@mikemccandless.com> wrote:

> DirectPostingsFormat?
>
> It stores all terms + postings as simple java arrays, uncompressed.

This definitely sped things up in my benchmark, but I'm greedy for more. I
just made a codec that returns it as the postings format; is that the whole
recipe? Does it make sense to extend it any further to any of the other
codec pieces?

> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Tue, Oct 8, 2013 at 5:45 PM, Benson Margulies wrote:
> > Consider a Lucene index consisting of 10m documents with a total disk
> > footprint of 3G. Consider an application that treats this index as
> > read-only, and runs very complex queries over it. Queries with many terms,
> > some of them 'fuzzy' and 'should' terms and a dismax.
> > And, finally, consider doing all this on a box with over 100G of physical
> > memory, some cores, and nothing else to do with its time.
> >
> > I should probably just stop here and see what thoughts come back, but I'll
> > go out on a limb and type the word 'codec'. The MMapDirectory, of course,
> > cheerfully gets to keep every single bit in memory. And then each query
> > runs, exercising the codec, building up a flurry of Java objects, all
> > of which turn into garbage and we start all over. So, I find myself
> > wondering, is there some sort of an opportunity for a codec-that-caches in
> > here? In other words, I'd like to sell some of my space to buy some time.
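
The space-for-time trade discussed in this thread can be illustrated with a small, self-contained sketch. This is not Lucene code and not how DirectPostingsFormat is implemented; the class name `DirectPostingsSketch`, the delta-plus-variable-length byte encoding, and the term-to-bytes map are all assumptions made up for illustration. It only shows the general idea of decoding compressed postings once, up front, into plain `int[]` arrays, so that each query reads an existing array instead of decoding bytes and creating fresh objects.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model: compressed postings are decoded once into int[] arrays.
final class DirectPostingsSketch {
    private static final int[] EMPTY = new int[0];

    // term -> uncompressed, sorted doc IDs (the "spend memory" side of the trade)
    private final Map<String, int[]> direct = new HashMap<>();

    // Decode every postings list up front, as a read-only index would allow.
    DirectPostingsSketch(Map<String, byte[]> compressed) {
        for (Map.Entry<String, byte[]> e : compressed.entrySet()) {
            direct.put(e.getKey(), decode(e.getValue()));
        }
    }

    // Query-time lookup: no decoding, no per-query allocation.
    int[] postings(String term) {
        return direct.getOrDefault(term, EMPTY);
    }

    // Assumed on-disk form: deltas between sorted, non-negative doc IDs,
    // each written as 7-bit groups with a continuation bit.
    static byte[] encode(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int docId : sortedDocIds) {
            int delta = docId - previous;
            previous = docId;
            while ((delta & ~0x7F) != 0) {
                out.write((delta & 0x7F) | 0x80);
                delta >>>= 7;
            }
            out.write(delta);
        }
        return out.toByteArray();
    }

    // Inverse of encode: rebuild absolute doc IDs from the packed deltas.
    static int[] decode(byte[] packed) {
        int count = 0;
        for (byte b : packed) {
            if ((b & 0x80) == 0) {
                count++; // a byte without the continuation bit ends one value
            }
        }
        int[] docIds = new int[count];
        int pos = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int delta = 0;
            int shift = 0;
            byte b;
            do {
                b = packed[pos++];
                delta |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta;
            docIds[i] = previous;
        }
        return docIds;
    }

    public static void main(String[] args) {
        Map<String, byte[]> compressed = new HashMap<>();
        compressed.put("lucene", encode(new int[] {3, 7, 130, 20000}));
        DirectPostingsSketch index = new DirectPostingsSketch(compressed);
        System.out.println(java.util.Arrays.toString(index.postings("lucene")));
    }
}
```

In this toy model the cost is paid once at construction time and in heap space; repeated calls to `postings` for the same term return the same array. Whether a comparable gain is available from the other codec pieces is the open question in the message above, and this sketch does not answer it.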