From: Simon Willnauer
To: dev@lucene.apache.org
Date: Thu, 8 Nov 2012 11:32:15 +0100
Subject: Re: Compressed stored fields and multiGet(sorted luceneId[])?
References: <9007805A-10A6-4F41-8436-B73771375979@googlemail.com>

On Thu, Nov 8, 2012 at 11:30 AM, Robert Muir wrote:
> Why are you retrieving thousands of stored fields?

Really, you should roll your own codec for this and specialize. This is
a very unusual use case.

simon

>
> I don't think we should add such an API: stored fields shouldn't be
> used like that, only for things like summary results, so the
> possibility of two documents being in the same block is not so high
> anyway.
>
> And I think the API should stay the way it is (simple visitor) to
> encourage the fact that people shouldn't use stored fields for
> "processing".
>
> On Thu, Nov 8, 2012 at 2:56 AM, eksdev wrote:
>> Just a theoretical question: would it make sense to add some sort of
>> StoredDocument[] bulkGet(int[] docId) to fetch multiple stored
>> documents in one go?
>>
>> The reasoning behind it is that now, with compressed blocks, random
>> access gets more expensive, and in some cases a user needs to fetch
>> more documents in one go. If it happens that more documents come from
>> one block, it is a win. I would also assume that, even without
>> compression, bulk access on sorted docIds could be a win (sequential
>> access)?
>>
>> Does that make sense, is it doable? Or even worse, does it already
>> exist :)
>>
>> By the way, I am impressed how well compression does, even on really
>> short stored documents; at approx. 150b we observe a 35% reduction.
>> Fetching 1000 short documents on a fully cached index is observably
>> slower (2-3 times), but as soon as your memory gets low, compression
>> wins quickly. Did not test it thoroughly, but looks good so far.
>> Great job!
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
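
The block-sharing argument in this thread can be sketched with a toy model. This is a minimal illustration, not Lucene code: the class `BlockStoreSketch`, its methods, and the fixed number of documents per block are all assumptions made up for the example, and "decompression" is only simulated by a counter. It shows why a bulk fetch over ascending doc IDs could touch each block at most once, while per-document random access pays for a block on every call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a block-compressed document store (hypothetical, not Lucene). */
public class BlockStoreSketch {
    private final String[][] blocks;  // blocks[b][i] = i-th document of block b
    private final int docsPerBlock;   // assumption: fixed document count per block
    private int decompressions = 0;   // number of simulated block decompressions

    public BlockStoreSketch(String[] docs, int docsPerBlock) {
        this.docsPerBlock = docsPerBlock;
        int numBlocks = (docs.length + docsPerBlock - 1) / docsPerBlock;
        blocks = new String[numBlocks][];
        for (int b = 0; b < numBlocks; b++) {
            int start = b * docsPerBlock;
            int end = Math.min(start + docsPerBlock, docs.length);
            blocks[b] = Arrays.copyOfRange(docs, start, end);
        }
    }

    /** Stand-in for decompressing one block; only the counter is real work here. */
    private String[] decompress(int block) {
        decompressions++;
        return blocks[block];
    }

    /** Random access: every call pays for one block decompression. */
    public String get(int docId) {
        return decompress(docId / docsPerBlock)[docId % docsPerBlock];
    }

    /** Bulk access over ascending doc IDs: each block is decompressed at most once. */
    public List<String> bulkGet(int[] sortedDocIds) {
        List<String> out = new ArrayList<>(sortedDocIds.length);
        int currentBlock = -1;
        String[] current = null;
        for (int docId : sortedDocIds) {
            int block = docId / docsPerBlock;
            if (block != currentBlock) {  // entering a new block
                current = decompress(block);
                currentBlock = block;
            }
            out.add(current[docId % docsPerBlock]);
        }
        return out;
    }

    public int decompressions() {
        return decompressions;
    }
}
```

The saving depends entirely on how many requested IDs share a block. If the requested documents are spread thinly over the index, as in the summary-results case Robert describes, the bulk path does about the same work as repeated single lookups.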