Message-ID: <2135584659.600861270031427308.JavaMail.jira@brutus.apache.org>
Date: Wed, 31 Mar 2010 10:30:27 +0000 (UTC)
From: "Toke Eskildsen (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-1990) Add unsigned packed int impls in oal.util
In-Reply-To: <486796805.1255870411278.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/LUCENE-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851806#action_12851806 ]

Toke
Eskildsen commented on LUCENE-1990:
----------------------------------------

I am very happy to hear that, Robert. The benchmarks I made had the glaring flaw that they were ... well, benchmarks. With the CPU cache being hammered in a real-world scenario, your findings indicate that the slow round-trip to main memory dwarfs the extra logic for extracting the values from the packed structure. For a few scenarios, the packed structure might even be faster than plain arrays.

Getting back to reality, my own findings indicate that using PackedInts for ord-based sorted search is not at all faster than plain arrays. The access pattern here is very sequential, so the chance that the needed value is already fetched from main memory is high for both plain and packed structures.

> Add unsigned packed int impls in oal.util
> -----------------------------------------
>
>                 Key: LUCENE-1990
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1990
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: Flex Branch
>            Reporter: Michael McCandless
>            Priority: Minor
>             Fix For: Flex Branch
>
>         Attachments: generated_performance-te20100226.txt, LUCENE-1990-te20100122.patch, LUCENE-1990-te20100210.patch, LUCENE-1990-te20100212.patch, LUCENE-1990-te20100223.patch, LUCENE-1990-te20100226.patch, LUCENE-1990-te20100226b.patch, LUCENE-1990-te20100226c.patch, LUCENE-1990-te20100301.patch, LUCENE-1990.patch, LUCENE-1990.patch, LUCENE-1990_PerformanceMeasurements20100104.zip, perf-mkm-20100227.txt, performance-20100301.txt, performance-te20100226.txt
>
>
> There are various places in Lucene that could take advantage of an
> efficient packed unsigned int/long impl. EG the terms dict index in
> the standard codec in LUCENE-1458 could substantially reduce its RAM
> usage. FieldCache.StringIndex could as well. And I think "load into
> RAM" codecs like the one in TestExternalCodecs could use this too.
> I'm picturing something very basic like:
> {code}
> interface PackedUnsignedLongs {
>   long get(long index);
>   void set(long index, long value);
> }
> {code}
> Plus maybe an iterator for getting and maybe also for setting. If it
> helps, most of the usages of this inside Lucene will be "write once"
> so eg the set could make that an assumption/requirement.
> And a factory somewhere:
> {code}
> PackedUnsignedLongs create(int count, long maxValue);
> {code}
> I think we should simply autogen the code (we can start from the
> autogen code in LUCENE-1410), or, if there is a good existing impl
> that has a compatible license that'd be great.
> I don't have time near-term to do this... so if anyone has the itch,
> please jump!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
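
For readers who want something concrete, the interface and factory quoted from the issue description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from any of the attached patches: the class name SimplePackedUnsignedLongs is hypothetical, the low-bits-first layout inside a long[] is just one possible choice, and maxValue is treated as an unsigned 64-bit value.

```java
// Interface as proposed in the issue description.
interface PackedUnsignedLongs {
  long get(long index);
  void set(long index, long value);
}

// Hypothetical implementation (name and layout are assumptions, not the
// patch): values are stored back to back, lowest bits first, in a long[].
final class SimplePackedUnsignedLongs implements PackedUnsignedLongs {
  private final long[] blocks;
  private final int bitsPerValue;
  private final long mask;   // the low bitsPerValue bits set
  private final int count;

  private SimplePackedUnsignedLongs(int count, int bitsPerValue) {
    this.count = count;
    this.bitsPerValue = bitsPerValue;
    this.mask = bitsPerValue == 64 ? -1L : (1L << bitsPerValue) - 1;
    long totalBits = (long) count * bitsPerValue;
    this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
  }

  // Factory with the signature from the issue; maxValue is read as unsigned.
  static PackedUnsignedLongs create(int count, long maxValue) {
    int bits = Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    return new SimplePackedUnsignedLongs(count, bits);
  }

  @Override
  public long get(long index) {
    checkIndex(index);
    long bitPos = index * bitsPerValue;
    int block = (int) (bitPos >>> 6);
    int offset = (int) (bitPos & 63);
    long value = blocks[block] >>> offset;
    int bitsInFirst = 64 - offset;
    if (bitsInFirst < bitsPerValue) {
      // The value straddles two blocks: fetch the high part from the next one.
      value |= blocks[block + 1] << bitsInFirst;
    }
    return value & mask;
  }

  @Override
  public void set(long index, long value) {
    checkIndex(index);
    if ((value & ~mask) != 0) {
      throw new IllegalArgumentException("value does not fit in " + bitsPerValue + " bits");
    }
    long bitPos = index * bitsPerValue;
    int block = (int) (bitPos >>> 6);
    int offset = (int) (bitPos & 63);
    // Clear the target bits, then write the low part of the value.
    blocks[block] = (blocks[block] & ~(mask << offset)) | (value << offset);
    int bitsInFirst = 64 - offset;
    if (bitsInFirst < bitsPerValue) {
      // Write the spilled high part into the next block.
      blocks[block + 1] =
          (blocks[block + 1] & ~(mask >>> bitsInFirst)) | (value >>> bitsInFirst);
    }
  }

  private void checkIndex(long index) {
    if (index < 0 || index >= count) {
      throw new IndexOutOfBoundsException("index " + index + ", count " + count);
    }
  }
}
```

With this layout, create(100, 1000) uses 10 bits per value and therefore 16 longs of backing storage, where a plain long[] would use 100. The extra shifts and masks in get are the "extra logic" the comment above weighs against the cost of a round-trip to main memory.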