Date: Wed, 16 May 2012 10:55:02 +0000 (UTC)
From: "Michael McCandless (JIRA)"
To: dev@lucene.apache.org
Message-ID: <606300315.3629.1337165702274.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <272653093.3527.1337162342441.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (LUCENE-4062) More fine-grained control over the packed integer implementation that is chosen

[
https://issues.apache.org/jira/browse/LUCENE-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276656#comment-13276656 ]

Michael McCandless commented on LUCENE-4062:
--------------------------------------------

This is a great idea!

I don't think it's necessary to create a Packed32SingleBlock right now ... this (Packed64SingleBlock) is already a great improvement. We can do it later...

Somehow we need to fix the fasterButMoreRAM places (FieldCache, DocValues) to make use of this; maybe we change them from a boolean to the float acceptableOverhead instead?

> More fine-grained control over the packed integer implementation that is chosen
> -------------------------------------------------------------------------------
>
> Key: LUCENE-4062
> URL: https://issues.apache.org/jira/browse/LUCENE-4062
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/other
> Reporter: Adrien Grand
> Priority: Minor
> Labels: performance
> Fix For: 4.1
>
> Attachments: LUCENE-4062.patch
>
>
> In order to save space, Lucene has two main PackedInts.Mutable implementations: one that is very fast and is based on a byte/short/integer/long array (Direct*), and another that packs bits in a memory-efficient manner (Packed*).
> The packed implementation tends to be much slower than the direct one, which discourages some Lucene components from using it. On the other hand, if you store 21-bit integers in a Direct32, the space loss is (32-21)/32, or about 34%.
> If you accept trading some space for speed, you could store 3 of these 21-bit integers in a long, resulting in an overhead of 1/3 bit per value. One advantage of this approach is that you never need to read more than one block to read or write a value, so this can be significantly faster than Packed32 and Packed64, which always need to read/write two blocks in order to avoid costly branches.
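
To make the single-block layout described in the issue concrete, here is a minimal sketch of storing a fixed number of values in each 64-bit long, so that every read or write touches exactly one block. The class name SingleBlockSketch and its methods are hypothetical; this is not the code from the attached patch, and it is limited to 1..32 bits per value to keep the example short.

```java
/**
 * Hypothetical sketch of a "single block" packed array: each 64-bit long
 * holds floor(64 / bitsPerValue) values and the leftover bits stay unused.
 * Not Lucene's actual Packed64SingleBlock implementation.
 */
public final class SingleBlockSketch {
  private final long[] blocks;
  private final int bitsPerValue;
  private final int valuesPerBlock;
  private final long mask;

  public SingleBlockSketch(int valueCount, int bitsPerValue) {
    if (bitsPerValue < 1 || bitsPerValue > 32) {
      throw new IllegalArgumentException("bitsPerValue must be in 1..32");
    }
    this.bitsPerValue = bitsPerValue;
    // 21 bits per value -> 3 values per long, 1 bit left unused.
    this.valuesPerBlock = 64 / bitsPerValue;
    this.mask = (1L << bitsPerValue) - 1;
    // Round up so that the last, partially filled block is allocated too.
    this.blocks = new long[(valueCount + valuesPerBlock - 1) / valuesPerBlock];
  }

  /** Reads one value; only a single long is accessed. */
  public long get(int index) {
    int block = index / valuesPerBlock;
    int shift = (index % valuesPerBlock) * bitsPerValue;
    return (blocks[block] >>> shift) & mask;
  }

  /** Writes one value; only a single long is read and rewritten. */
  public void set(int index, long value) {
    int block = index / valuesPerBlock;
    int shift = (index % valuesPerBlock) * bitsPerValue;
    // Clear the old bits of this slot, then store the new value there.
    blocks[block] = (blocks[block] & ~(mask << shift)) | ((value & mask) << shift);
  }

  /** Unused bits in a block, spread over the values stored in that block. */
  public double overheadBitsPerValue() {
    return (64 - valuesPerBlock * bitsPerValue) / (double) valuesPerBlock;
  }
}
```

With 21 bits per value, overheadBitsPerValue() is 1/3, matching the figure in the issue; with 12 bits per value it is 4/5 of a bit (5 values in 60 bits), which is the roughly 6% overhead quoted below.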
> I ran some tests, and for 10000000 21-bit values, this implementation takes less than 2% more space and has 44% faster writes and 30% faster reads. The 12-bit version (5 values per block) has the same performance improvement and a 6% memory overhead compared to the packed implementation.
> In order to select the best implementation for a given integer size, I wrote the {{PackedInts.getMutable(valueCount, bitsPerValue, acceptableOverheadPerValue)}} method. This method selects the fastest implementation that has less than {{acceptableOverheadPerValue}} wasted bits per value. For example, if you accept an overhead of 20% ({{acceptableOverheadPerValue = 0.2f * bitsPerValue}}), which is pretty reasonable, here is what implementations would be selected:
> * 1: Packed64SingleBlock1
> * 2: Packed64SingleBlock2
> * 3: Packed64SingleBlock3
> * 4: Packed64SingleBlock4
> * 5: Packed64SingleBlock5
> * 6: Packed64SingleBlock6
> * 7: Direct8
> * 8: Direct8
> * 9: Packed64SingleBlock9
> * 10: Packed64SingleBlock10
> * 11: Packed64SingleBlock12
> * 12: Packed64SingleBlock12
> * 13: Packed64
> * 14: Direct16
> * 15: Direct16
> * 16: Direct16
> * 17: Packed64
> * 18: Packed64SingleBlock21
> * 19: Packed64SingleBlock21
> * 20: Packed64SingleBlock21
> * 21: Packed64SingleBlock21
> * 22: Packed64
> * 23: Packed64
> * 24: Packed64
> * 25: Packed64
> * 26: Packed64
> * 27: Direct32
> * 28: Direct32
> * 29: Direct32
> * 30: Direct32
> * 31: Direct32
> * 32: Direct32
> * 33: Packed64
> * 34: Packed64
> * 35: Packed64
> * 36: Packed64
> * 37: Packed64
> * 38: Packed64
> * 39: Packed64
> * 40: Packed64
> * 41: Packed64
> * 42: Packed64
> * 43: Packed64
> * 44: Packed64
> * 45: Packed64
> * 46: Packed64
> * 47: Packed64
> * 48: Packed64
> * 49: Packed64
> * 50: Packed64
> * 51: Packed64
> * 52: Packed64
> * 53: Packed64
> * 54: Direct64
> * 55: Direct64
> * 56: Direct64
> * 57: Direct64
> * 58: Direct64
> * 59: Direct64
> * 60: Direct64
> * 61: Direct64
> * 62: Direct64
> Under 32 bits per value, only 13, 17 and 22-26 bits per value would still choose the slower Packed64 implementation. Allowing a 50% overhead would prevent the packed implementation from being selected for bits per value under 32. Allowing an overhead of 32 bits per value would make sure that a Direct* implementation is always selected.
> Next steps would be to:
> * make Lucene components use this {{getMutable}} method and let users decide what trade-off better suits them,
> * write a Packed32SingleBlock implementation if necessary (I didn't do it because I have no 32-bit computer to test the performance improvements).
> I think this would allow more fine-grained control over the speed/space trade-off, what do you think?
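
The selection rule described in the issue (pick the fastest implementation whose wasted bits per value fit the accepted overhead) could be sketched roughly as below. Everything here is an assumption made for illustration: the class ImplementationChooserSketch and its choose method are hypothetical, the single-block sizes are simply read off the Packed64SingleBlock* names in the 20% list, the preference order (Direct*, then single-block, then Packed64) is a guess, and the method returns an implementation name rather than a PackedInts.Mutable instance. The comparison is inclusive so that an exact fit is still accepted with a zero overhead budget, which may differ from the patch.

```java
/**
 * Hypothetical sketch of overhead-based selection. It is not the
 * PackedInts.getMutable code from the patch and only returns a name.
 */
public final class ImplementationChooserSketch {
  // Widths of the array-backed Direct* implementations.
  private static final int[] DIRECT_SIZES = {8, 16, 32, 64};
  // Assumed single-block sizes, taken from the names in the 20% list.
  private static final int[] SINGLE_BLOCK_SIZES = {1, 2, 3, 4, 5, 6, 9, 10, 12, 21};

  public static String choose(int bitsPerValue, double acceptableOverheadPerValue) {
    if (bitsPerValue < 1 || bitsPerValue > 64) {
      throw new IllegalArgumentException("bitsPerValue must be in 1..64");
    }
    // 1. Direct arrays are assumed fastest: take the narrowest one that fits the budget.
    for (int width : DIRECT_SIZES) {
      if (width >= bitsPerValue && width - bitsPerValue <= acceptableOverheadPerValue) {
        return "Direct" + width;
      }
    }
    // 2. Otherwise try the smallest single-block layout that can hold the values.
    for (int size : SINGLE_BLOCK_SIZES) {
      if (size < bitsPerValue) {
        continue;
      }
      int valuesPerBlock = 64 / size;
      // Bits actually consumed per value, including the unused tail of each long.
      double bitsUsedPerValue = 64.0 / valuesPerBlock;
      if (bitsUsedPerValue - bitsPerValue <= acceptableOverheadPerValue) {
        return "Packed64SingleBlock" + size;
      }
    }
    // 3. Fall back to the fully packed, zero-overhead implementation.
    return "Packed64";
  }
}
```

Under these assumptions, choose(bitsPerValue, 0.2 * bitsPerValue) gives, for example, Direct8 for 7 bits, Packed64SingleBlock12 for 11 bits, Packed64 for 13 and 17 bits, and Packed64SingleBlock21 for 18 to 21 bits, in line with the list in the issue.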