From: "Uwe Schindler"
Reply-To: java-dev@lucene.apache.org
Subject: RE: TrieRange
Date: Sat, 7 Feb 2009 18:26:09 +0100

Hi Yonik,

> > An optimization might be to remove the lower 0 bits from the string,
> > but it would not be needed. The strings are unique for one precision
> > (no difference between 0-bits there or not).
>
> Yes, one would certainly want to remove trailing bits that were
> insignificant.
>
> To optimize index space, one would want to "right justify" the encoded
> number for any bit range to minimize variation on the left - this
> plays into lucene's prefix compression.

I am not sure this is the right way. Lucene's prefix compression also
helps with seeking quickly to a term. If thousands of terms that vary
only in their last bits (because all the bits before them are zero) must
be scanned to reach the right one, performance would suffer. I would
pack all bits to the beginning to optimize prefix usage. The index may
get bigger, but TrieRangeFilter needs fast seeking to the right terms.

Uwe
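
To make the two layouts discussed above concrete, here is a minimal sketch. It is not Lucene or TrieRange code. It assumes a 16-bit value written as a string of '0'/'1' characters, and it assumes that "packing to the beginning" means keeping the significant bits at the start of the term with the low bits zeroed, while "right justify" means shifting the value right and zero-padding on the left. The names BITS, encode_left, encode_right and shared_prefix_len are made up for this illustration.

```python
from os.path import commonprefix

BITS = 16  # illustrative width only (assumption); real terms would be longer


def encode_left(value: int, shift: int) -> str:
    """Keep the significant bits at the start of the term; zero the low `shift` bits."""
    return format((value >> shift) << shift, f"0{BITS}b")


def encode_right(value: int, shift: int) -> str:
    """Shift the significant bits to the end of the term; pad with zeros on the left."""
    return format(value >> shift, f"0{BITS}b")


def shared_prefix_len(terms: list[str]) -> int:
    """Number of leading characters that all terms have in common."""
    return len(commonprefix(terms))


if __name__ == "__main__":
    values = [0x1234, 0x5678, 0x9ABC]
    shift = 8  # drop the lowest 8 bits at this precision
    left = [encode_left(v, shift) for v in values]
    right = [encode_right(v, shift) for v in values]
    print("left :", left, "shared prefix:", shared_prefix_len(left))
    print("right:", right, "shared prefix:", shared_prefix_len(right))
```

For these three sample values at shift 8, the left-justified terms share no leading characters, while the right-justified terms share eight leading zeros. That is the trade-off in the thread: the right-justified form gives prefix compression more to work with, whereas the left-justified form makes terms differ early. Both forms sort in the same order as the underlying values.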