Date: Thu, 3 Feb 2011 17:15:29 +0000 (UTC)
From: "Paul Elschot (JIRA)"
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Message-ID: <1670869729.7698.1296753329234.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <916915244.3851.1296599369007.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] Commented: (LUCENE-2903) Improvement of PForDelta Codec
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

[
https://issues.apache.org/jira/browse/LUCENE-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990186#comment-12990186 ]

Paul Elschot commented on LUCENE-2903:
--------------------------------------

When the IntBuffer is produced by ByteBuffer.asIntBuffer() and that ByteBuffer is produced from a byte[], this IntBuffer can be used as the target for compressing data on an int-by-int basis. After that, the byte[] can be written directly to an IndexOutput. What is it that cannot be avoided?

> Improvement of PForDelta Codec
> ------------------------------
>
>                 Key: LUCENE-2903
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2903
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: hao yan
>         Attachments: LUCENE_2903.patch, LUCENE_2903.patch
>
>
> There are 3 versions of PForDelta implementations in the Bulk Branch: FrameOfRef, PatchedFrameOfRef, and PatchedFrameOfRef2.
> FrameOfRef is a very basic one which is essentially a binary encoding (it may result in a huge index size).
> PatchedFrameOfRef is the implementation based on the original version of PForDelta in the literature.
> PatchedFrameOfRef2 is my previous implementation, which is improved this time. (The codec name is changed to NewPForDelta.)
> In particular, the changes are:
> 1. I fixed the bug in my previous version (in Lucene-1410.patch), where the old PForDelta did not support very large exceptions (since Simple16 does not support very large numbers). This has been fixed in the new LCPForDelta.
> 2. I changed the PForDeltaFixedIntBlockCodec. It is now faster than the other two PForDelta implementations in the bulk branch (FrameOfRef and PatchedFrameOfRef). The codec's name is "NewPForDelta", as you can see in the CodecProvider and PForDeltaFixedIntBlockCodec.
> 3. The performance test results are:
>     1) My "NewPForDelta" codec is faster than FrameOfRef and PatchedFrameOfRef for almost all kinds of queries, and slightly worse than BulkVInt.
>     2) My "NewPForDelta" codec can result in the smallest index size among all 4 methods (FrameOfRef, PatchedFrameOfRef, BulkVInt, and itself).
>     3) All performance test results were obtained by running with "-server" instead of "-client".

--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
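
For reference, the IntBuffer-over-byte[] arrangement described in the comment can be sketched with plain java.nio. This is only an illustration under stated assumptions: the class and method names (IntBufferSketch, packInts) are hypothetical, and the ints are stored as-is rather than compressed; a real codec would put its packed words into the IntBuffer instead.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

// Hypothetical sketch, not code from the LUCENE-2903 patch.
class IntBufferSketch {

    // Writes each value through an IntBuffer view of a byte[].
    // The view shares storage with the array, so no copy is needed afterwards.
    static byte[] packInts(int[] values) {
        byte[] bytes = new byte[values.length * 4];
        // ByteBuffer.wrap() uses big-endian byte order by default.
        IntBuffer ints = ByteBuffer.wrap(bytes).asIntBuffer();
        for (int v : values) {
            ints.put(v); // int-by-int writes land directly in 'bytes'
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] bytes = packInts(new int[] {1, 0x01020304});
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim());
        // In Lucene, the array could then be passed to
        // IndexOutput.writeBytes(bytes, 0, bytes.length); omitted here
        // to keep the sketch free of Lucene dependencies.
    }
}
```

Because the IntBuffer is only a view, the byte[] already holds the encoded data once the last int has been put, which is the point the comment makes about writing it directly to an IndexOutput.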