Message-ID: <677237817.1134578986922.JavaMail.jira@ajax.apache.org>
Date: Wed, 14 Dec 2005 17:49:46 +0100 (CET)
From: "Chuck Williams (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Updated: (LUCENE-362) [PATCH] Extension to binary Fields that allows fixed byte buffer

[ http://issues.apache.org/jira/browse/LUCENE-362?page=all ]

Chuck Williams updated LUCENE-362:
----------------------------------
    Attachment: FixedBufferBinaryFields.patch

(Thanks Eric for correcting my mistaken posting to the old issue tracking system)

Better late than never, I hope.
FixedBufferBinaryFields.patch is revised to apply against the latest source and now includes a test case (an extension of TestBinaryDocument). This is my last remaining local patch to Lucene, so it would be great if it gets committed.

The value, again, is to eliminate copying of large binary values that are to be stored in the Lucene index. For a compressed document, for example, if the documents are read and compressed externally into a fixed buffer and the compressed buffer is passed in, all copying can be eliminated.

Chuck

> [PATCH] Extension to binary Fields that allows fixed byte buffer
> ----------------------------------------------------------------
>
>          Key: LUCENE-362
>          URL: http://issues.apache.org/jira/browse/LUCENE-362
>      Project: Lucene - Java
>         Type: Bug
>   Components: Index
>     Versions: CVS Nightly - Specify date in submission
>  Environment: Operating System: All
>               Platform: All
>     Reporter: Chuck Williams
>     Assignee: Lucene Developers
>  Attachments: Field-extension.patch, Field-extension.patch, FieldsWriter-extension.patch, FixedBufferBinaryFields.patch
>
> This is a very simple patch that supports storing binary values in the index
> more efficiently. A new Field constructor accepts a length argument, allowing a
> fixed byte[] to be reused across multiple calls with arguments of different
> sizes. A companion change to FieldsWriter uses this length when storing and/or
> compressing the field.
> There is one remaining case in Document. Intentionally, no direct accessor to
> the length of a binary field is provided from Document, only from Field. This
> is because Fields created by FieldReader will never have a specified length, and
> this is the usual case for Fields read from Document. It seems less confusing for
> most users.
> I don't believe any upward incompatibility is introduced here (e.g., from the
> possibility of getting a larger byte[] than actually holds the value from
> Document), since no such byte[] values are possible without this patch anyway.
> The compression case is still inefficient (much copying), but it is hard to see
> how Lucene can do much better. However, the application can do the
> compression externally and pass in the reused compression-output buffer as a
> binary value (which is what I'm doing). This represents a substantial
> allocation savings for storing large document bodies (compressed) into the
> Lucene index.
> Two patch files are attached, both created by svn on 3/17/05.
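
For readers who want to see the buffer-reuse idea in isolation, here is a minimal Java sketch. It is not the code from the patch and does not use Lucene's API: BinaryValue, store, compressInto and decompress are hypothetical names invented for this illustration, and a ByteArrayOutputStream stands in for the index output. It only assumes what the description above states, namely that a binary value can be handed over as a byte[] plus a length, so that one fixed buffer serves many values of different sizes. Compression uses the standard java.util.zip classes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class FixedBufferSketch {

    /** Hypothetical stand-in for a binary field: a shared buffer plus the number of valid bytes. */
    static final class BinaryValue {
        final byte[] buffer;
        final int length;

        BinaryValue(byte[] buffer, int length) {
            if (length < 0 || length > buffer.length) {
                throw new IllegalArgumentException("length out of range: " + length);
            }
            this.buffer = buffer;
            this.length = length;
        }
    }

    /** Hypothetical stand-in for the writer: stores only the first `length` bytes of the buffer. */
    static void store(BinaryValue value, ByteArrayOutputStream sink) {
        sink.write(value.buffer, 0, value.length);
    }

    /** Compresses `input` into the caller's reusable buffer and returns the compressed length. */
    static int compressInto(byte[] input, byte[] out) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            int n = deflater.deflate(out);
            if (!deflater.finished()) {
                throw new IllegalStateException("output buffer too small");
            }
            return n;
        } finally {
            deflater.end();
        }
    }

    /** Inflates `length` bytes starting at `offset`; used here only to check the round trip. */
    static byte[] decompress(byte[] stored, int offset, int length, int originalSize) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(stored, offset, length);
            byte[] result = new byte[originalSize];
            int total = 0;
            while (total < originalSize) {
                int n = inflater.inflate(result, total, originalSize - total);
                if (n == 0) {
                    throw new IllegalStateException("compressed data ended early");
                }
                total += n;
            }
            return result;
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] shared = new byte[8192]; // allocated once, reused for every document
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        String[] docs = { "a short body", "a longer body, longer body, longer body, longer body" };
        for (String doc : docs) {
            byte[] raw = doc.getBytes(StandardCharsets.UTF_8);
            int start = sink.size();
            int length = compressInto(raw, shared);
            store(new BinaryValue(shared, length), sink); // no exact-size copy of the value is made
            byte[] back = decompress(sink.toByteArray(), start, length, raw.length);
            System.out.println(length + " bytes stored, round trip ok: "
                    + doc.equals(new String(back, StandardCharsets.UTF_8)));
        }
    }
}
```

The point of the design is that the caller, not the library, owns the buffer: without a length argument, each value would first have to be copied into a byte[] of exactly the right size before being handed over. The trade-off in this sketch is that the buffer must not be reused until the value has actually been written.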