Message-ID: <7176367.1175455592893.JavaMail.jira@brutus>
Date: Sun, 1 Apr 2007 12:26:32 -0700 (PDT)
From: Nicolas Lalevée (JIRA)
To: java-dev@lucene.apache.org
Subject: [jira] Updated: (LUCENE-662) Extendable writer and reader of field data
In-Reply-To: <19231794.1156264813957.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/LUCENE-662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Lalevée updated LUCENE-662:
-----------------------------------

    Attachment:
        indexFormat.patch
        indexFormat-only.patch

Synchronized with the trunk, and therefore with the payload feature. This allowed me to consolidate the payload writing, which is done in two places today, into a single class: it is now in the DefaultPostingWriter class.

Since my last update, nothing on the TODO list has been fixed. Furthermore, there is a regression in the new patch: ensureOpen() is not correctly handled for lazily loaded fields, so a test fails. This is because the FieldsReader no longer handles it in my patch. As the data structure can be customized, lazy loading is delegated to the FieldData created by the FieldsReader, so the two instances have to communicate about the closing of the streams. That is a new item on the TODO list.

As discussed on java-dev, here is a light patch with only the index format handling, without the possibility to redefine how data and postings are stored and retrieved.

> Extendable writer and reader of field data
> ------------------------------------------
>
>                 Key: LUCENE-662
>                 URL: https://issues.apache.org/jira/browse/LUCENE-662
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>            Reporter: Nicolas Lalevée
>            Priority: Minor
>         Attachments: entrytable.patch, generic-fieldIO-2.patch, generic-fieldIO-3.patch, generic-fieldIO-4.patch, generic-fieldIO-5.patch, generic-fieldIO.patch, indexFormat-only.patch, indexFormat.patch, indexFormat.patch, indexFormat.patch
>
>
> As discussed on the dev mailing list, I have modified Lucene so that you can define how the data of a field is written to and read from the index.
> Basically, I have introduced the notion of IndexFormat. It is in fact a factory of FieldsWriter and FieldsReader. So the IndexReader, the IndexWriter and the SegmentMerger use this factory instead of doing a "new FieldsReader/Writer()".
> I have also introduced the notion of FieldData. It handles all the data of a field, as well as writing it to and reading it from a stream.
> I did it this way because, in the current design of Lucene, Fieldable is an interface, so methods with protected or package visibility cannot be defined.
> A FieldsWriter just writes data into a stream via the FieldData of the field.
> A FieldsReader instantiates a FieldData depending on the field name. Then it uses the field data to read the stream. Finally, it instantiates a Field with the field data.
> About compatibility, I think it is preserved, as I have written a DefaultIndexFormat that provides a DefaultFieldsWriter and a DefaultFieldsReader. These implementations do exactly the job that is done today.
> To achieve this modification, some classes and methods had to be changed from private and/or final to public or protected.
> About the lazy fields, I have implemented them in a more general way in the abstract class FieldData, so it will be totally transparent for Lucene users who extend FieldData. The stream is kept in the FieldData and used as soon as stringValue (or another accessor) is called. Implementing it this way allowed me to handle the recently introduced LOAD_FOR_MERGE: it is just a lazy field data, and when read() is called on this lazy field data, the saved input stream is copied directly to the output stream.
> I have one last issue with this patch. The current design allows reading an index in an old format and simply doing a writer.addIndexes() into a new format. With the new design, you cannot, because the writer will use the FieldData.write provided by the reader.
> enjoy !

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
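
For readers who do not have the patch at hand, the factory arrangement described in the quoted issue can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the code of the attached patches: every class and method name below is hypothetical (hence the "Sketch" suffixes), plain java.io streams stand in for Lucene's own stream classes, and letting the format choose the FieldData for a field name is just one possible reading of "instantiates a FieldData depending on the field name".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// All names here are hypothetical; this is NOT the API of the LUCENE-662 patches.

/** Small driver: writes one field through the format, then reads it back. */
class IndexFormatDemo {
    static String roundTrip(IndexFormatSketch format, String name, String value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                format.newFieldsWriter(out).addField(name, new StringFieldData(value));
            }
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return format.newFieldsReader(in).readField().stringValue();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new DefaultIndexFormatSketch(), "title", "hello")); // prints "hello"
    }
}

/** Holds the data of one field and knows how to write and read itself. */
abstract class FieldDataSketch {
    abstract void write(DataOutputStream out) throws IOException;
    abstract void read(DataInputStream in) throws IOException;
    abstract String stringValue();
}

/** Simplest possible field data: a single string value. */
class StringFieldData extends FieldDataSketch {
    private String value;
    StringFieldData(String value) { this.value = value; }
    @Override void write(DataOutputStream out) throws IOException { out.writeUTF(value); }
    @Override void read(DataInputStream in) throws IOException { value = in.readUTF(); }
    @Override String stringValue() { return value; }
}

/** The writer only records the field name; the FieldData writes the payload. */
class FieldsWriterSketch {
    private final DataOutputStream out;
    FieldsWriterSketch(DataOutputStream out) { this.out = out; }
    void addField(String name, FieldDataSketch data) throws IOException {
        out.writeUTF(name);
        data.write(out);
    }
}

/** The reader picks a FieldData for the field name, then lets it read the stream. */
class FieldsReaderSketch {
    private final DataInputStream in;
    private final IndexFormatSketch format;
    FieldsReaderSketch(DataInputStream in, IndexFormatSketch format) {
        this.in = in;
        this.format = format;
    }
    FieldDataSketch readField() throws IOException {
        String name = in.readUTF();
        FieldDataSketch data = format.newFieldData(name); // assumption: the format decides
        data.read(in);
        return data;
    }
}

/** The factory that callers use instead of "new FieldsReader/Writer()". */
interface IndexFormatSketch {
    FieldsWriterSketch newFieldsWriter(DataOutputStream out);
    FieldsReaderSketch newFieldsReader(DataInputStream in);
    FieldDataSketch newFieldData(String fieldName);
}

/** Default format: every field is stored as a plain string. */
class DefaultIndexFormatSketch implements IndexFormatSketch {
    @Override public FieldsWriterSketch newFieldsWriter(DataOutputStream out) {
        return new FieldsWriterSketch(out);
    }
    @Override public FieldsReaderSketch newFieldsReader(DataInputStream in) {
        return new FieldsReaderSketch(in, this);
    }
    @Override public FieldDataSketch newFieldData(String fieldName) {
        return new StringFieldData(null); // value is filled in by read()
    }
}
```

In a sketch like this, a custom format would only need to return a different FieldData subclass from newFieldData; the writer and reader stay unchanged, which is the extension point the issue description is after.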