lucene-dev mailing list archives

From Nicolas Lalevée <nicolas.lale...@anyware-tech.com>
Subject Re: [jira] Commented: (LUCENE-648) Allow changing of ZIP compression level for compressed fields
Date Tue, 22 Aug 2006 16:45:49 GMT
On Wednesday 16 August 2006 14:51, Grant Ingersoll wrote:
> On Aug 16, 2006, at 8:32 AM, Nicolas Lalevée wrote:
> > Hi,
> >
> > In the issue, you wrote that "This way the indexing level just stores
> > opaque binary fields, and then Document handles compress/uncompressing
> > as needed."
> >
> > I have looked into the Lucene code, and it seems to me that it is Field
> > that should take care of compressing/uncompressing, and that
> > FieldsReader and FieldsWriter should only see binary data.
> > Or do you mean that compression should be completely external to Lucene?
>
> I believe the consensus is it should be done externally.
>
> > In fact, at the end of the other thread, "Flexible index format /
> > Payloads Cont'd", I was discussing how to customize the way data is
> > stored. So I have looked deeper into the code and I think I have found
> > a way to do so. And since you can change the way it is stored, you can
> > also define the compression level, or handle your own compression
> > algorithm. I will show you a patch, but I have modified so much code
> > during my several attempts that I first need to remove the unnecessary
> > changes. To describe it briefly:
> > - I have provided a way to supply your own FieldsReader and
> > FieldsWriter (via a factory). To create an IndexReader, you have to
> > provide that factory; the current API just uses a default factory.
> > - I have moved the code of FieldsReader and FieldsWriter that does the
> > field data reading and writing to a new class, FieldData. The
> > FieldsReader instantiates a FieldData, calls fielddata.read(input), and
> > creates a new Field(fielddata,...). The FieldsWriter calls
> > field.getFieldData().write(output);
> > - so by extending FieldsReader, you can provide your own implementation
> > of FieldData, and so implement data storage and reading however you
> > want.
> > The tests pass, but I have an issue with that design: one important
> > thing, I think, is that in the current design we can read an index in
> > an old format and just do a writer.addIndexes() into a new format. With
> > the new design you cannot, because the writer will use the
> > FieldData.write provided by the reader.
> > To be continued...
>
> I would love to see this patch.  I think one could make a pretty good
> argument for this kind of implementation being done "cleanly", that
> is, it shouldn't necessarily involve reworking the internals, but
> instead could represent the foundation for a new, codec based
> indexing mechanism (with an implementation that can read/write the
> existing file format.)

Here it is: https://issues.apache.org/jira/browse/LUCENE-662

Enjoy!
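
For readers of the archive, the design described in the quoted message can be sketched roughly as below. This is not the code from the LUCENE-662 patch and it does not use any Lucene class: FieldData here is a standalone stand-in, and FieldDataFactory, RawFieldData, DeflateFieldData and FieldDataDemo are hypothetical names invented for the illustration. The factory is also simplified to produce FieldData objects directly instead of FieldsReader/FieldsWriter instances. The point is only that once a FieldData object owns read() and write(), the compression level becomes a detail of one implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical stand-in: owns the stored encoding of one field value. */
abstract class FieldData {
    protected byte[] value = new byte[0];
    byte[] getValue() { return value; }
    void setValue(byte[] v) { value = v; }
    abstract void write(DataOutputStream out) throws IOException;
    abstract void read(DataInputStream in) throws IOException;
}

/** Stores the value as length-prefixed raw bytes. */
class RawFieldData extends FieldData {
    @Override void write(DataOutputStream out) throws IOException {
        out.writeInt(value.length);
        out.write(value);
    }
    @Override void read(DataInputStream in) throws IOException {
        value = new byte[in.readInt()];
        in.readFully(value);
    }
}

/** Stores the value deflated at a caller-chosen compression level. */
class DeflateFieldData extends FieldData {
    private final int level;
    DeflateFieldData(int level) { this.level = level; }

    @Override void write(DataOutputStream out) throws IOException {
        Deflater deflater = new Deflater(level);
        deflater.setInput(value);
        deflater.finish();
        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!deflater.finished()) {
            packed.write(chunk, 0, deflater.deflate(chunk));
        }
        deflater.end();
        out.writeInt(value.length);   // uncompressed length
        out.writeInt(packed.size());  // compressed length
        packed.writeTo(out);
    }

    @Override void read(DataInputStream in) throws IOException {
        int rawLength = in.readInt();
        byte[] packed = new byte[in.readInt()];
        in.readFully(packed);
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        byte[] result = new byte[rawLength];
        try {
            int offset = 0;
            while (offset < rawLength) {
                int n = inflater.inflate(result, offset, rawLength - offset);
                if (n == 0) throw new IOException("corrupt or truncated field data");
                offset += n;
            }
        } catch (DataFormatException e) {
            throw new IOException(e);
        } finally {
            inflater.end();
        }
        value = result;
    }
}

/** Hypothetical factory: chooses which FieldData implementation is used. */
interface FieldDataFactory {
    FieldData newFieldData();
}

final class FieldDataDemo {
    /** Writes raw through one FieldData and reads it back through another. */
    static byte[] roundTrip(FieldDataFactory factory, byte[] raw) {
        try {
            FieldData toWrite = factory.newFieldData();
            toWrite.setValue(raw);
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            toWrite.write(new DataOutputStream(bytes));
            FieldData toRead = factory.newFieldData();
            toRead.read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return toRead.getValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, FieldDataDemo.roundTrip(() -> new DeflateFieldData(Deflater.BEST_SPEED), bytes) returns the original bytes, and so does the same call with RawFieldData::new. The sketch also makes the addIndexes() concern from the quoted message visible: bytes written by one FieldData implementation can only be decoded by the same implementation, so a writer that reuses the reader's FieldData.write keeps the old encoding.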

Nicolas

-- 
Nicolas LALEVÉE
Solutions & Technologies
ANYWARE TECHNOLOGIES
Tel : +33 (0)5 61 00 52 90
Fax : +33 (0)5 61 00 51 46
http://www.anyware-tech.com
