lucene-java-user mailing list archives

From Ivan Vasilev <ivasi...@sirma.bg>
Subject Re: Compressing field content with Lucene 3.0
Date Tue, 29 Dec 2009 11:19:26 GMT
Thanks, Uwe,

That is fine :)

Cheers,
Ivan

Uwe Schindler wrote:
> How you handle it is still up to you. In my projects I normally store only
> string fields. If I compress them, they are binary. So I use an approach
> similar to yours: compress only the large fields and store the others as
> string fields, as before.
>
> When I retrieve the contents of the documents I call
> Document.getField("name") and then use Fieldable.isBinary() to detect whether
> the value was stored as binary (and therefore compressed) or as a string.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>   
>> -----Original Message-----
>> From: Ivan Vasilev [mailto:ivasilev@sirma.bg]
>> Sent: Tuesday, December 29, 2009 11:50 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: Compressing field content with Lucene 3.0
>>
>> Thanks for your answer, Uwe.
>>
>> It is good news that data compressed with Field.Store.COMPRESS in 2.4
>> will be retrieved properly by 3.0.
>>
>> From your answer I understand that 3.0 has no API for compressing some
>> values of a field while leaving other values of the same field
>> uncompressed. So we have to choose either to compress all of the values
>> of that field (and keep them as binary values) or not to compress any of
>> the values for that field?
>>
>> Cheers,
>> Ivan
>>
>>
>> Uwe Schindler wrote:
>>     
>>> If it is a 2.4 index, you can read it without any problems. The only
>>> change is that it is no longer possible to add fields with
>>> Field.Store.COMPRESS.
>>>
>>> If you want to add a field with compression, you have to compress the
>>> value yourself, e.g. to a byte[]. You can then add this byte[] as a binary
>>> stored field. How you deal with this is your responsibility.
>>>
>>>> -----Original Message-----
>>>> From: Ivan Vasilev [mailto:ivasilev@sirma.bg]
>>>> Sent: Monday, December 28, 2009 7:13 PM
>>>> To: LUCENE MAIL LIST
>>>> Subject: Compressing field content with Lucene 3.0
>>>>
>>>> Hi Guys,
>>>>
>>>> Could you advise me on how to handle, with Lucene 3.0, indexes created
>>>> with 2.4 that contain compressed data?
>>>>
>>>> Our case is the following. We have code like this:
>>>>
>>>> // Lucene 2.4 API: store long values compressed, short ones as-is.
>>>> Field.Store fieldStored = storedFieldsSet.contains(fieldName)
>>>>     ? (fieldValue.length() >= COMPRESS_THRESHOLD
>>>>         ? Field.Store.COMPRESS : Field.Store.YES)
>>>>     : Field.Store.NO;
>>>> Field.Index fieldIndexed = indexedFieldsSet.contains(fieldName)
>>>>     ? Field.Index.ANALYZED : Field.Index.NO;
>>>> doc.add(new Field(fieldName, fieldValue, fieldStored, fieldIndexed));
>>>>
>>>> So for one and the same field, some values in the indexes are compressed
>>>> and others are not.
>>>> Reading the Lucene documentation, I understand that all the values of a
>>>> field should be either compressed or not compressed, OR we should somehow
>>>> keep track of which values of which fields are compressed. If this is so,
>>>> it would be very difficult to migrate our system to Lucene 3.0 without
>>>> re-indexing.
>>>> If I am wrong, could you please tell me:
>>>> 1. How do we read these indexes (created with the code above with Lucene
>>>> 2.4) with Lucene 3.0? I mean, when fetching field values, how do we know
>>>> when to use CompressionTools.decompressString(..) and when to skip it?
>>>> 2. Is it still possible in Lucene 3.0 to compress the values of a certain
>>>> field for some docs and leave the values of the same field uncompressed
>>>> for other docs?
>>>>
>>>> Cheers,
>>>> Ivan
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>         
>>>
>   
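
For reference, the approach Uwe describes above (compress large values
yourself, store them as binary, and check for binary on retrieval) can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not code from the thread: StoredValue is a hypothetical stand-in
for a Lucene stored field, the threshold value is made up, and compression is
done with java.util.zip directly so the example runs without Lucene on the
classpath. In a real Lucene 3.0 application one would presumably use
CompressionTools.compressString(..) and CompressionTools.decompressString(..)
and put the resulting byte[] into a binary stored Field.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Sketch only: StoredValue is a hypothetical stand-in for a Lucene stored field. */
final class FieldCompressionSketch {

    /** Assumed value; the thread does not say what COMPRESS_THRESHOLD really is. */
    static final int COMPRESS_THRESHOLD = 1024;

    /** Hypothetical holder: either a plain string or a binary (compressed) value. */
    static final class StoredValue {
        final String stringValue; // set when the value is stored as a string
        final byte[] binaryValue; // set when the value is stored as binary

        StoredValue(String stringValue, byte[] binaryValue) {
            this.stringValue = stringValue;
            this.binaryValue = binaryValue;
        }

        /** Mirrors the idea of Fieldable.isBinary(): binary means compressed here. */
        boolean isBinary() {
            return binaryValue != null;
        }
    }

    /** Compresses the UTF-8 bytes of the text with zlib. */
    static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    /** Reverses compress(); wraps format errors in an unchecked exception. */
    static String decompress(byte[] data) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && !inflater.finished()) {
                    throw new IllegalStateException("truncated compressed data");
                }
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalStateException("invalid compressed data", e);
        } finally {
            inflater.end();
        }
    }

    /** Write side: compress only values at or above the threshold. */
    static StoredValue toStored(String value) {
        return value.length() >= COMPRESS_THRESHOLD
                ? new StoredValue(null, compress(value))
                : new StoredValue(value, null);
    }

    /** Read side: decompress only when the value was stored as binary. */
    static String read(StoredValue stored) {
        return stored.isBinary() ? decompress(stored.binaryValue) : stored.stringValue;
    }
}
```

Note that the isBinary() check only tells you a value is compressed if every
binary value of that field was compressed by your own code, which is the
convention Uwe describes for his projects.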



