It's generally considered best practice to compress things first in
your app and then add them as a binary field. That being said, I
don't see why that would blow up on its own. Have you tried
compressing it outside of Lucene to see what happens? If you can
reproduce it as a test case for Lucene, that would be great.
From FieldsWriter, Lucene's compression code looks like:
private final byte[] compress(byte[] input) {
  // Create the compressor with highest level of compression
  Deflater compressor = new Deflater();
  compressor.setLevel(Deflater.BEST_COMPRESSION);

  // Give the compressor the data to compress
  compressor.setInput(input);
  compressor.finish();

  /*
   * Create an expandable byte array to hold the compressed data.
   * You cannot use an array that's the same size as the original because
   * there is no guarantee that the compressed data will be smaller than
   * the uncompressed data.
   */
  ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length);

  // Compress the data
  byte[] buf = new byte[1024];
  while (!compressor.finished()) {
    int count = compressor.deflate(buf);
    bos.write(buf, 0, count);
  }

  compressor.end();

  // Get the compressed data
  return bos.toByteArray();
}
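There is an interesting comment in that code about how the compressed
data won't necessarily be smaller than the input. A Deflater created
this way wraps its output in a zlib header and checksum, so a short
value can come out larger than it went in. Maybe you have entered the
compression twilight zone.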
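
If you want to check this outside of Lucene, something like the following
might do as a starting point. It is only a sketch: the class name
CompressionCheck and the decompress helper are mine, not anything from
Lucene, and the sample record is re-joined from your wrapped message, so
the whitespace in it is a guess. It uses nothing but java.util.zip and
prints the raw and compressed sizes so you can compare them.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical stand-alone check; not part of Lucene.
public class CompressionCheck {

    // Same idea as the FieldsWriter snippet: zlib-wrapped DEFLATE at BEST_COMPRESSION.
    public static byte[] compress(byte[] input) {
        Deflater compressor = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            compressor.setInput(input);
            compressor.finish();
            ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length);
            byte[] buf = new byte[1024];
            while (!compressor.finished()) {
                int count = compressor.deflate(buf);
                bos.write(buf, 0, count);
            }
            return bos.toByteArray();
        } finally {
            compressor.end(); // release native memory even if something throws
        }
    }

    // Inverse of compress(), used here only to confirm the round trip.
    public static byte[] decompress(byte[] input) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(input);
            ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length * 2);
            byte[] buf = new byte[1024];
            while (!inflater.finished()) {
                int count = inflater.inflate(buf);
                if (count == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or dictionary-based input");
                }
                bos.write(buf, 0, count);
            }
            return bos.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid zlib data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // Approximation of the record from the original question.
        String record = "II0264.D05|00022745|ABCDE|03/01/2008 00:23:12|00035|"
                + "9840836588| 129382152520| 04F4243B600408|04F4243B600408|"
                + "|11919898456123|354943011025810L| \"CPTBS2I\"| \"ABCD3E\"|11|"
                + "1234510003243219I|";
        byte[] raw = record.getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);

        System.out.println("raw bytes:        " + raw.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok:    " + Arrays.equals(raw, decompress(packed)));
    }
}
```

If the compressed size is not smaller for a single record, that would
point at the data rather than at Lucene. In that case it may be worth
storing the field uncompressed, or compressing several records together
in your app before adding them as a binary field.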
HTH
-Grant
On Apr 2, 2008, at 12:51 AM, Sebastin wrote:
>
> Hi All,
> is there any possibility to create compression store for the
> following types of string in lucene index store?
>
>
> String str = "II0264.D05|00022745|ABCDE|03/01/2008 00:23:12|00035|
> 9840836588| 129382152520| 04F4243B600408|04F4243B600408|
> |11919898456123|354943011025810L| "CPTBS2I"| "ABCD3E"|11|
> 1234510003243219I|"
>
>
> I try to store these fields as Field.Store.COMPRESSION but it
> exceeds the
> original size of the file?
>
>
> --
> View this message in context: http://www.nabble.com/Lucene-Compression-tp16442112p16442112.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
--------------------------
Grant Ingersoll
http://www.lucenebootcamp.com
Next Training: April 7, 2008 at ApacheCon Europe in Amsterdam
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ