lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Field compression too slow
Date Thu, 10 Aug 2006 13:25:03 GMT

> I'm not sure if it would help my particular situation, but is there any way
> to provide the option of specifying the compression level?  The level used
> by Lucene (level 9) is the maximum possible compression level.  Ideally I
> would like to be able to alter the compression level on the basis of the
> field size.  This way I can smooth out the compression times across the
> various document sizes.  I am more interested in consistent time than I am
> consistent compression.

I agree, we should make the compression level configurable.  It's 
disturbing that it takes minutes to compress a 4.5 MB document!  I'll 
open a Jira issue for this.
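
To make the idea concrete, here is a rough sketch of picking a level from the field size, which is what the quoted request describes. This is not an existing Lucene API: the class name, the levelForSize helper, and the size thresholds are all made up for illustration, and the right cutoffs would have to come from measuring your own documents.

```java
import java.util.zip.Deflater;

// Hypothetical helper, not part of Lucene: maps a field's size in bytes
// to a java.util.zip.Deflater compression level.
class SizeBasedLevel {

    // Assumed thresholds, chosen only for illustration.
    static final int SMALL_FIELD_BYTES = 64 * 1024;
    static final int LARGE_FIELD_BYTES = 1024 * 1024;

    static int levelForSize(int numBytes) {
        if (numBytes < SMALL_FIELD_BYTES) {
            // Small fields: maximum compression (level 9) is cheap enough.
            return Deflater.BEST_COMPRESSION;
        }
        if (numBytes < LARGE_FIELD_BYTES) {
            // Mid-sized fields: a middle level (6 is zlib's usual default).
            return 6;
        }
        // Large fields: favor speed (level 1) to keep indexing time bounded.
        return Deflater.BEST_SPEED;
    }

    public static void main(String[] args) {
        int[] sizes = {10 * 1024, 500 * 1024, 4500 * 1024};
        for (int size : sizes) {
            System.out.println(size + " bytes -> level " + levelForSize(size));
        }
    }
}
```

The returned value can be passed straight to new Deflater(level), since the constants above are ordinary levels in the 1 to 9 range.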

> Or... could there be some other reason my document takes this long to index
> (and holds up all other threads)?

You might want to try running the command-line "zip" utility on the same 
document, specifying best compression (zip -9), to see how long it takes.  
Lucene just uses the java.util.zip.* APIs, which implement the same 
DEFLATE compression as "zip".
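
If you would rather time it from Java directly, something like the sketch below should work. It calls java.util.zip.Deflater at the fastest and the best level and prints the elapsed time and output size for each. The class name, the helper methods, and the synthetic 4.5 MB input are assumptions for illustration only; substitute the bytes of your real document to get a meaningful number.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

// Illustrative timing harness, not Lucene code.
class DeflateTiming {

    // Compresses the input with java.util.zip.Deflater at the given level.
    static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    // Builds placeholder text from a small vocabulary (stand-in for a real document).
    static byte[] syntheticText(int numBytes, long seed) {
        String[] words = {"lucene", "index", "field", "document",
                          "compress", "thread", "search", "token"};
        Random random = new Random(seed);
        StringBuilder sb = new StringBuilder(numBytes + 16);
        while (sb.length() < numBytes) {
            sb.append(words[random.nextInt(words.length)]).append(' ');
        }
        return sb.substring(0, numBytes).getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = syntheticText(4_500_000, 42L);
        int[] levels = {Deflater.BEST_SPEED, Deflater.BEST_COMPRESSION};
        for (int level : levels) {
            long start = System.nanoTime();
            byte[] compressed = compress(data, level);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println("level " + level + ": " + compressed.length
                    + " bytes in " + millis + " ms");
        }
    }
}
```

If level 9 on your real document takes seconds here rather than minutes, the time is probably going somewhere other than compression.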

One correction: this compression should not block other threads.  It 
runs outside of "synchronized" code, so if you have other threads 
adding documents, they can do so fully in parallel with the one thread 
that's doing the slow compression.
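
As a simplified picture of that pattern (this is not Lucene's actual code; the class and method names are invented for illustration), the slow compression happens with no lock held, and only the short step that appends the result to shared state is synchronized:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.zip.Deflater;

// Illustrative sketch only: compress outside the lock, then take the lock briefly.
class ParallelCompressSketch {

    private final List<byte[]> stored = new ArrayList<>();

    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                out.write(buffer, 0, deflater.deflate(buffer));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    void addDocument(byte[] field) {
        // Slow part: no lock is held, so other threads keep running.
        byte[] compressed = compress(field);
        // Short critical section: only the shared list update is serialized.
        synchronized (stored) {
            stored.add(compressed);
        }
    }

    int size() {
        synchronized (stored) {
            return stored.size();
        }
    }

    // Adds numDocs placeholder documents from a pool of worker threads
    // and returns how many were stored.
    static int run(int numDocs, int numThreads) {
        ParallelCompressSketch sketch = new ParallelCompressSketch();
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        for (int i = 0; i < numDocs; i++) {
            final byte[] field = new byte[(i + 1) * 1024];
            pool.submit(() -> sketch.addDocument(field));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return sketch.size();
    }

    public static void main(String[] args) {
        System.out.println("stored " + run(8, 4) + " documents");
    }
}
```

With this shape, one thread stuck on a large field delays only its own document; the others wait on it only for the brief list update.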

Mike


