> I'm not sure if it would help my particular situation, but is there any way
> to provide the option of specifying the compression level? The level used
> by Lucene (level 9) is the maximum possible compression level. Ideally I
> would like to be able to alter the compression level on the basis of the
> field size. This way I can smooth out the compression times across the
> various document sizes. I am more interested in consistent time than I am
> in consistent compression.
I agree, we should make the compression level configurable. It's
disturbing that it takes minutes to compress a 4.5 MB document! I'll
open a Jira issue for this.
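
In the meantime, here is a minimal sketch of the idea of picking the level from the field size. It uses java.util.zip.Deflater directly and is not Lucene's API. The class name, the method names and the size thresholds are all made up for illustration; you would have to tune the cutoffs against your own documents.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

class SizeBasedCompression {
    // Hypothetical thresholds: small fields get the best ratio,
    // large fields trade ratio for speed.
    static int levelFor(int length) {
        if (length < 64 * 1024) {
            return Deflater.BEST_COMPRESSION;    // level 9
        }
        if (length < 1024 * 1024) {
            return Deflater.DEFAULT_COMPRESSION; // zlib's default level
        }
        return Deflater.BEST_SPEED;              // level 1
    }

    // Compress the whole input at the given level and return the bytes.
    static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib memory
        }
    }

    public static void main(String[] args) {
        byte[] field = new byte[4_500_000]; // stand-in for a large field value
        int level = levelFor(field.length);
        byte[] packed = compress(field, level);
        System.out.println("level=" + level + " compressed bytes=" + packed.length);
    }
}
```

You would then store the compressed bytes yourself as a binary field rather than relying on the built-in compression.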
> Or... could there be some other reason my document takes this long to index
> (and holds up all other threads)?
You might want to try running the command-line "zip" utility with best
compression (zip -9) to see how long it takes. Lucene is just using the
java.util.zip.* APIs, which implement the same DEFLATE compression as "zip".
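
If you'd rather measure from Java, something along these lines should show how the time varies with the level. This is only a rough sketch: the class and method names are made up, and the generated text is a stand-in, so point it at your real 4.5 MB document to get numbers that mean anything.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.Deflater;

class DeflateTiming {
    // Compress the input at the given level and return the compressed size.
    static long compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buf = new byte[64 * 1024];
            long total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buf);
            }
            return total;
        } finally {
            deflater.end();
        }
    }

    // Build roughly text-like data of exactly `size` bytes (placeholder input).
    static byte[] sampleText(int size) {
        String[] words = {"lucene", "index", "field", "document", "compress", "thread"};
        Random rnd = new Random(42);
        StringBuilder sb = new StringBuilder(size + 16);
        while (sb.length() < size) {
            sb.append(words[rnd.nextInt(words.length)]).append(' ');
        }
        return Arrays.copyOf(sb.toString().getBytes(StandardCharsets.US_ASCII), size);
    }

    public static void main(String[] args) {
        byte[] data = sampleText(4_500_000);
        int[] levels = {Deflater.BEST_SPEED, 6, Deflater.BEST_COMPRESSION};
        for (int level : levels) {
            long start = System.nanoTime();
            long size = compressedSize(data, level);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("level %d: %d bytes in %d ms%n", level, size, millis);
        }
    }
}
```

If level 9 takes minutes here as well, the time is going into the compression itself rather than into anything Lucene does around it.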
One correction: this compression should not block other threads. It
runs outside of "synchronized" code, so if you have other threads
adding documents, they can do so fully in parallel with the one thread
that's doing the slow compression.
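
To illustrate the general pattern (this is not Lucene's actual code, and every name here is invented), the expensive compression happens with no lock held, and only the short step of recording the result is synchronized:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.zip.Deflater;

class ParallelAddSketch {
    private final List<byte[]> stored = new ArrayList<>();

    void add(byte[] raw) {
        byte[] packed = compress(raw); // slow part, runs without any lock
        synchronized (stored) {        // short critical section only
            stored.add(packed);
        }
    }

    int size() {
        synchronized (stored) {
            return stored.size();
        }
    }

    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    // Run several threads that each add documents; return how many were stored.
    static int addInParallel(int threads, int docsPerThread) {
        ParallelAddSketch sink = new ParallelAddSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                byte[] doc = new byte[256 * 1024]; // placeholder document
                for (int i = 0; i < docsPerThread; i++) {
                    sink.add(doc);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println(addInParallel(4, 8) + " documents stored");
    }
}
```

With this shape, one thread stuck compressing a huge document only delays itself; the others wait on each other just for the brief list update.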
Mike