jackrabbit-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: Index Large Repository
Date Sat, 19 Jun 2010 11:31:40 GMT

On Fri, Jun 18, 2010 at 3:09 PM, HOCHMUTH, ERICH [AG/1000]
<erich.hochmuth@monsanto.com> wrote:
> Is there anything that I can do from a configuration perspective that can
> speed this indexing up?

Typically most of the indexing time is spent extracting the full-text
content of binaries, especially when they're in complex document
formats like PDF or MS Word. If you don't need full-text search over
such files, you can disable text extraction by modifying the
textFilterClasses parameter of the search index configuration or by
removing the PDFBox and POI libraries from the classpath.

> Can I reduce the number of historical versions that Jackrabbit keeps of each
> item in the repository?

Yes, see the VersionHistory.removeVersion() method [1].

[1] http://www.day.com/maven/javax.jcr/javadocs/jcr-2.0/javax/jcr/version/VersionHistory.html#removeVersion(java.lang.String)
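
As a rough illustration of how removeVersion() might be used to trim a
history, the sketch below picks which version names to drop while keeping
the newest few. It is plain Java with no JCR dependency; the class name
VersionPruning, the method versionsToRemove and the "keep" count are
hypothetical names chosen for this example. It assumes the names arrive
oldest first (as VersionHistory.getAllVersions() returns them for a linear
history) and that the root version, jcr:rootVersion, must be skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Jackrabbit or the JCR API.
public class VersionPruning {

    // Name of the root version that every JCR version history contains.
    static final String ROOT_VERSION = "jcr:rootVersion";

    // Returns the names of the versions to remove so that at most `keep`
    // of the newest non-root versions remain. Input must be oldest first.
    static List<String> versionsToRemove(List<String> namesOldestFirst, int keep) {
        if (keep < 0) {
            throw new IllegalArgumentException("keep must be >= 0");
        }
        List<String> candidates = new ArrayList<>();
        for (String name : namesOldestFirst) {
            if (!ROOT_VERSION.equals(name)) {
                candidates.add(name); // the root version is never a candidate
            }
        }
        int removeCount = Math.max(0, candidates.size() - keep);
        return new ArrayList<>(candidates.subList(0, removeCount));
    }

    public static void main(String[] args) {
        List<String> names =
                Arrays.asList("jcr:rootVersion", "1.0", "1.1", "1.2", "1.3");
        // In a real repository, each returned name would be passed to
        // VersionHistory.removeVersion(name).
        System.out.println(versionsToRemove(names, 2)); // prints [1.0, 1.1]
    }
}
```

Note that removeVersion() can fail with a ReferentialIntegrityException
when a version is still referenced, for example as the base version of
the node, so real code would need to handle that case.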


Jukka Zitting
