jackrabbit-dev mailing list archives

From Miklos Pocsaji <miklos.pocs...@gmail.com>
Subject Two problems
Date Thu, 16 Jun 2005 09:23:55 GMT
Hi!

I started working with Jackrabbit a month ago and have run into two problems:

1. I saw a post here saying that somebody is working on the
time-consuming startup. Has there been any improvement? Even with only
a few hundred megabytes of stored data, startup (repository creation)
is really slow.

2. I started writing a TextFilter that extracts text from PDF files (I
implemented the TextFilter interface). It is simple: I only have to
return a java.io.Reader from which Jackrabbit reads the text. The
obvious but ugly method would be to extract the text into a String and
then return a StringReader, but this would require a lot of memory. I
decided to use a PipedReader/PipedWriter pair instead: a separate
thread writes the text to a PipedWriter, and I return the PipedReader
instance from the doFilter() method. It seems that Jackrabbit does not
read the returned stream immediately. I see my writer thread stop, and
then, after I perform a search, an exception is thrown saying that the
other end of the pipe is closed...
I do not know whether my approach is correct, so if somebody knows,
please tell me whether this could work somehow. I'm thinking about
examining the source myself, but if somebody could help me, it would
save me a lot of time.
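
For reference, the pipe approach described above can be sketched in plain Java, independent of Jackrabbit. This is only a minimal sketch: the class name, openTextReader(), drain(), and the sample chunks are hypothetical stand-ins for the PDF extraction and for whatever consumes the Reader, and nothing here uses the TextFilter interface itself. One thing it illustrates is that PipedWriter.write() blocks once the small pipe buffer is full until somebody reads from the other end, so a consumer that defers reading would leave the writer thread parked. That would match the stopped thread I see, but I have not confirmed it against the Jackrabbit source.

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;

class PipedTextSketch {

    // Hypothetical helper (not a Jackrabbit API): returns a Reader whose
    // content is produced by a background thread. The chunks stand in for
    // text extracted from a PDF piece by piece.
    static Reader openTextReader(final String[] chunks) throws IOException {
        final PipedWriter writer = new PipedWriter();
        PipedReader reader = new PipedReader(writer); // connects both ends

        Thread producer = new Thread(() -> {
            try {
                for (String chunk : chunks) {
                    // Blocks while the pipe buffer is full and nobody reads.
                    writer.write(chunk);
                }
            } catch (IOException e) {
                // The read end was closed or its thread ended; stop producing.
            } finally {
                try {
                    // Closing signals end-of-stream to the reader.
                    writer.close();
                } catch (IOException ignored) {
                    // Nothing useful to do if close fails.
                }
            }
        });
        producer.setDaemon(true); // do not keep the JVM alive for the producer
        producer.start();
        return reader;
    }

    // Hypothetical consumer: reads everything until end-of-stream.
    static String drain(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        reader.close();
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Reader r = openTextReader(new String[] {"Hello, ", "pipe"});
        System.out.println(drain(r)); // prints "Hello, pipe"
    }
}
```

In this sketch the consumer reads right away, so the producer finishes normally. The design only works if the Reader is drained while the producer is still able to write; if reading is postponed, the producer waits on a full buffer in the meantime.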

Thank you in advance,
Miklos Pocsaji.
