jackrabbit-dev mailing list archives

From Jukka Zitting <jukka.zitt...@gmail.com>
Subject Re: detect a failed text extraction?
Date Wed, 25 Nov 2009 13:26:41 GMT
Hi,

On Tue, Nov 24, 2009 at 8:53 PM, Paco Avila <monkiki@gmail.com> wrote:
> Is there any way to detect a failed text extraction? I know I can
> check the log, but the failure is not associated with a file or path.
> [...]
> I have posted this question on the user list, but I think it is
> worth discussing how it could be achieved.

Could we solve this by improving the logging in the indexer, so that
each failure is logged together with the path of the affected node?

Alternatively, if you don't have easy access to the log files, we
could possibly inject a special, unique term into the index as a
marker of failed text extraction. That way you could query for all
nodes for which text extraction failed.
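
To make the marker idea concrete, here is a minimal sketch. It does not
use the Jackrabbit or Lucene APIs: the index is a toy in-memory map, and
the names ExtractionMarkerSketch, Extractor, FAILED_MARKER, indexNode and
query are all hypothetical, chosen only for illustration. The real
indexer would need an equivalent hook wherever the extractor is called.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ExtractionMarkerSketch {
    // Hypothetical marker term (an assumption): any token that cannot
    // occur in real extracted text would do.
    public static final String FAILED_MARKER = "textextractionfailed_3f9a1c";

    // Hypothetical stand-in for a text extractor that may fail.
    public interface Extractor {
        String extract(byte[] data) throws Exception;
    }

    // Toy inverted index: term -> paths of the nodes that contain it.
    private final Map<String, Set<String>> index = new HashMap<>();

    public void indexNode(String path, byte[] data, Extractor extractor) {
        String text;
        try {
            text = extractor.extract(data);
        } catch (Exception e) {
            // Extraction failed: index the marker term instead of the text.
            text = FAILED_MARKER;
        }
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(path);
            }
        }
    }

    public Set<String> query(String term) {
        return index.getOrDefault(term.toLowerCase(Locale.ROOT),
                Collections.emptySet());
    }

    public static void main(String[] args) {
        ExtractionMarkerSketch sketch = new ExtractionMarkerSketch();
        sketch.indexNode("/docs/ok.txt",
                "hello world".getBytes(StandardCharsets.UTF_8),
                d -> new String(d, StandardCharsets.UTF_8));
        sketch.indexNode("/docs/broken.pdf", new byte[0],
                d -> { throw new Exception("corrupt file"); });
        // Lists every node whose text extraction failed.
        System.out.println(sketch.query(FAILED_MARKER));
    }
}
```

Querying for the marker term then returns exactly the paths whose
extraction threw, without any access to the log files.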

Finally, as a debugging tool we could add a feature to the Jackrabbit
webapp that allows you to download the extracted text content of a
binary instead of the binary itself. We'd simply run a new text
extraction pass on the stored binary and return the extracted text or
any encountered errors to the client.
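
Roughly, the debugging feature could behave like the sketch below. This
is only an illustration under assumptions: ExtractionDebugSketch,
Extractor and debugBody are hypothetical names, nothing here is an
existing Jackrabbit webapp API, and the wiring into an actual servlet
response is left out.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;

public class ExtractionDebugSketch {
    // Hypothetical stand-in for a text extractor that may fail.
    public interface Extractor {
        String extract(byte[] data) throws Exception;
    }

    // Runs a fresh extraction pass on the stored binary and returns what
    // the client would download: the extracted text, or an error report.
    public static String debugBody(String path, byte[] binary,
                                   Extractor extractor) {
        try {
            return extractor.extract(binary);
        } catch (Exception e) {
            StringWriter out = new StringWriter();
            PrintWriter writer = new PrintWriter(out);
            writer.println("Text extraction failed for " + path);
            e.printStackTrace(writer);
            writer.flush();
            return out.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(debugBody("/docs/ok.txt",
                "hello".getBytes(StandardCharsets.UTF_8),
                d -> new String(d, StandardCharsets.UTF_8)));
        System.out.println(debugBody("/docs/broken.pdf", new byte[0],
                d -> { throw new IllegalStateException("corrupt file"); }));
    }
}
```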

BR,

Jukka Zitting
