jackrabbit-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCR-2873) Add a way to locate full text extraction problems
Date Thu, 24 Mar 2011 17:53:05 GMT

    [ https://issues.apache.org/jira/browse/JCR-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010800#comment-13010800 ]

Jukka Zitting commented on JCR-2873:
------------------------------------

Yes, to the search index such documents look like simple text documents that contain just
the string "TextExtractionError". You can query for that token and include any other constraints
(path, etc.) just like when searching for normal documents.
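
As a rough illustration of such a query, the sketch below builds a JCR-SQL2 statement
that looks for the "TextExtractionError" token under a given path. The class name, the
method name, the choice of [nt:base] as the node type and the /content path are all
assumptions made for this example; none of them come from the issue itself.

```java
// Hypothetical helper for illustration only; not part of Jackrabbit.
final class ExtractionErrorQuery {

    // Token that, per the comment above, stands in for the document text
    // in the search index when extraction fails.
    static final String ERROR_TOKEN = "TextExtractionError";

    // Builds a JCR-SQL2 statement matching nodes below rootPath whose
    // full text index entry contains the error token.
    // Assumption: [nt:base] is used so that any node type can match.
    static String build(String rootPath) {
        // Double any single quotes so the path stays a valid string literal.
        String escaped = rootPath.replace("'", "''");
        return "SELECT * FROM [nt:base] AS n"
                + " WHERE CONTAINS(n.*, '" + ERROR_TOKEN + "')"
                + " AND ISDESCENDANTNODE(n, '" + escaped + "')";
    }
}
```

The resulting string could then be handed to the standard JCR 2.0 API, for example
session.getWorkspace().getQueryManager().createQuery(statement, Query.JCR_SQL2),
and executed like any other query.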

PS. In revision 1085050 I excluded extraction errors caused by linkage problems from being
reported. Such errors occur when required extraction libraries are not present, which is a
configuration/deployment choice rather than an inherent problem with the documents being parsed.
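
The distinction can be sketched roughly as follows. This is only an assumption about the
shape of such a check, not the code from revision 1085050: it treats any
java.lang.LinkageError (for example NoClassDefFoundError) as a deployment issue and every
other failure as a reportable document problem. The class and method names are hypothetical.

```java
// Hypothetical policy class for illustration only; not part of Jackrabbit.
final class ExtractionErrorPolicy {

    // Returns true when a failure should be reported as an extraction
    // problem of the document itself.
    // Assumption: linkage problems surface as java.lang.LinkageError,
    // which covers NoClassDefFoundError and similar missing-library errors.
    static boolean shouldReport(Throwable failure) {
        return !(failure instanceof LinkageError);
    }
}
```

With this check, a missing parser library would be skipped silently, while an I/O or
parse failure on a specific document would still be flagged.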

> Add a way to locate full text extraction problems
> -------------------------------------------------
>
>                 Key: JCR-2873
>                 URL: https://issues.apache.org/jira/browse/JCR-2873
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: indexing, jackrabbit-core
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 2.3.0
>
>
> Full text indexing of a binary document can fail for various reasons. Currently we just
> log a generic error message in such cases, which makes it difficult for the user to locate
> such problems for review and reindexing. We should improve this by making the logs more informative
> or by adding some other mechanism for locating troublesome documents.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
