jackrabbit-users mailing list archives

From "Atul Kumar Tripathi" <Atulkum...@virtusa.com>
Subject Search doesn't work for MS PowerPoint documents with JackRabbit 2.1.2
Date Mon, 22 Nov 2010 10:56:27 GMT
Hello,

Search does not work for MS PowerPoint documents with Jackrabbit 2.1.2.

This seems to be an issue with registering the Tika parser for MS
PowerPoint documents.

Jackrabbit 2 uses Tika parsers to extract the content of documents and
provides a wrapper (JackrabbitParser) around them for backwards
compatibility with 1.x configurations:

    /**
     * Backwards compatibility method to support old Jackrabbit 1.x
     * <code>textExtractorClasses</code> configurations. Implements a best
     * effort mapping from the old-style text extractor classes to
     * corresponding Tika parsers.
     *
     * @param classes configured list of text extractor classes
     */
    public void setTextFilterClasses(String classes) {
        // ...
        else if (name.equals(
                "org.apache.jackrabbit.extractor.MsPowerPointExtractor")) {
            Parser parser = new OfficeParser();
            parsers.put("application/vnd.ms-powerpoint", parser);
            parsers.put("application/mspowerpoint", parser);
            parsers.put("application/powerpoint", parser);
        }
        else {
            logger.warn("Ignoring unknown text extractor class: {}", name);
        }
    }

The method checks for
"org.apache.jackrabbit.extractor.MsPowerPointExtractor" and creates a
new instance of Tika's OfficeParser, but this looks like a typo: no
class named "MsPowerPointExtractor" is declared. The class is named
"MsPowerPointTextExtractor" in Jackrabbit 1.6
(jackrabbit-text-extractors-1.6.0 and jackrabbit-text-extractors-1.6.4).

Because of this, a configuration that lists the real class name falls
through to the "unknown text extractor class" branch, no parser gets
registered for MS PowerPoint documents, and search doesn't work.
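
To illustrate the mismatch, here is a minimal, self-contained sketch. It is
not the Jackrabbit source: the class ExtractorNameMapping and the helper
mimeTypesFor are hypothetical names made up for this example, and it leaves
out Tika entirely. It only shows that the name checked in the excerpt never
equals the 1.6 class name, and how a lookup that accepts both spellings
might behave.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ExtractorNameMapping {

    // Name checked by the setTextFilterClasses excerpt quoted in this message.
    static final String CHECKED_NAME =
            "org.apache.jackrabbit.extractor.MsPowerPointExtractor";

    // Class name as shipped in jackrabbit-text-extractors 1.6.x, per this report.
    static final String ACTUAL_NAME =
            "org.apache.jackrabbit.extractor.MsPowerPointTextExtractor";

    // MIME types the excerpt registers for PowerPoint documents.
    static final List<String> POWERPOINT_TYPES =
            Collections.unmodifiableList(Arrays.asList(
                    "application/vnd.ms-powerpoint",
                    "application/mspowerpoint",
                    "application/powerpoint"));

    // Hypothetical helper: returns the MIME types to register for a
    // configured extractor class name, accepting both spellings.
    static List<String> mimeTypesFor(String name) {
        if (name.equals(CHECKED_NAME) || name.equals(ACTUAL_NAME)) {
            return POWERPOINT_TYPES;
        }
        // Unknown names get no registration, like the warning branch.
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // The two names differ, so an equals() check against CHECKED_NAME
        // alone misses a 1.6-style configuration entry.
        System.out.println(ACTUAL_NAME.equals(CHECKED_NAME));
        System.out.println(mimeTypesFor(ACTUAL_NAME));
    }
}
```

Accepting both names is just one possible approach; whether the old
spelling should be kept at all is a question for the Jackrabbit developers.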

Any help would be highly appreciated.

Thanks & Regards,
Atul Tripathi

