jackrabbit-dev mailing list archives

From "Paco Avila (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-728) Automatic MIME type detection
Date Thu, 01 Feb 2007 19:55:05 GMT

    [ https://issues.apache.org/jira/browse/JCR-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469564 ]

Paco Avila commented on JCR-728:

Why is the LGPL troublesome? Source code that uses an LGPL library does not have to be LGPL or GPL.
A port of libmagic to Java would be nice because there are lots of MIME definitions in its

And yes, I think it is more useful to add more functionality to jackrabbit-index-filters.
By the way, some MS Office files throw errors when they are indexed. I know this is a POI issue,
but is this project abandoned? There have been no updates since 04-08-2004 :(

> Automatic MIME type detection
> -----------------------------
>                 Key: JCR-728
>                 URL: https://issues.apache.org/jira/browse/JCR-728
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>            Reporter: Jukka Zitting
>            Priority: Minor
> Currently only the jcr:mimeType property is used to determine the MIME type and thus
> the applicable text extractor to use for indexing a document. If the jcr:mimeType property
> is not available or is set to a generic value like "application/octet-stream", then the indexer
> could also use some heuristics based on the node name or magic numbers within the binary stream
> to determine the type of the document.
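
For illustration only, the fallback described in the issue could look roughly like the sketch below. Nothing here is Jackrabbit code: the class MimeTypeSniffer, its method names, the small extension table, and the choice to try magic numbers before the node name are all assumptions made for the example. A real implementation would need a far larger set of definitions, such as the ones libmagic ships.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not part of Jackrabbit. */
final class MimeTypeSniffer {

    static final String GENERIC = "application/octet-stream";

    // Tiny illustrative table; a real one would be much larger.
    private static final Map<String, String> BY_EXTENSION = new HashMap<>();
    static {
        BY_EXTENSION.put("pdf", "application/pdf");
        BY_EXTENSION.put("txt", "text/plain");
        BY_EXTENSION.put("html", "text/html");
        BY_EXTENSION.put("doc", "application/msword");
    }

    private MimeTypeSniffer() {}

    /**
     * Returns the declared jcr:mimeType when it is specific; otherwise
     * falls back to magic numbers, then to the node name.
     * The order of the two heuristics is an assumption of this sketch.
     */
    static String detect(String declaredType, String nodeName, byte[] head) {
        if (declaredType != null && !declaredType.isEmpty()
                && !GENERIC.equals(declaredType)) {
            return declaredType;
        }
        String byMagic = fromMagic(head);
        if (byMagic != null) {
            return byMagic;
        }
        String byName = fromName(nodeName);
        return byName != null ? byName : GENERIC;
    }

    /** Checks the first bytes of the stream against a few well-known signatures. */
    static String fromMagic(byte[] head) {
        if (head == null) {
            return null;
        }
        if (startsWith(head, "%PDF-".getBytes(StandardCharsets.US_ASCII))) {
            return "application/pdf";
        }
        if (startsWith(head, new byte[] {(byte) 0x89, 'P', 'N', 'G'})) {
            return "image/png";
        }
        if (startsWith(head, "GIF8".getBytes(StandardCharsets.US_ASCII))) {
            return "image/gif";
        }
        if (startsWith(head, new byte[] {'P', 'K', 3, 4})) {
            return "application/zip";
        }
        return null;
    }

    /** Maps the extension of the node name, if any, to a MIME type. */
    static String fromName(String nodeName) {
        if (nodeName == null) {
            return null;
        }
        int dot = nodeName.lastIndexOf('.');
        if (dot < 0 || dot == nodeName.length() - 1) {
            return null;
        }
        String ext = nodeName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.get(ext);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

An indexer could read the first few bytes of jcr:data into head and call MimeTypeSniffer.detect(mimeType, node.getName(), head) before picking a text extractor. Trying magic numbers first is meant to guard against misleading file names, but it costs a read of the binary stream, so the opposite order may be preferable in some setups.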

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
