jackrabbit-dev mailing list archives

From "Paco Avila (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-728) Automatic MIME type detection
Date Thu, 01 Feb 2007 19:55:05 GMT

    [ https://issues.apache.org/jira/browse/JCR-728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469564 ]

Paco Avila commented on JCR-728:
--------------------------------

Why is the LGPL troublesome? Source code that uses an LGPL library does not have to be LGPL
or GPL. A port of libmagic to Java would be nice, because there are lots of MIME definitions
in its format.

And yes, I think it is more useful to add more functionality to jackrabbit-index-filters.
By the way, some MS Office files throw errors when they are indexed. I know this is a POI issue,
but is this project abandoned? There have been no updates since 04-08-2004 :(

> Automatic MIME type detection
> -----------------------------
>
>                 Key: JCR-728
>                 URL: https://issues.apache.org/jira/browse/JCR-728
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>            Reporter: Jukka Zitting
>            Priority: Minor
>
> Currently only the jcr:mimeType property is used to determine the MIME type and thus
> the applicable text extractor to use for indexing a document. If the jcr:mimeType property
> is not available or is set to a generic value like "application/octet-stream", then the indexer
> could also use some heuristics based on the node name or magic numbers within the binary stream
> to determine the type of the document.
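
The fallback described in the issue could look roughly like the sketch below. It is only an illustration, not Jackrabbit code: the class name MimeSniffer, the detect() method, and the two small lookup tables are all hypothetical, and a real implementation would need far more signatures (for example from libmagic-style definitions). The sketch trusts a specific jcr:mimeType value when one is present, then tries magic numbers at the start of the binary stream, then the node name extension.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Jackrabbit: guesses a MIME type when
// jcr:mimeType is missing or set to the generic "application/octet-stream".
final class MimeSniffer {

    static final String GENERIC = "application/octet-stream";

    // Illustrative tables only; a real detector needs many more entries.
    private static final Map<String, byte[]> MAGIC = new LinkedHashMap<>();
    private static final Map<String, String> EXTENSIONS = new LinkedHashMap<>();

    static {
        MAGIC.put("application/pdf", "%PDF-".getBytes(StandardCharsets.US_ASCII));
        MAGIC.put("application/zip", new byte[] {'P', 'K', 3, 4});

        EXTENSIONS.put("pdf", "application/pdf");
        EXTENSIONS.put("txt", "text/plain");
        EXTENSIONS.put("html", "text/html");
        EXTENSIONS.put("doc", "application/msword");
    }

    private MimeSniffer() {
    }

    /**
     * @param declared value of jcr:mimeType, may be null
     * @param nodeName name of the node, may be null
     * @param head     first bytes of the binary stream, may be null
     */
    static String detect(String declared, String nodeName, byte[] head) {
        // 1. A specific declared type wins.
        if (declared != null && !declared.isEmpty() && !GENERIC.equals(declared)) {
            return declared;
        }
        // 2. Magic numbers at the start of the content.
        if (head != null) {
            for (Map.Entry<String, byte[]> entry : MAGIC.entrySet()) {
                if (startsWith(head, entry.getValue())) {
                    return entry.getKey();
                }
            }
        }
        // 3. Extension of the node name.
        if (nodeName != null) {
            int dot = nodeName.lastIndexOf('.');
            if (dot >= 0 && dot < nodeName.length() - 1) {
                String ext = nodeName.substring(dot + 1).toLowerCase(Locale.ROOT);
                String type = EXTENSIONS.get(ext);
                if (type != null) {
                    return type;
                }
            }
        }
        // 4. Nothing matched: stay with the generic type.
        return GENERIC;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Checking magic numbers before the node name is a choice made for this sketch, on the assumption that file content is harder to mislabel than a name; the issue itself does not prescribe an order.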

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

