tika-dev mailing list archives

From "Alex Ott (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TIKA-697) Tika reports the content type of AR archives as "text/plain"
Date Mon, 07 Nov 2011 13:12:51 GMT

    [ https://issues.apache.org/jira/browse/TIKA-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145496#comment-13145496 ]

Alex Ott commented on TIKA-697:

No problem, just add:

<glob pattern="*.ar"/>


<glob pattern="*.a"/>

... but I have never actually seen such a file extension.
> Tika reports the content type of AR archives as "text/plain"
> ------------------------------------------------------------
>                 Key: TIKA-697
>                 URL: https://issues.apache.org/jira/browse/TIKA-697
>             Project: Tika
>          Issue Type: Bug
>         Environment: Linux (CentOS 5.6)
>            Reporter: PNS
>            Priority: Trivial
>         Attachments: tika-697.diff
> The Tika.detect(InputStream) method returns "text/plain" for AR archives created with the Linux "Create Archive" option of Nautilus (available via right-clicking on a file).
> The Apache Commons Compress "autodetection" code of the ArchiveStreamFactory looks at the first 12 bytes of the stream and correctly identifies the type as AR.
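
To illustrate the kind of content-based check described above, here is a minimal sketch in plain Java. It is not the Tika or Commons Compress implementation; the class name ArMagic and the method looksLikeAr are hypothetical. It assumes the common Unix ar format, whose files start with the 8-byte ASCII signature "!<arch>" followed by a newline, and it inspects only those 8 bytes rather than the 12 mentioned in the issue.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
public class ArMagic {

    // Global header of the common Unix ar format: "!<arch>" plus a newline.
    private static final byte[] AR_MAGIC =
            "!<arch>\n".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the given bytes begin with the ar signature.
    static boolean looksLikeAr(byte[] header) {
        if (header == null || header.length < AR_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < AR_MAGIC.length; i++) {
            if (header[i] != AR_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads up to 8 bytes from the stream and checks them against the signature.
    // The stream is consumed; wrap it in a BufferedInputStream and use
    // mark/reset if the bytes are needed again afterwards.
    static boolean looksLikeAr(InputStream in) throws IOException {
        byte[] header = new byte[AR_MAGIC.length];
        int read = 0;
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                break; // stream ended before 8 bytes were available
            }
            read += n;
        }
        return read == header.length && looksLikeAr(header);
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "!<arch>\nexample.txt/".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikeAr(new ByteArrayInputStream(sample)));
        System.out.println(looksLikeAr("plain text".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

A check like this depends only on the file content, so it still works when the file has an unusual extension or none at all, which is the case a glob pattern alone cannot cover.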


