tika-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (TIKA-200) Allow URL drag and drop in the Tika GUI
Date Thu, 19 Mar 2009 08:37:50 GMT

    [ https://issues.apache.org/jira/browse/TIKA-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683363#action_12683363 ]

Uwe Schindler commented on TIKA-200:

For more advanced parsing of the content type, and for support of compressed HTTP streams, have
a look at http://panfmp.svn.sourceforge.net/viewvc/panfmp/main/trunk/src/de/pangaea/metadataportal/harvester/OAIHarvesterBase.java?view=markup
line 177 ff.
This is a nice method that creates a SAX InputSource, with all properties correctly set, from
an HTTP URL. It adds some extra features that an InputSource with only a given SystemID does not
support (compression, retry-after). For the underlying parser to work correctly, the charset
encoding should be set (if available from the HTTP response). This more complex example was
needed for an OAI-PMH harvester for effective metadata harvesting with compression and so on.
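
To make the idea concrete, here is a minimal sketch of such a method. It is not the panFMP code linked above; the class name HttpInputSources and its helpers charsetOf, decode and open are hypothetical names chosen for illustration. It assumes only the standard java.net, java.util.zip and org.xml.sax classes, requests a compressed response, unwraps it according to Content-Encoding, and passes the charset from Content-Type on to the InputSource. Retry-After handling is left out to keep it short.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class HttpInputSources {

    /** Returns the charset parameter of a Content-Type value, or null if there is none. */
    public static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // The value may be quoted, e.g. charset="utf-8".
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /** Wraps the raw response stream according to the Content-Encoding header. */
    public static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return raw;
        }
        String enc = contentEncoding.trim().toLowerCase(Locale.ROOT);
        if (enc.isEmpty() || enc.equals("identity")) {
            return raw;
        }
        if (enc.equals("gzip") || enc.equals("x-gzip")) {
            return new GZIPInputStream(raw);
        }
        if (enc.equals("deflate")) {
            // Assumes zlib-wrapped deflate data, which is what InflaterInputStream expects by default.
            return new InflaterInputStream(raw);
        }
        throw new IOException("Unsupported Content-Encoding: " + contentEncoding);
    }

    /** Opens an HTTP URL and returns an InputSource with stream, system ID and encoding set. */
    public static InputSource open(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Tell the server that a compressed response is acceptable.
        conn.setRequestProperty("Accept-Encoding", "gzip, deflate, identity;q=0.3");
        conn.connect();

        InputStream in = decode(conn.getInputStream(), conn.getContentEncoding());
        InputSource source = new InputSource(in);
        source.setSystemId(url.toString());

        // Only set the encoding if the server declared one; otherwise the
        // XML parser falls back to its own detection.
        String charset = charsetOf(conn.getContentType());
        if (charset != null) {
            source.setEncoding(charset);
        }
        return source;
    }
}
```

A caller would pass the result of HttpInputSources.open(url) to the SAX parser in place of an InputSource built from the system ID alone.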

> Allow URL drag and drop in the Tika GUI
> ---------------------------------------
>                 Key: TIKA-200
>                 URL: https://issues.apache.org/jira/browse/TIKA-200
>             Project: Tika
>          Issue Type: New Feature
>          Components: gui
>            Reporter: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.4
>         Attachments: TIKA-200.diff
> It would be nice if I could drag a URL from my browser to the Tika GUI window to have
> the linked document downloaded and parsed by Tika.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
