tika-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (TIKA-126) Add Parser.parse(InputStream, Metadata) for metadata extraction
Date Fri, 19 Sep 2008 22:16:47 GMT

     [ https://issues.apache.org/jira/browse/TIKA-126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-126.
--------------------------------

    Resolution: Later

I reverted this feature in revision 697265 based on the above rationale.

Resolving this issue with status Later. We can consider restoring this feature when we have
some more compelling use cases. For now I prefer to keep the Parser interface as clean and
simple as possible.

> Add Parser.parse(InputStream, Metadata) for metadata extraction
> ---------------------------------------------------------------
>
>                 Key: TIKA-126
>                 URL: https://issues.apache.org/jira/browse/TIKA-126
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>             Fix For: 0.2-incubating
>
>
> In some cases a client is just interested in the parsed metadata and not the extracted
> text content. It is easy to ignore the text content by just passing a dummy DefaultHandler
> to the existing parse() method, but many parsers could avoid a lot of work if they knew in
> advance that the text content is not needed.
> Thus I want to add a parse(InputStream, Metadata) signature to the Parser interface.
> I'll also add an AbstractParser base class with a trivial implementation of that method:
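>     public abstract class AbstractParser implements Parser {
>         // Default: run the full parse and discard the text content.
>         public void parse(InputStream stream, Metadata metadata)
>                 throws IOException, SAXException, TikaException {
>             parse(stream, new DefaultHandler(), metadata);
>         }
>     }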
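
The following is a minimal sketch of how a parser might take advantage of the
proposed (and since reverted) signature by overriding the trivial default. It is
not Tika code: Metadata and Parser below are simplified stand-ins for the real
classes, and TitledTextParser, its toy "first line is the title" format, and
MetadataOnlyDemo are hypothetical names made up for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Assumption: simplified stand-in for Tika's Metadata, reduced to a string map.
class Metadata {
    private final Map<String, String> values = new HashMap<>();
    void set(String name, String value) { values.put(name, value); }
    String get(String name) { return values.get(name); }
}

// Assumption: simplified stand-in for the Parser interface with the proposed method.
interface Parser {
    void parse(InputStream stream, ContentHandler handler, Metadata metadata)
            throws IOException, SAXException;

    void parse(InputStream stream, Metadata metadata)
            throws IOException, SAXException;
}

// The trivial default from the issue: full parse, text content discarded.
abstract class AbstractParser implements Parser {
    public void parse(InputStream stream, Metadata metadata)
            throws IOException, SAXException {
        parse(stream, new DefaultHandler(), metadata);
    }
}

// Hypothetical parser for a toy format: first line is the title, the rest is body text.
class TitledTextParser extends AbstractParser {
    @Override
    public void parse(InputStream stream, ContentHandler handler, Metadata metadata)
            throws IOException, SAXException {
        BufferedReader reader = newReader(stream);
        readTitle(reader, metadata);
        handler.startDocument();
        String line;
        while ((line = reader.readLine()) != null) {
            char[] chars = line.toCharArray();
            handler.characters(chars, 0, chars.length);
        }
        handler.endDocument();
    }

    // Metadata-only override: stops after the title line and never walks the body.
    @Override
    public void parse(InputStream stream, Metadata metadata) throws IOException {
        readTitle(newReader(stream), metadata);
    }

    private static BufferedReader newReader(InputStream stream) {
        return new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8));
    }

    private static void readTitle(BufferedReader reader, Metadata metadata)
            throws IOException {
        String title = reader.readLine();
        if (title != null) {
            metadata.set("title", title);
        }
    }
}

class MetadataOnlyDemo {
    // Runs the metadata-only parse over an in-memory document and returns the title.
    static String titleOf(String document) {
        Metadata metadata = new Metadata();
        byte[] bytes = document.getBytes(StandardCharsets.UTF_8);
        try {
            new TitledTextParser().parse(new ByteArrayInputStream(bytes), metadata);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return metadata.get("title");
    }

    public static void main(String[] args) {
        System.out.println(titleOf("Release notes\nFirst body line\nSecond body line"));
    }
}
```

A parser that does not override the two-argument method simply inherits the
AbstractParser default, so callers get the same result either way and only the
amount of work differs.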

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

