cxf-issues mailing list archives

From "Sergey Beryozkin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CXF-5549) Introduce Tika Search Visitor
Date Thu, 19 Jun 2014 10:12:24 GMT

    [ https://issues.apache.org/jira/browse/CXF-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037199#comment-14037199 ]

Sergey Beryozkin commented on CXF-5549:
---------------------------------------

Hi Andriy, thanks for the latest update. It let me confirm that we could move tika-parsers
into the test scope.

I guess we'd create some Lucene-specific helper a bit later on that will check whether
a query matches a given document; I can contribute that easily.

Can you do me a favour and add a couple more tests for some other file types, say a
binary file which has no text content and only metadata? Our code should work OK: from
what I understand, a Tika parser will return an empty value if there is no content, but
let's double-check. The other area to explore: we have a single Document keeping both the
text and the metadata. Can we hit a problem where a user searching on some metadata gets
the wrong document because the content produced the match? If yes, then we'd need to
create 2 documents instead...
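
The two checks above can be sketched in plain Java. This is a simplified in-memory model
only, not the Lucene or Tika API: the class name, the method names and the "contents" field
name are all hypothetical, and a map of field name to value stands in for a real Document.
It assumes the extracted text and the metadata are stored as separate named fields.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class FieldScopedMatchSketch {

    // Hypothetical field name for the extracted text; not taken from the CXF code.
    public static final String CONTENTS = "contents";

    /** Builds one flat "document": the extracted text plus the metadata as named fields. */
    public static Map<String, String> singleDocument(String text, Map<String, String> metadata) {
        Map<String, String> doc = new LinkedHashMap<>(metadata);
        // Assumption: a file with no text content yields an empty value.
        doc.put(CONTENTS, text == null ? "" : text);
        return doc;
    }

    /** Field-scoped match: only the named field is inspected. */
    public static boolean matchesField(Map<String, String> doc, String field, String term) {
        String value = doc.get(field);
        return value != null && containsIgnoreCase(value, term);
    }

    /** Unscoped match: any field may satisfy the term, which is where a wrong hit can arise. */
    public static boolean matchesAnyField(Map<String, String> doc, String term) {
        for (String value : doc.values()) {
            if (value != null && containsIgnoreCase(value, term)) {
                return true;
            }
        }
        return false;
    }

    private static boolean containsIgnoreCase(String value, String term) {
        return value.toLowerCase(Locale.ROOT).contains(term.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("author", "Alice");

        Map<String, String> pdf = singleDocument("Report reviewed by Bob", meta);
        Map<String, String> binary = singleDocument(null, meta);

        System.out.println(matchesField(pdf, "author", "bob"));       // false
        System.out.println(matchesAnyField(pdf, "bob"));              // true
        System.out.println(matchesField(binary, "author", "alice"));  // true
        System.out.println(matchesField(binary, CONTENTS, "alice"));  // false
    }
}
```

In this model a metadata search only picks up a content match when the term is not tied to
a field. Whether the real visitor can produce such unscoped queries is exactly what the
extra tests would need to show.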

Thanks, Sergey

> Introduce Tika Search Visitor
> -----------------------------
>
>                 Key: CXF-5549
>                 URL: https://issues.apache.org/jira/browse/CXF-5549
>             Project: CXF
>          Issue Type: New Feature
>          Components: JAX-RS
>            Reporter: Sergey Beryozkin
>            Assignee: Andriy Redko
>            Priority: Minor
>
> Introduce TikaSearchVisitor which will convert a FIQL/etc. search expression into an Apache
> Tika component that can be used to search binary data; for example, the service can support
> something like "find all PDF files matching a given expression".



--
This message was sent by Atlassian JIRA
(v6.2#6252)
