camel-dev mailing list archives

From Sergey Beryozkin <sberyoz...@gmail.com>
Subject Re: Apache Tika Component
Date Mon, 23 Jan 2017 11:22:32 GMT
Hi Bob, +1

Cheers. Sergey
On 23/01/17 04:17, Bob Paulin wrote:
> Hi,
>
> I'd like to propose an Apache Tika[1] connector for Apache Camel. I see
> Camel already uses a number of libraries that Tika also relies on, such
> as PDFBox, but it could be interesting to have a full assortment of file
> parsers to convert files to text.
>
> The basic configuration would allow MIME type detection and parsing
> files to text.
>
> tika:detect
>
> File/InputStream -> camel-tika -> MIME type
>
> tika:parse
>
> File/InputStream -> camel-tika -> OutputStream in text
>
> I have a basic implementation that I'd be happy to send in a PR, but I
> wanted to see if this was something the community was interested in. I
> think it might be interesting to combine a project that integrates
> everything with the project that parses everything. I also think having
> a camel-tika component might help achieve some of Tika's 2.0 goals[2].
>
>
> - Bob Paulin
>
>
> [1] https://tika.apache.org/
>
> [2] https://wiki.apache.org/tika/Tika2_0RoadMap
>
>
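
For readers of the archive, the two operations proposed in the quoted
message (tika:detect and tika:parse) can be sketched with Tika's own
facade class, org.apache.tika.Tika. This is only an illustration of the
underlying calls, not the implementation Bob mentions: the class name
TikaSketch and the two helper methods are hypothetical, and it assumes
tika-core (plus tika-parsers for real text extraction) is on the
classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;

// Hypothetical helper class; the names are illustrative and do not come
// from the proposed camel-tika component.
final class TikaSketch {

    // The Tika facade bundles the default detector and auto-detecting parser.
    private static final Tika TIKA = new Tika();

    // "tika:detect": InputStream in, MIME type string out.
    static String detect(InputStream in) throws IOException {
        return TIKA.detect(in);
    }

    // "tika:parse": InputStream in, extracted text written to an OutputStream.
    // Assumption: UTF-8 is an acceptable encoding for the text output.
    static void parse(InputStream in, OutputStream out)
            throws IOException, TikaException {
        // parseToString truncates very long documents at the facade's
        // configured maximum string length.
        String text = TIKA.parseToString(in);
        out.write(text.getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    public static void main(String[] args) throws Exception {
        byte[] sample = "Hello, Tika".getBytes(StandardCharsets.UTF_8);

        // Detection works from the leading bytes of the stream.
        System.out.println(detect(new ByteArrayInputStream(sample)));

        // Parsing extracts plain text into the supplied stream.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        parse(new ByteArrayInputStream(sample), out);
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

A Camel component would presumably wrap calls like these behind the
tika:detect and tika:parse endpoint URIs, taking the input from the
message body and putting the result back on the exchange.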


-- 
Sergey Beryozkin

Talend Community Coders
http://coders.talend.com/
