camel-dev mailing list archives

From Chris Mattmann <mattm...@apache.org>
Subject Re: Apache Tika Component
Date Mon, 23 Jan 2017 04:49:09 GMT
Great job, Bob! ☺



On 1/22/17, 8:17 PM, "Bob Paulin" <bob@apache.org> wrote:

    Hi,
    
    I'd like to propose an Apache Tika[1] connector for Apache Camel.  I see
    Camel already uses some of the libraries Tika builds on, such as PDFBox,
    but it could be interesting to have a full assortment of file parsers to
    convert files to text.
    
    The basic configuration would allow MIME type detection and parsing
    files to text. 
    
    tika:detect
    
    File/InputStream -> camel-tika -> MIME Type
    
    tika:parse
    
    File/InputStream -> camel-tika -> OutputStream in text
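
The two flows above can be sketched in plain Java. This is only an illustration of the data flow, not the proposed component: the class and method names are hypothetical, the JDK's built-in content sniffing stands in for Tika's detector, and the "parse" step handles nothing beyond plain text.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: none of these names come from camel-tika or Tika.
final class TikaEndpointSketch {

    // Stand-in for tika:detect (InputStream in, MIME type out).
    static String detect(InputStream in) {
        // The JDK sniffer peeks at the leading bytes, so the stream must
        // support mark/reset; wrap it if it does not.
        InputStream peekable = in.markSupported() ? in : new BufferedInputStream(in);
        try {
            String type = URLConnection.guessContentTypeFromStream(peekable);
            // Assumption: fall back to a generic type when nothing matches.
            return type != null ? type : "application/octet-stream";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for tika:parse (InputStream in, text written to an OutputStream).
    // A real parser would choose a format-specific extractor; this one only
    // copies bytes, which is correct for plain-text input and nothing else.
    static void parse(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Detect the type of a stream that starts with a GIF signature.
        byte[] gif = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(new ByteArrayInputStream(gif)));

        // "Parse" a plain-text stream into an in-memory OutputStream.
        ByteArrayOutputStream text = new ByteArrayOutputStream();
        parse(new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)), text);
        System.out.println(new String(text.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

With Tika on the classpath, the two stand-ins would presumably delegate to the `org.apache.tika.Tika` facade, which offers `detect(InputStream)` and `parseToString(InputStream)`; how the actual component wires these into Camel endpoints is not stated in this message.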
    
    I have a basic implementation that I'd be happy to send in a PR but I
    wanted to see if this was something the community was interested in.  I
    think it might be interesting to combine a project that integrates
    everything with the project that parses everything.  I also think having
    a camel-tika component might help achieve some of Tika's 2.0 goals[2].
    
    
    - Bob Paulin
    
    
    [1] https://tika.apache.org/
    
    [2] https://wiki.apache.org/tika/Tika2_0RoadMap
    
    
    


