ctakes-user mailing list archives

From: "Mattmann, Chris A (3980)" <chris.a.mattm...@jpl.nasa.gov>
Subject: Re: Integration of Tika with cTAKES
Date: Mon, 08 Jun 2015 05:49:35 GMT

Hi Sekhar,

[BCC to dev@tika.a.o to keep them in the loop]

Sure, you can do this with Tika and Tesseract. FYI:

http://wiki.apache.org/tika/TikaOCR/

Enjoy! :)

(pro tip: then check out: http://wiki.apache.org/tika/cTAKESParser
to see how to run cTAKES on the result with Tika)

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




-----Original Message-----
From: <Hari>, Sekhar <sekhar.hari@cgi.com>
Reply-To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
Date: Sunday, June 7, 2015 at 10:27 PM
To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>,
"user@ctakes.apache.org" <user@ctakes.apache.org>
Subject: RE: Integration of Tika with cTAKES

>Hello Pei, all -
>
>I am looking to convert handwritten image documents (e.g., a physician's
>handwritten medical prescription) into a text file. The image documents
>can be in PDF, TIFF, GIF, or similar formats. Can Tika or Tesseract
>do this? Can anybody share their experience with this? Also, if it is
>possible to do with Tika, could you send me a step-by-step guide?
>
>Many thanks,
>Sekhar H.
>
>-----Original Message-----
>From: Chen, Pei [mailto:Pei.Chen@childrens.harvard.edu]
>Sent: Sunday, June 07, 2015 10:34 PM
>To: <dev@ctakes.apache.org>
>Subject: Re: Integration of Tika with cTAKES
>
>This looks awesome.
>Perhaps we can reuse the Tika server on the ctakes demo VM.
>
>Sent from my iPhone
>
>> On Jun 6, 2015, at 8:40 PM, jay vyas <jayunit100.apache@gmail.com> wrote:
>> 
>> This is awesome; thanks!
>> 
>> For some of the new cTAKES projects where folks are aiming at using
>> it with big data tooling, the Tika abstraction might be super useful.
>> On Jun 6, 2015 8:19 PM, "Mattmann, Chris A (3980)" <
>> chris.a.mattmann@jpl.nasa.gov> wrote:
>> 
>>> Hey cTAKES peeps!
>>> 
>>> We went ahead and integrated Tika with cTAKES for a project I'm
>>> working on at JPL. It will be part of the 1.9 release of Tika. You
>>> can check it out here:
>>> 
>>> https://wiki.apache.org/tika/cTAKESParser
>>> 
>>> 
>>> Feedback welcomed. cTAKES is rad!
>>> 
>>> Cheers,
>>> Chris
>>> 
