ctakes-dev mailing list archives

From Karthik Sarma <ksa...@ksarma.com>
Subject Re: RTF Annotator?
Date Tue, 03 Sep 2013 18:56:57 GMT
I think such a tool would be quite useful -- I imagine that David isn't the
only person who works with RTF docs, and avoiding conversion should help us
glean additional information as James suggests.

Let me know if you need my assistance with anything!

--
Karthik Sarma
UCLA Medical Scientist Training Program Class of 20??
Member, UCLA Medical Imaging & Informatics Lab
Member, CA Delegation to the House of Delegates of the American Medical
Association
ksarma@ksarma.com
gchat: ksarma@gmail.com
linkedin: www.linkedin.com/in/ksarma


On Tue, Sep 3, 2013 at 11:36 AM, Masanz, James J. <Masanz.James@mayo.edu> wrote:

> I think text formatting is a natural for being turned into annotations.
> Just one example - some people use formatting to indicate section headings
> and there could be a sectionizer that uses rtf tags as-is to determine
> sections, or uses them as features at least.
>
> -- James
>
> > -----Original Message-----
> > From: dev-return-1935-Masanz.James=mayo.edu@ctakes.apache.org [mailto:dev-return-1935-Masanz.James=mayo.edu@ctakes.apache.org] On Behalf Of Pei Chen
> > Sent: Tuesday, September 03, 2013 9:10 AM
> > To: user@ctakes.apache.org; dev@ctakes.apache.org
> > Subject: Re: RTF Annotator?
> >
> > Hi David,
> > There is work being done on Tika/OCR integration, but I am not aware of
> > any cTAKES RTF Annotators.
> > What do others think? Having additional metadata like this does sound very
> > interesting especially with mark-ups (bold/italics) and semi-structured
> > data such as tables...
> >
> > --Pei
> >
> >
> > On Sun, Sep 1, 2013 at 5:41 PM, David Kincaid
> > <kincaid.dave@gmail.com> wrote:
> >
> > > Before I embark on building an RTF annotator I thought I'd ask around
> > > a bit to see if anyone had built such a thing. Most of the medical
> > > notes I have to handle are in RTF format. I can pretty easily extract
> > > the text only using something like Apache Tika, but there is important
> > > information in the formatting as well (bold, italic, font sizes,
> > > centering, tables, etc) that I'd like to use. Is anyone aware of a UIMA
> > > annotator that does this already?
> > >
> > > Thanks,
> > >
> > > Dave Kincaid
> > >
>
