ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: YTEX cTAKES 3.1.1 ready
Date Thu, 06 Feb 2014 18:01:57 GMT
Hi Vijay, 

>  I have yet to run across clinical text from a real EMR where newlines represent the
> end of a sentence

Since James pointed out this possibility a couple of weeks ago, I have kept my eyes open.  The
problem is pretty ubiquitous in a corpus that I'm working with right now.  I just opened the
first note and counted: 95 lines total, 9 of which end a sentence or phrase without punctuation.
This does not include lists, which make up about half of the note.
One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".  Depending upon how
cTAKES deals with it, the meaning could change drastically.
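
A rough way to repeat this kind of count on another note is sketched below in Python.  It is
only an illustration, not part of cTAKES: the function name and the set of terminal
punctuation marks are assumptions, and it does not skip list lines.  It can only flag lines
that end without punctuation; deciding whether such a line really ends a sentence or is just
a wrapped line still takes a human reader.

```python
# Assumed set of characters that count as "terminal punctuation" for a line.
TERMINAL_PUNCTUATION = (".", "?", "!", ":", ";")


def count_unpunctuated_line_endings(note):
    """Return (non_blank_lines, lines_without_terminal_punctuation)."""
    # Strip surrounding whitespace and drop blank lines.
    lines = [line.strip() for line in note.splitlines()]
    lines = [line for line in lines if line]
    # A line is a candidate if it does not end in terminal punctuation.
    unpunctuated = [
        line for line in lines if not line.endswith(TERMINAL_PUNCTUATION)
    ]
    return len(lines), len(unpunctuated)


if __name__ == "__main__":
    # Invented sample text, not taken from any real note.
    sample = "No acute distress\nFollow up in two\nweeks.\n"
    total, candidates = count_unpunctuated_line_endings(sample)
    print(f"{total} lines, {candidates} end without punctuation")
```

In the invented sample, both "No acute distress" (a real phrase ending) and "Follow up in two"
(a wrapped line) are flagged, which is exactly the ambiguity in question.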

> I believe cTAKES absolutely has to support sentences with newlines within them

Yes, cTAKES should do so, but I hope that you aren't suggesting that it support only such
a structure.

Where is that easy button?

-----Original Message-----
From: vijay garla [mailto:vngarla@gmail.com] 
Sent: Thursday, February 06, 2014 10:31 AM
To: dev@ctakes.apache.org
Cc: ytex-users@googlegroups.com; ctakes-dev@incubator.apache.org; vlad.valtchinov@gmail.com
Subject: Re: YTEX cTAKES 3.1.1 ready

I believe it is worth migrating to trunk.

Note that the sentence detector is also complementary - the existing ctakes sentence detector
is unchanged - users can choose which sentence detector to use.  There are changes to assertion
& dependency parsing to support sentences with newlines within them, and that works with both
sentence detectors.

I believe cTAKES absolutely has to support sentences with newlines within them - I have yet
to run across clinical text from a real EMR where newlines represent the end of a sentence
- the changes to assertion & dependency parsing will have to be done at some point.
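
To make the two readings of a newline concrete, here is a minimal Python sketch of two naive
splitters.  Neither is the cTAKES or YTEX sentence detector; the function names and the
sample text are invented for illustration only.

```python
import re


def split_on_newlines(text):
    """Treat every newline as a sentence boundary."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def split_on_punctuation(text):
    """Treat newlines as ordinary whitespace; split only after . ? or !"""
    # Collapse all whitespace runs (including newlines) into single spaces.
    flattened = re.sub(r"\s+", " ", text).strip()
    # Split at whitespace that follows sentence-final punctuation.
    return [s for s in re.split(r"(?<=[.?!])\s+", flattened) if s]


if __name__ == "__main__":
    # Invented sample: one wrapped sentence, one phrase ending without punctuation.
    sample = "Follow up in two\nweeks. No acute distress\nReturn if worse."
    print(split_on_newlines(sample))
    print(split_on_punctuation(sample))
```

On this sample the newline splitter cuts "Follow up in two / weeks." in half, while the
punctuation splitter glues "No acute distress" onto "Return if worse."  Each strategy fails
on a different kind of line.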

-vj


On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei
<Pei.Chen@childrens.harvard.edu> wrote:

> VJ,
> Aside from the changes to the existing cTAKES code (sentence detector,
> etc.) [which we could leave out if it's still being debated], do you
> think it's worth migrating the ytex code to trunk at this point?
> As you mentioned earlier, it's largely complementary.
> [I was just thinking of saving the effort of maintaining the separate
> branch, and of keeping things simple for dev...]
>
> --Pei
>
> > -----Original Message-----
> > From: vijay garla [mailto:vngarla@gmail.com]
> > Sent: Wednesday, February 05, 2014 9:30 PM
> > To: ytex-users@googlegroups.com; ctakes-dev@incubator.apache.org; 
> > vlad.valtchinov@gmail.com
> > Subject: Re: YTEX cTAKES 3.1.1 ready
> >
> > Hi Vlad,
> >
> > I updated the UMLS install guide; see
> > https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> >
> > I would prefer to add the docs in the ctakes confluence, but as far as I can
> > tell, I don't have write access there - can somebody give me write privileges
> > on the ctakes confluence site?
> >
> > There was a bug in the umls install; copy
> > https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml
> > over the corresponding file in your ctakes-3.1.2 install
> > (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.
> > The import is currently running on the UMLS 2013AA (I assume this will complete
> > without issues as long as the umls schema hasn't changed from 2012).
> >
> > What trial and error did you have to go through to build the distro?
> >
> > -vj
> >
> >
> > On Wed, Feb 5, 2014 at 5:33 PM, vijay garla <vngarla@gmail.com> wrote:
> >
> > > Hi Vlad,
> > >
> > > sorry that the instructions aren't clear.
> > >
> > > re 1) What I am trying to say is install 
> > > apache-ctakes-3.2.0-snapshot as usual (this is unchanged from 
> > > 3.1.1).  After that you still have to apply the lib and resources 
> > > (these are things that cannot be distributed via apache).
> > >
> > > re 2) Yes, I need to update those docs.  Hopefully will get to 
> > > that at some point.  However, I assume you already have a UMLS DB 
> > > (also assume SQL Server).  If you can't/don't want to use your 
> > > existing umls DB, please tell me.  Then I'll prioritize upgrading
> > > the doc on importing the umls tables (the scripts are there).
> > >
> > > best,
> > >
> > > VJ
> > >
> > >
> > > On Wed, Feb 5, 2014 at 4:44 PM, <vlad.valtchinov@gmail.com> wrote:
> > >
> > >> Hi VJ-
> > >>
> > >> so, with trial and error we were able to build the distribution and
> > >> now have the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
> > >>
> > >> Here's what's unclear.
> > >>
> > >> 1. Is this now the only (combined) thing that you need for ctakes
> > >> 3.1.1 + Ytex?
> > >> The current documentation
> > >> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1),
> > >> which is most probably outdated, talks about installing cTakes
> > >> 3.1.1 first and then applying 2 (downloadable) SNAPSHOT archives,
> > >> lib and resources.
> > >> This is a point of confusion.
> > >>
> > >> 2. The directions to import UMLS subset are then outdated as well.
> > >> Maybe one should use the old version (ctakes 2.5 and ytex 0.8) to 
> > >> import the RRF files for the UMLS subset and then just use the 
> > >> resulting db. Thoughts?
> > >>
> > >> Thanks,
> > >> Vlad Valtchinov
> > >> Brigham Rad
> > >>
> > >>
> > >> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
> > >>
> > >>> Hi Vlad,
> > >>>
> > >>>
> > >>> All of ytex has been moved into ctakes, it is currently in a branch
> > >>> (https://svn.apache.org/repos/asf/ctakes/branches/ytex).  You
> > >>> don't have to install ytex-0.8 - instead you will have to build
> > >>> and install from the ytex branch to create your own
> > >>> distribution.  Steps 2 & 3 are correct.
> > >>>
> > >>> Although it is a pain, if you have the jdk, maven, and svn, you 
> > >>> can easily build your own distro:
> > >>> * open a command prompt
> > >>> * make sure jdk, maven, and svn are in your path
> > >>> * cd to some directory where you want to check stuff out (I like
> > >>> c:\temp)
> > >>> * run the following commands
> > >>> rmdir /s /q ctakes
> > >>> svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
> > >>> cd ctakes
> > >>> mvn clean install -DskipTests
> > >>>
> > >>> And you will have the ctakes (with ytex) distro in
> > >>> ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
> > >>>
> > >>> What is the process for getting the ytex branch merged into trunk?
> > >>> As I mentioned, there are very few changes to other ctakes 
> > >>> classes/types - this should be completely complementary and not 
> > >>> affect any existing ctakes functionality.
> > >>>
> > >>> -vj
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Thu, Jan 30, 2014 at 4:56 PM, <vlad.va...@gmail.com> wrote:
> > >>>
> > >>>> Hi VJ--
> > >>>>
> > >>>> this is great!! Thanks for all the hard work on it!
> > >>>>
> > >>>> We're starting to look into the new install. For now we're 
> > >>>> trying the binaries out.
> > >>>>
> > >>>> There were these questions about the proper install steps:
> > >>>>
> > >>>> 1. Do we first install ytex-0.8
> > >>>> 2. Then install the new cTakes 3.1.1 instance and also apply the
> > >>>> SNAPSHOT lib and resources zips
> > >>>> 3. Work our way to install the UMLS ontologies in the db
> > >>>>
> > >>>> It is not entirely clear from the new document
> > >>>> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
> > >>>> whether there's still a need to install ytex-0.8, or whether YTEX
> > >>>> has been entirely merged into cTakes.
> > >>>>
> > >>>> If the latter is correct, there are missing parts in, e.g., the UMLS
> > >>>> install steps that are linked from the new ctakes 3.1.1 document.
> > >>>>
> > >>>> Thanks,
> > >>>> vlad
> > >>>>
> > >>>>
> > >>>> On Friday, January 3, 2014 10:21:52 PM UTC-5, vijay garla wrote:
> > >>>>>
> > >>>>> Hello All,
> > >>>>>
> > >>>>> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1.
> > >>>>> Most of the YTEX functionality has been ported and integrated
> > >>>>> with cTAKES, and I've tested with MySQL and MS SQL Server
> > >>>>> (oracle tests pending).
> > >>>>>
> > >>>>> Most of the changes were made in new projects - very little
> > >>>>> existing cTAKES code has been modified.  The only non-trivial
> > >>>>> changes are in
> > >>>>> /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api
> > >>>>> - here I modified
> > >>>>> CharacterOffsetToLineTokenConverterCtakesImpl &
> > >>>>> SingleDocumentProcessorCtakes to deal with newlines within
> > >>>>> sentences correctly.  Can somebody take a look at the changes
> > >>>>> in the ytex branch?
> > >>>>>
> > >>>>> I believe that the branch
> > >>>>> https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to be
> > >>>>> merged into ctakes trunk, but would like other users to test it as well.
> > >>>>> Questions:
> > >>>>>
> > >>>>> * How can I distribute the ctakes binary distribution to ytex
> > >>>>> users before the merge? Can we make the branch build available
> > >>>>> somewhere?  The binary distribution is too large to host on
> > >>>>> the ytex google code site (max 200 MB)
> > >>>>> * Non-ASF libraries - I have segregated these out into their
> > >>>>> own zip file that can be distributed via sourceforge.  As a
> > >>>>> stopgap, I can upload this to the ytex google code site, but
> > >>>>> would prefer to upload to sourceforge.
> > >>>>> * UMLS Derivatives - Ditto for these - would like to move to
> > >>>>> sourceforge.
> > >>>>> * Documentation - How can I update the confluence docs?  I
> > >>>>> would migrate the documentation from the google code website.
> > >>>>>
> > >>>>> Here are the installation instructions (putting the wagon in front
> > >>>>> of the horse ...)
> > >>>>>
> > >>>>> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
> > >>>>>
> > >>>>> Best,
> > >>>>>
> > >>>>> VJ
> > >>>>>
> > >>>>>
> > >>>>>  --
> > >>>> You received this message because you are subscribed to the 
> > >>>> Google Groups "ytex-users" group.
> > >>>> To unsubscribe from this group and stop receiving emails from 
> > >>>> it, send an email to ytex-users+...@googlegroups.com.
> > >>>> To post to this group, send email to ytex-...@googlegroups.com.
> > >>>> To view this discussion on the web visit
> > >>>> https://groups.google.com/d/msgid/ytex-users/70f03a80-ce1a-4c0e-b35d-5116d1c93ea0%40googlegroups.com.
> > >>>>
> > >>>> For more options, visit https://groups.google.com/groups/opt_out.
> > >>>>
> > >>>
> > >>
> > >
> > >
>
