ctakes-dev mailing list archives

From John Green <john.travis.gr...@gmail.com>
Subject Re: YTEX cTAKES 3.1.1 ready
Date Fri, 07 Feb 2014 12:07:01 GMT
Completely non-contributory, but it is odd/humorous to see the headaches
that quickly written notes we do in the 5 minutes post-encounter lead to in
free-text analysis.

JG


On Thu, Feb 6, 2014 at 1:27 PM, Finan, Sean <
Sean.Finan@childrens.harvard.edu> wrote:

> Right, got it.  I just wanted to let you know that some EMR notes -do-
> require sentence splitting at newline characters.
>
> -----Original Message-----
> From: vijay garla [mailto:vngarla@gmail.com]
> Sent: Thursday, February 06, 2014 1:06 PM
> To: dev@ctakes.apache.org
> Cc: ytex-users@googlegroups.com; ctakes-dev@incubator.apache.org;
> vlad.valtchinov@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
>
> The cTAKES sentence detector is not changed in the YTEX branch.  The YTEX
> branch has an *additional* sentence detector that does not automatically
> split sentences on newlines - users can use this if they like.
>
> -vj
>
>
> On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean <
> Sean.Finan@childrens.harvard.edu> wrote:
>
> > Hi Vijay,
> >
> > >  I have yet to run across clinical text from a real EMR where
> > > newlines
> > represent the end of a sentence
> >
> > Since James pointed out this possibility a couple weeks ago, I have
> > kept my eyes open.  The problem is pretty ubiquitous in a corpus that
> > I'm working with right now.  I just opened the first note and gave it
> > a count ... 95 lines total, 9 are sentence/phrase (lacking punctuation)
> > endings.  This is not including lists, which comprise about half of the
> > note.
> > One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".
> >  Depending upon how cTakes deals with it, the meaning could change
> > drastically.
> >
> > > I believe cTAKES absolutely has to support sentences with newlines
> > > within them
> >
> > Yes, cTakes should do so, but I hope that you aren't suggesting that
> > it only support such a structure.
> >
> > Where is that easy button?
> >
> > -----Original Message-----
> > From: vijay garla [mailto:vngarla@gmail.com]
> > Sent: Thursday, February 06, 2014 10:31 AM
> > To: dev@ctakes.apache.org
> > Cc: ytex-users@googlegroups.com; ctakes-dev@incubator.apache.org;
> > vlad.valtchinov@gmail.com
> > Subject: Re: YTEX cTAKES 3.1.1 ready
> >
> > I believe it is worth migrating to trunk.
> >
> > Note that the sentence detector is also complementary - the existing
> > ctakes sentence detector is unchanged - users can choose which
> > sentence detector to use.  There are changes to assertion & dependency
> > parsing to support sentences without newlines, and that works with
> > both sentence detectors.
> >
> > I believe cTAKES absolutely has to support sentences with newlines
> > within them - I have yet to run across clinical text from a real EMR
> > where newlines represent the end of a sentence - the changes to
> > assertion & dependency parsing will have to be done at some point.
> >
> > -vj
> >
> >
> > On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei
> > <Pei.Chen@childrens.harvard.edu> wrote:
> >
> > > VJ,
> > > Aside from the changes to the existing cTAKES code (sentence
> > > detector, etc.) [which we could leave out if it's still being
> > > debated], do you think it's worth migrating the ytex code to trunk
> > > at this point?  As you mentioned earlier, it's largely
> > > complementary.
> > > [I was just thinking of saving effort to maintain the separate
> > > branch and for simplicity for dev...]
> > >
> > > --Pei
> > >
> > > > -----Original Message-----
> > > > From: vijay garla [mailto:vngarla@gmail.com]
> > > > Sent: Wednesday, February 05, 2014 9:30 PM
> > > > To: ytex-users@googlegroups.com; ctakes-dev@incubator.apache.org;
> > > > vlad.valtchinov@gmail.com
> > > > Subject: Re: YTEX cTAKES 3.1.1 ready
> > > >
> > > > Hi Vlad,
> > > >
> > > > I updated the umls install guide; see
> > > > https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> > > >
> > > > I would prefer to add the docs in the ctakes confluence, but as
> > > > far as I can tell, I don't have write access there - can somebody
> > > > give me write privileges on the ctakes confluence site?
> > > >
> > > > There was a bug in the umls install; copy
> > > > https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml
> > > > over the corresponding file in your ctakes-3.1.2 install
> > > > (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.
> > > > The import is currently running on the UMLS 2013AA (I assume this
> > > > will complete without issues as long as the umls schema hasn't
> > > > changed from 2012).
> > > >
> > > > What trial and error did you have to go through to build the distro?
> > > >
> > > > -vj
> > > >
> > > >
> > > > On Wed, Feb 5, 2014 at 5:33 PM, vijay garla <vngarla@gmail.com> wrote:
> > > >
> > > > > Hi Vlad,
> > > > >
> > > > > sorry that the instructions aren't clear.
> > > > >
> > > > > re 1) What I am trying to say is install
> > > > > apache-ctakes-3.2.0-snapshot as usual (this is unchanged from
> > > > > 3.1.1).  After that you still have to apply the lib and
> > > > > resources (these are things that cannot be distributed via apache).
> > > > >
> > > > > re 2) Yes, I need to update those docs.  Hopefully will get to
> > > > > that at some point.  However, I assume you already have a UMLS
> > > > > DB (also assume SQL Server).  If you can't/don't want to use
> > > > > your existing umls DB, please tell me.  Then I'll prioritize
> > > > > upgrading the doc on importing the umls tables (the scripts are
> > > > > there).
> > > > >
> > > > > best,
> > > > >
> > > > > VJ
> > > > >
> > > > >
> > > > > On Wed, Feb 5, 2014 at 4:44 PM, <vlad.valtchinov@gmail.com> wrote:
> > > > >
> > > > >> Hi VJ-
> > > > >>
> > > > >> so, with trial and error we were able to make the distribution and
> > > > >> now have the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
> > > > >>
> > > > >> Here's what's unclear.
> > > > >>
> > > > >> 1. Is this now the only (combined) thing that you need for
> > > > >> ctakes 3.1.1 + Ytex?
> > > > >> The current documentation
> > > > >> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1),
> > > > >> which most probably is outdated, talks about installing cTakes
> > > > >> 3.1.1 first and then applying 2 SNAPSHOT archives
> > > > >> (downloadable), lib and resources.
> > > > >> This is a point of confusion.
> > > > >>
> > > > >> 2. The directions to import UMLS subset are then outdated as well.
> > > > >> Maybe one should use the old version (ctakes 2.5 and ytex 0.8)
> > > > >> to import the RRF files for the UMLS subset and then just use
> > > > >> the resulting db. Thoughts?
> > > > >>
> > > > >> Thanks,
> > > > >> Vlad Valtchinov
> > > > >> Brigham Rad
> > > > >>
> > > > >>
> > > > >> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
> > > > >>
> > > > >>> Hi Vlad,
> > > > >>>
> > > > >>>
> > > > >>> All of ytex has been moved into ctakes, it is currently in a
> > > > >>> branch (https://svn.apache.org/repos/asf/ctakes/branches/ytex).
> > > > >>> You don't have to install ytex-0.8 - instead you will have to
> > > > >>> build and install from the ytex branch to create your own
> > > > >>> distribution.  Steps 2 & 3 are correct.
> > > > >>>
> > > > >>> Although it is a pain, if you have the jdk, maven, and svn,
> > > > >>> you can easily build your own distro:
> > > > >>> * open a command prompt
> > > > >>> * make sure jdk, maven, and svn are in your path
> > > > >>> * cd to some directory where you want to check stuff out (I like
> > > > >>> c:\temp)
> > > > >>> * run the following commands
> > > > >>> rmdir /s /q ctakes
> > > > >>> svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
> > > > >>> cd ctakes
> > > > >>> mvn clean install -DskipTests
> > > > >>>
> > > > >>> And you will have the ctakes (with ytex) distro in
> > > > >>> ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
> > > > >>>
> > > > >>> What is the process for getting the ytex branch merged into
> > > > >>> trunk?
> > > > >>> As I mentioned, there are very few changes to other ctakes
> > > > >>> classes/types - this should be completely complementary and
> > > > >>> not affect any existing ctakes functionality.
> > > > >>>
> > > > >>> -vj
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> On Thu, Jan 30, 2014 at 4:56 PM, <vlad.va...@gmail.com> wrote:
> > > > >>>
> > > > >>>> Hi VJ--
> > > > >>>>
> > > > >>>> this is great!! Thanks for all the hard work on it!
> > > > >>>>
> > > > >>>> We're starting to look into the new install. For now we're
> > > > >>>> trying the binaries out.
> > > > >>>>
> > > > >>>> There were these questions about the proper install steps:
> > > > >>>>
> > > > >>>> 1. Do we first install ytex-0.8
> > > > >>>> 2. Then install the new cTakes 3.1.1 instance and also apply
> > > > >>>> the SNAPSHOT lib and resources zips
> > > > >>>> 3. Work our way to install the UMLS ontologies in the db
> > > > >>>>
> > > > >>>> It is not entirely clear from the new document
> > > > >>>> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
> > > > >>>> whether there's still a need to install ytex-0.8, or whether
> > > > >>>> YTEX has been entirely merged into cTakes.
> > > > >>>>
> > > > >>>> If the last statement is correct, there are missing parts,
> > > > >>>> i.e. in the UMLS install steps that are linked from the new
> > > > >>>> ctakes 3.1.1 document.
> > > > >>>>
> > > > >>>> Thanks,
> > > > >>>> vlad
> > > > >>>>
> > > > >>>>
> > > > >>>> On Friday, January 3, 2014 10:21:52 PM UTC-5, vijay garla wrote:
> > > > >>>>>
> > > > >>>>> Hello All,
> > > > >>>>>
> > > > >>>>> I have finished an initial cut at the port of YTEX to cTAKES
> > > > >>>>> 3.1.1.  Most of the YTEX functionality has been ported and
> > > > >>>>> integrated with cTAKES, and I've tested with MySQL and MS
> > > > >>>>> SQL Server (oracle tests pending).
> > > > >>>>>
> > > > >>>>> Most of the changes were made in new projects - very little
> > > > >>>>> existing cTAKES code has been modified.  The only
> > > > >>>>> non-trivial changes are in
> > > > >>>>> /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api
> > > > >>>>> - here I modified
> > > > >>>>> CharacterOffsetToLineTokenConverterCtakesImpl &
> > > > >>>>> SingleDocumentProcessorCtakes to deal with newlines within
> > > > >>>>> sentences correctly.  Can somebody take a look at the
> > > > >>>>> changes in the ytex branch?
> > > > >>>>>
> > > > >>>>> I believe that the branch
> > > > >>>>> https://svn.apache.org/repos/asf/ctakes/branches/ytex is
> > > > >>>>> ready to be merged into ctakes trunk, but would like other
> > > > >>>>> users to test it as well.  Questions:
> > > > >>>>>
> > > > >>>>> * How can I distribute the ctakes binary distribution to
> > > > >>>>> ytex users before the merge? Can we make the branch build
> > > > >>>>> available somewhere?  The binary distribution is too large
> > > > >>>>> to host on the ytex google code site (max 200 MB)
> > > > >>>>> * Non-ASF libraries - I have segregated these out into their
> > > > >>>>> own zip file that can be distributed via sourceforge.  As a
> > > > >>>>> stopgap, I can upload this to the ytex google code site, but
> > > > >>>>> would prefer to upload to sourceforge.
> > > > >>>>> * UMLS Derivatives - Ditto for these - would like to move to
> > > > >>>>> sourceforge.
> > > > >>>>> * Documentation - How can I update the confluence docs?  I
> > > > >>>>> would migrate the documentation from the google code website.
> > > > >>>>>
> > > > >>>>> Here are the installation instructions (putting the wagon in
> > > > >>>>> front of the horse ...)
> > > > >>>>>
> > > > >>>>> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
> > > > >>>>>
> > > > >>>>> Best,
> > > > >>>>>
> > > > >>>>> VJ
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>  --
> > > > >>>> You received this message because you are subscribed to the
> > > > >>>> Google Groups "ytex-users" group.
> > > > >>>> To unsubscribe from this group and stop receiving emails from
> > > > >>>> it, send an email to ytex-users+...@googlegroups.com.
> > > > >>>> To post to this group, send email to ytex-...@googlegroups.com.
> > > > >>>> To view this discussion on the web visit
> > > > >>>> https://groups.google.com/d/msgid/ytex-users/70f03a80-ce1a-4c0e-b35d-5116d1c93ea0%40googlegroups.com.
> > > > >>>>
> > > > >>>> For more options, visit https://groups.google.com/groups/opt_out.
> > > > >>>>
> > > > >>>
> > > > >>>  --
> > > > >> You received this message because you are subscribed to the
> > > > >> Google Groups "ytex-users" group.
> > > > >> To unsubscribe from this group and stop receiving emails from
> > > > >> it, send an email to ytex-users+unsubscribe@googlegroups.com.
> > > > >> To post to this group, send email to ytex-users@googlegroups.com.
> > > > >> To view this discussion on the web visit
> > > > >> https://groups.google.com/d/msgid/ytex-users/bc3bd705-55d2-4acd
> > > > >> -
> > > > a273-
> > > > >> a3b1a7b36241%40googlegroups.com .
> > > > >>
> > > > >> For more options, visit https://groups.google.com/groups/opt_out.
> > > > >>
> > > > >
> > > > >
> > >
> >
>
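
The disagreement in this thread is whether a newline should end a sentence. A minimal sketch of the two policies follows. It is not the cTAKES or YTEX sentence detector: the function names, the regular expression, and the sample note are all made up for illustration (the note is loosely modelled on the "Will consider [...] biopsy\nGiven [...]" case mentioned above).

```python
import re

# Hypothetical illustration only; neither function is cTAKES or YTEX code.
# Split point: whitespace that follows '.', '!' or '?'.
_AFTER_END_PUNCT = re.compile(r"(?<=[.!?])\s+")


def split_newline_terminated(text):
    """Policy A: every newline ends a sentence; also split after end punctuation."""
    sentences = []
    for line in text.split("\n"):
        for part in _AFTER_END_PUNCT.split(line.strip()):
            if part:
                sentences.append(part)
    return sentences


def split_punctuation_only(text):
    """Policy B: newlines are ordinary whitespace; split only after end punctuation."""
    flattened = " ".join(text.split())
    return [part for part in _AFTER_END_PUNCT.split(flattened) if part]


# Invented sample note: line 1 ends a phrase without punctuation,
# line 2 is a hard-wrapped sentence that continues on line 3.
note = "Will consider biopsy\nGiven family history, repeat imaging in\n3 months. Patient agrees."

for sentence in split_newline_terminated(note):
    print("A:", sentence)
for sentence in split_punctuation_only(note):
    print("B:", sentence)
```

On this sample, policy A keeps "Will consider biopsy" separate but cuts the wrapped sentence in two ("... repeat imaging in" / "3 months."), while policy B repairs the wrapped sentence but fuses "Will consider biopsy" with "Given family history ...". Neither rule is right for every note, which is consistent with the thread's conclusion that both detectors should remain available.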
