ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: Offline access
Date Mon, 16 May 2016 15:43:59 GMT
Hi John,

There is a dictionary database builder in the sandbox: http://svn.apache.org/repos/asf/ctakes/sandbox/dictionary-gui/
Basically, throw it into your IDE and launch the main class.  You will see the GUI.  Point
to your UMLS root directory and your ctakes installation root.  Select your vocabularies.
Select your TUIs.  Enter a name for the dictionary and click "Go".

I would love to have some formal online documentation for this, but haven't had the time.
 Any help with that would be appreciated.

Sean

-----Original Message-----
From: John Travis Green [mailto:john.travis.green@gmail.com] 
Sent: Monday, May 16, 2016 11:36 AM
Cc: dev@ctakes.apache.org
Subject: Re: Offline access

It seems odd they would require repeated checks when you can download the whole thing for
input into a db with mmsys.  How is the documentation on building the fast lookup from a
local copy of the UMLS? A quick glance at the website didn't reveal much.  YTEX reportedly
works this way, but I've been getting an error with the section annotator that others addressed
back in early 2015, but no resolution was posted to the listserv.  Thanks all for your help
on this. I have an active IRB here in the Army using ctakes, but DoD security requirements
are so strict the entire server is offline.  Best, John



Having a hosted web site that serves UMLS resources and checks for a license before they are
downloaded locally (via the NLM/UMLS license validation web service) is indeed the norm and the
more common approach.  (YTEX originally did this, and UMLS itself does so when one downloads its
resources.)  Responsibility for adhering to the UMLS license afterwards essentially rests with the
one downloading, since the check is done at that time.  The online, upon-initialization license
check and bundled solution for cTAKES was really done for convenience to end users and for
historical reasons.  It just boils down to who wants to build, manage, maintain, and host such a
site that distributes these specially formatted resources and enforces the license check before
downloading.



—Pei



> On May 16, 2016, at 9:12 AM, Finan, Sean <Sean.Finan@childrens.harvard.edu> wrote:

>  
> The agreement that the ctakes core group was able to achieve with the NLM (distributor
> of UMLS) was that ctakes would check a user's access rights upon every use of any database
> derived from the UMLS.  The reason for this was that the NLM did not want one valid UMLS user
> to download the database and then distribute it for use by unaccredited third parties.  We
> have stuck to that agreement.  Upon every initial load of either of the distributed ctakes
> dictionary modules the user's entered password is checked online with the NLM user registry.

>  
> I think that a great compromise would be if somebody could create a "remote checkout"
> tool, something that checks out a virtual license for use while you are on the road.  Maybe
> coordinate with the NLM on getting such a thing approved.  As ctakes is open source software
> you could start toying with such a client first.  To start, delegate to JdbcRareWordDictionary
> as does the UmlsJdbcRareWordDictionary, and delegate to JdbcConceptFactory as does the UmlsJdbcConceptFactory
> (for the -fast module).  Then point to the new trial "remote checkout" classes in your .xml
> setup file (the default being cTakesHsql.xml).  However, do NOT use these classes directly
> or in any production scenario as that would not abide by our agreement with the NLM.  Do not
> even check them into sandbox without us getting a new agreement to use such a system with
> the NLM.  I must emphasize that publicly doing so could cause us to lose our privileges to
> distribute a default dictionary.  You would still be able to download your own UMLS database
> and create your own dictionary for use with ctakes, but not every user can do that.  And when
> creating your client code, favor composition over inheritance, as the remote checkout client
> should not have an IS-A relationship, not that I can enforce anything that you do.
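
The composition-over-inheritance advice above can be sketched roughly as follows. This is only an illustration under stated assumptions: TermDictionary, LicenseLease and LeaseGuardedDictionary are hypothetical names invented for this example, not cTAKES classes, and nothing here reflects an arrangement approved by the NLM. The wrapper refuses every lookup unless a lease reports itself valid.

```java
// Hypothetical stand-in for a dictionary lookup interface; NOT the cTAKES API.
interface TermDictionary {
    java.util.List<String> lookup(String term);
}

// Hypothetical stand-in for a checked-out license whose validity can be queried.
interface LicenseLease {
    boolean isValid();
}

// Composition: the guarded dictionary HAS-A delegate instead of extending it,
// so callers can only reach the delegate through the license check.
final class LeaseGuardedDictionary implements TermDictionary {
    private final TermDictionary delegate;
    private final LicenseLease lease;

    LeaseGuardedDictionary(TermDictionary delegate, LicenseLease lease) {
        this.delegate = java.util.Objects.requireNonNull(delegate);
        this.lease = java.util.Objects.requireNonNull(lease);
    }

    @Override
    public java.util.List<String> lookup(String term) {
        // Refuse to answer at all when the lease is not valid.
        if (!lease.isValid()) {
            throw new IllegalStateException("license lease is missing or expired");
        }
        return delegate.lookup(term);
    }
}
```

Because the guard holds its delegate as a private field, it can wrap any dictionary implementation without inheriting that implementation's public surface, which is the HAS-A shape the message recommends.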

>  
> I repeat, NEVER use the JdbcRareWordDictionary and/or JdbcConceptFactory directly unless
> you are pointing to a database that was not created using the UMLS as a source.

>  
> Sean

>  
> -----Original Message-----
> From: Geise, Brandon D. [mailto:bdgeise@geisinger.edu]
> Sent: Monday, May 16, 2016 8:01 AM
> To: dev@ctakes.apache.org
> Subject: RE: Offline access

>  
> I haven't tried it, and this is a guess based on reading the code, but you might be able to
> change the dictionary implementation name in the xml file from UmlsJdbcRareWordDictionary
> to ConceptFactory, since Umls factory implements from ConceptFactory.

>  
> -----Original Message-----
> From: Miller, Timothy [mailto:Timothy.Miller@childrens.harvard.edu]
> Sent: Saturday, May 14, 2016 2:28 PM
> To: dev@ctakes.apache.org
> Subject: Re: Offline access

>  
> Well, before we had online verification, ctakes required downloading umls, extracting
> the right subset, and building a database for the dictionary tool. You can still do that -
> it is often necessary for use cases that our default dictionary doesn't cover, and
> while I'm not sure of the state of the documentation, there have been several threads on the list
> about it. I think if you do it this way you can skip the UMLS verification step (though I
> don't remember exactly how that works) because you will have been verified at download time.

>  
> Can Sean or someone verify that this is true? If he builds his own dictionary (with the
> same subsets) can he skip the online verification?
> Thanks
> Tim

>  
> ________________________________________
> From: John Travis Green <john.travis.green@gmail.com>
> Sent: Saturday, May 14, 2016 10:12 AM
> To: dev@ctakes.apache.org
> Subject: Offline access

>  
> I have a DoD use case that requires offline UMLS verification. Anyone accomplish this
> yet? I recall some chatter a while back but initial flirtations with Google were unsuccessful.
> Thanks! John



