ctakes-commits mailing list archives

From seanfi...@apache.org
Subject svn commit: r1591988 - in /ctakes/sandbox/dictionarytool: doc/ resource/ resource/cachedbtemplate/ resource/memdbtemplate/
Date Fri, 02 May 2014 17:57:06 GMT
Author: seanfinan
Date: Fri May  2 17:57:05 2014
New Revision: 1591988

URL: http://svn.apache.org/r1591988
Log:
added template hsql dictionaries

Added:
    ctakes/sandbox/dictionarytool/resource/
    ctakes/sandbox/dictionarytool/resource/README.txt   (with props)
    ctakes/sandbox/dictionarytool/resource/cachedbtemplate/
    ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.properties   (with props)
    ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.script   (with props)
    ctakes/sandbox/dictionarytool/resource/cachedbtemplate/sqltool.rc   (with props)
    ctakes/sandbox/dictionarytool/resource/memdbtemplate/
    ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.properties   (with props)
    ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.script   (with props)
    ctakes/sandbox/dictionarytool/resource/memdbtemplate/sqltool.rc   (with props)
Modified:
    ctakes/sandbox/dictionarytool/doc/howto.txt

Modified: ctakes/sandbox/dictionarytool/doc/howto.txt
URL: http://svn.apache.org/viewvc/ctakes/sandbox/dictionarytool/doc/howto.txt?rev=1591988&r1=1591987&r2=1591988&view=diff
==============================================================================
--- ctakes/sandbox/dictionarytool/doc/howto.txt (original)
+++ ctakes/sandbox/dictionarytool/doc/howto.txt Fri May  2 17:57:05 2014
@@ -44,6 +44,10 @@ Also remember that hsqldb requires the e
 It is recommended that the defaults are used, but you are welcome to experiment with your own.
 
 
+If you are unfamiliar with hsqldb, there are two template / starting point databases in the resource/ directory.
+cachedbtemplate/ contains a template for a disk-cached dictionary, and memdbtemplate/ one for a fully in-memory dictionary.
+Using an in-memory dictionary is orders of magnitude faster than using a disk-cached one, but is not a good idea for very large (.5GB?) databases.
+
 
 There are a few other toys that can be found by perusing the source, such as a tool that creates a mapping of codes
 for like terms in different dictionaries:

Added: ctakes/sandbox/dictionarytool/resource/README.txt
URL: http://svn.apache.org/viewvc/ctakes/sandbox/dictionarytool/resource/README.txt?rev=1591988&view=auto
==============================================================================
--- ctakes/sandbox/dictionarytool/resource/README.txt (added)
+++ ctakes/sandbox/dictionarytool/resource/README.txt Fri May  2 17:57:05 2014
@@ -0,0 +1,27 @@
+This directory contains templates for two types of HSQL databases.
+These templates can be used as starting points for creating an HSQL database dictionary
+with a rare-word index using the input parameters
+-db   Output Database Url
+-tbl  Output Database Table
+
+cachedbtemplate/  contains the template .script and .properties files for a database that is stored
+on disk, with row caching for only the most frequent hits.
+All data is stored in one format and translated on demand, which slows down the lookup.
+
+memdbtemplate/    contains the template .script and .properties files for a database that is loaded
+completely into JVM memory upon startup.  Data is stored in POJOs and no translation is performed on lookup.
+In-memory databases are significantly faster than disk-cached ones, but should not be used if
+the dictionary is very large. This may mean >> 1 million rows, but depends upon your available memory.
+
+Both databases are named cTakesUmls and are set up to contain one rare-word table named ctakes_umls.
+The .properties file contains a line with readonly=false.  Leave this property as false (read/write) until you have
+finished populating the table, then set it to readonly=true, which (supposedly) increases lookup speed.
+
+The sqltool.rc files may be used to inspect the databases manually (sql) using the hsqldb_[version].jar with the command:
+java -cp hsqldb_[version].jar org.hsqldb.util.SqlTool --rcfile sqltool.rc cTakesUmls
+The id "cTakesUmls" is case-sensitive and you may need to edit the .rc file to point to the database directory.
+
+
+
+
+

Propchange: ctakes/sandbox/dictionarytool/resource/README.txt
------------------------------------------------------------------------------
    svn:eol-style = native
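
The readonly step described in the README above can be scripted. The following is a minimal sketch, not part of the commit: the helper name set_readonly is hypothetical, and it assumes the .properties file is edited while the database is not open.

```python
from pathlib import Path


def set_readonly(props_path, readonly):
    """Rewrite the readonly= line of an HSQLDB .properties file in place.

    All other lines, including the leading # comments, are kept as they are.
    """
    path = Path(props_path)
    value = "true" if readonly else "false"
    lines = path.read_text(encoding="iso-8859-1").splitlines()
    found = False
    for i, line in enumerate(lines):
        if line.strip().startswith("readonly="):
            lines[i] = "readonly=" + value
            found = True
    if not found:
        # The templates carry the line already; append it if a file lacks one.
        lines.append("readonly=" + value)
    path.write_text("\n".join(lines) + "\n", encoding="iso-8859-1")
```

A call such as set_readonly("resource/memdbtemplate/ctakesumls.properties", True) would be made once the table has been populated.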

Added: ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.properties
URL: http://svn.apache.org/viewvc/ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.properties?rev=1591988&view=auto
==============================================================================
--- ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.properties (added)
+++ ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.properties Fri May  2 17:57:05 2014
@@ -0,0 +1,17 @@
+#HSQL Database Engine 1.8.0.10
+#Fri Jan 24 14:13:18 EST 2014
+hsqldb.script_format=0
+runtime.gc_interval=0
+sql.enforce_strict_size=false
+hsqldb.cache_size_scale=8
+readonly=false
+hsqldb.nio_data_file=true
+hsqldb.cache_scale=14
+version=1.8.0
+hsqldb.default_table_type=memory
+hsqldb.cache_file_scale=1
+hsqldb.log_size=200
+modified=no
+hsqldb.cache_version=1.7.0
+hsqldb.original_version=1.8.0
+hsqldb.compatible_version=1.8.0

Propchange: ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.properties
------------------------------------------------------------------------------
    svn:eol-style = native

Added: ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.script
URL: http://svn.apache.org/viewvc/ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.script?rev=1591988&view=auto
==============================================================================
--- ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.script (added)
+++ ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.script Fri May  2 17:57:05 2014
@@ -0,0 +1,6 @@
+CREATE SCHEMA PUBLIC AUTHORIZATION DBA
+CREATE CACHED TABLE CTAKES_UMLS(CUI VARCHAR_IGNORECASE(12),TUI VARCHAR_IGNORECASE(48),RINDEX INTEGER,TCOUNT INTEGER,TEXT VARCHAR_IGNORECASE(255),RWORD VARCHAR_IGNORECASE(48))
+CREATE INDEX IDX_CTAKES_UMLS ON CTAKES_UMLS(RWORD)
+CREATE USER SA PASSWORD ""
+GRANT DBA TO SA
+SET WRITE_DELAY 10

Propchange: ctakes/sandbox/dictionarytool/resource/cachedbtemplate/ctakesumls.script
------------------------------------------------------------------------------
    svn:eol-style = native
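
To illustrate how the table and its RWORD index defined in the .script file above might be queried, here is a small sketch. It is not HSQLDB: SQLite (from the Python standard library) stands in for it, COLLATE NOCASE only approximates VARCHAR_IGNORECASE, and the row values, including the CUI, RINDEX, and TCOUNT, are made-up placeholders. The function names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite table shaped like CTAKES_UMLS (a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CTAKES_UMLS("
        "CUI TEXT COLLATE NOCASE, TUI TEXT COLLATE NOCASE, "
        "RINDEX INTEGER, TCOUNT INTEGER, "
        '"TEXT" TEXT COLLATE NOCASE, RWORD TEXT COLLATE NOCASE)'
    )
    # Same index as in the template: lookups go through the rare word.
    conn.execute("CREATE INDEX IDX_CTAKES_UMLS ON CTAKES_UMLS(RWORD)")
    # Placeholder row; none of these values come from a real dictionary.
    conn.execute(
        "INSERT INTO CTAKES_UMLS VALUES (?, ?, ?, ?, ?, ?)",
        ("C0000001", "T047", 1, 2, "heart attack", "attack"),
    )
    conn.commit()
    return conn


def lookup(conn, rare_word):
    """Return (CUI, TEXT) pairs whose RWORD matches, ignoring case."""
    rows = conn.execute(
        'SELECT CUI, "TEXT" FROM CTAKES_UMLS WHERE RWORD = ? ORDER BY CUI',
        (rare_word,),
    )
    return rows.fetchall()
```

Because the comparison uses the column's case-insensitive collation, lookup(conn, "ATTACK") and lookup(conn, "attack") find the same row.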

Added: ctakes/sandbox/dictionarytool/resource/cachedbtemplate/sqltool.rc
URL: http://svn.apache.org/viewvc/ctakes/sandbox/dictionarytool/resource/cachedbtemplate/sqltool.rc?rev=1591988&view=auto
==============================================================================
--- ctakes/sandbox/dictionarytool/resource/cachedbtemplate/sqltool.rc (added)
+++ ctakes/sandbox/dictionarytool/resource/cachedbtemplate/sqltool.rc Fri May  2 17:57:05 2014
@@ -0,0 +1,138 @@
+# $Id: sqltool.rc,v 1.22 2007/08/09 03:22:21 unsaved Exp $
+
+# This is a sample RC configuration file used by SqlTool, DatabaseManager,
+# and any other program that uses the org.hsqldb.util.RCData class.
+
+# You can run SqlTool right now by copying this file to your home directory
+# and running
+#    java -jar /path/to/hsqldb.jar mem
+# This will access the first urlid definition below in order to use a 
+# personal Memory-Only database.
+# "url" values may, of course, contain JDBC connection properties, delimited
+# with semicolons.
+
+# If you have the least concerns about security, then secure access to
+# your RC file.
+# See the documentation for SqlTool for various ways to use this file.
+
+# A personal Memory-Only (non-persistent) database.
+# urlid mem
+# url jdbc:hsqldb:mem:memdbid
+# username sa
+# password
+
+# A personal, local, persistent database.
+# urlid personal
+# url jdbc:hsqldb:file:${user.home}/db/personal;shutdown=true
+# username sa
+# password
+# When connecting directly to a file database like this, you should 
+# use the shutdown connection property like this to shut down the DB
+# properly when you exit the JVM.
+
+urlid cTakesUmls
+url jdbc:hsqldb:file:resource/cachedbtemplate/ctakesumls;shutdown=true
+username sa
+password
+
+
+
+# This is for a hsqldb Server running with default settings on your local
+# computer (and for which you have not changed the password for "sa").
+urlid localhost-sa
+url jdbc:hsqldb:hsql://localhost
+# At this time, sa is the default cTakes-GUI user
+username sa
+password
+
+
+
+# Template for a urlid for an Oracle database.
+# You will need to put the oracle.jdbc.OracleDriver class into your 
+# classpath.
+# In the great majority of cases, you want to use the file classes12.zip
+# (which you can get from the directory $ORACLE_HOME/jdbc/lib of any
+# Oracle installation compatible with your server).
+# Since you need to add to the classpath, you can't invoke SqlTool with
+# the jar switch, like "java -jar .../hsqldb.jar..." or 
+# "java -jar .../hsqlsqltool.jar...".
+# Put both the HSQLDB jar and classes12.zip in your classpath (and export!)
+# and run something like "java org.hsqldb.util.SqlTool...".
+
+#urlid cardiff2
+#url jdbc:oracle:thin:@aegir.admc.com:1522:TRAFFIC_SID
+#username blaine
+#password secretpassword
+#driver oracle.jdbc.OracleDriver
+
+
+
+# Template for a TLS-encrypted HSQLDB Server.
+# Remember that the hostname in hsqls (and https) JDBC URLs must match the
+# CN of the server certificate (the port and instance alias that follows 
+# are not part of the certificate at all).
+# You only need to set "truststore" if the server cert is not approved by
+# your system default truststore (which a commercial certificate probably
+# would be).
+
+#urlid tls
+#url jdbc:hsqldb:hsqls://db.admc.com:9001/lm2
+#username blaine
+#password asecret
+#truststore /home/blaine/ca/db/db-trust.store
+
+
+# Template for a Postgresql database
+#urlid blainedb
+#url jdbc:postgresql://idun.africawork.org/blainedb
+#username blaine
+#password losung1
+#driver org.postgresql.Driver
+
+# Template for a MySQL database.  MySQL has poor JDBC support.
+#urlid mysql-testdb
+#url jdbc:mysql://hostname:3306/dbname
+#username root
+#username blaine
+#password hiddenpwd
+#driver com.mysql.jdbc.Driver
+
+# Note that "databases" in SQL Server and Sybase are traditionally used for
+# the same purpose as "schemas" with more SQL-compliant databases.
+
+# Template for a Microsoft SQL Server database
+#urlid msprojsvr
+#url jdbc:microsoft:sqlserver://hostname;DatabaseName=DbName;SelectMethod=Cursor
+# The SelectMethod setting is required to do more than one thing on a JDBC
+# session (I guess Microsoft thought nobody would really use Java for 
+# anything other than a "hello world" program).
+# This is for Microsoft's SQL Server 2000 driver (requires mssqlserver.jar
+# and msutil.jar).
+#driver com.microsoft.jdbc.sqlserver.SQLServerDriver
+#username myuser
+#password hiddenpwd
+
+# Template for a Sybase database
+#urlid sybase
+#url jdbc:sybase:Tds:hostname:4100/dbname
+#username blaine
+#password hiddenpwd
+# This is for the jConnect driver (requires jconn3.jar).
+#driver com.sybase.jdbc3.jdbc.SybDriver
+
+# Template for Embedded Derby / Java DB.
+#urlid derby1
+#url jdbc:derby:path/to/derby/directory;create=true
+#username ${user.name}
+#password any_noauthbydefault
+#driver org.apache.derby.jdbc.EmbeddedDriver
+# The embedded Derby driver requires derby.jar.
+# There's also the org.apache.derby.jdbc.ClientDriver driver with URL
+# like jdbc:derby://<server>[:<port>]/databaseName, which requires
+# derbyclient.jar.
+# You can use \= to commit, since the Derby team decided (why???)
+# not to implement the SQL standard statement "commit"!!
+# Note that SqlTool can not shut down an embedded Derby database properly,
+# since that requires an additional SQL connection just for that purpose.
+# However, I've never lost data by not shutting it down properly.
+# Other than not supporting this quirk of Derby, SqlTool is miles ahead of ij.
\ No newline at end of file

Propchange: ctakes/sandbox/dictionarytool/resource/cachedbtemplate/sqltool.rc
------------------------------------------------------------------------------
    svn:eol-style = native
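
Since the README notes that the .rc file may need editing to point to the database directory, it can help to check which url a given urlid resolves to. The sketch below is a rough reader for the simple "key value" layout seen in the file above. The function name parse_rc is hypothetical, and the parsing rules (a urlid line starts a block, # lines are comments) are an assumption drawn from this sample, not from the SqlTool documentation.

```python
def parse_rc(text):
    """Return {urlid: {key: value}} for the uncommented blocks of an RC file."""
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out templates
        key, _, value = line.partition(" ")
        value = value.strip()
        if key == "urlid":
            # Assumption: each urlid line opens a new block of settings.
            current = blocks.setdefault(value, {})
        elif current is not None:
            current[key] = value  # e.g. url, username, password (may be empty)
    return blocks
```

For example, parse_rc(open("sqltool.rc").read())["cTakesUmls"]["url"] would give the JDBC url that the SqlTool command in the README connects to.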

Added: ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.properties
URL: http://svn.apache.org/viewvc/ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.properties?rev=1591988&view=auto
==============================================================================
--- ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.properties (added)
+++ ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.properties Fri May  2 17:57:05 2014
@@ -0,0 +1,17 @@
+#HSQL Database Engine 1.8.0.10
+#Fri Jan 24 14:13:18 EST 2014
+hsqldb.script_format=0
+runtime.gc_interval=0
+sql.enforce_strict_size=false
+hsqldb.cache_size_scale=8
+readonly=false
+hsqldb.nio_data_file=true
+hsqldb.cache_scale=14
+version=1.8.0
+hsqldb.default_table_type=memory
+hsqldb.cache_file_scale=1
+hsqldb.log_size=200
+modified=no
+hsqldb.cache_version=1.7.0
+hsqldb.original_version=1.8.0
+hsqldb.compatible_version=1.8.0

Propchange: ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.properties
------------------------------------------------------------------------------
    svn:eol-style = native

Added: ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.script
URL: http://svn.apache.org/viewvc/ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.script?rev=1591988&view=auto
==============================================================================
--- ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.script (added)
+++ ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.script Fri May  2 17:57:05 2014
@@ -0,0 +1,6 @@
+CREATE SCHEMA PUBLIC AUTHORIZATION DBA
+CREATE MEMORY TABLE CTAKES_UMLS(CUI VARCHAR_IGNORECASE(12),TUI VARCHAR_IGNORECASE(48),RINDEX INTEGER,TCOUNT INTEGER,TEXT VARCHAR_IGNORECASE(255),RWORD VARCHAR_IGNORECASE(48))
+CREATE INDEX IDX_CTAKES_UMLS ON CTAKES_UMLS(RWORD)
+CREATE USER SA PASSWORD ""
+GRANT DBA TO SA
+SET WRITE_DELAY 10

Propchange: ctakes/sandbox/dictionarytool/resource/memdbtemplate/ctakesumls.script
------------------------------------------------------------------------------
    svn:eol-style = native

Added: ctakes/sandbox/dictionarytool/resource/memdbtemplate/sqltool.rc
URL: http://svn.apache.org/viewvc/ctakes/sandbox/dictionarytool/resource/memdbtemplate/sqltool.rc?rev=1591988&view=auto
==============================================================================
--- ctakes/sandbox/dictionarytool/resource/memdbtemplate/sqltool.rc (added)
+++ ctakes/sandbox/dictionarytool/resource/memdbtemplate/sqltool.rc Fri May  2 17:57:05 2014
@@ -0,0 +1,138 @@
+# $Id: sqltool.rc,v 1.22 2007/08/09 03:22:21 unsaved Exp $
+
+# This is a sample RC configuration file used by SqlTool, DatabaseManager,
+# and any other program that uses the org.hsqldb.util.RCData class.
+
+# You can run SqlTool right now by copying this file to your home directory
+# and running
+#    java -jar /path/to/hsqldb.jar mem
+# This will access the first urlid definition below in order to use a 
+# personal Memory-Only database.
+# "url" values may, of course, contain JDBC connection properties, delimited
+# with semicolons.
+
+# If you have the least concerns about security, then secure access to
+# your RC file.
+# See the documentation for SqlTool for various ways to use this file.
+
+# A personal Memory-Only (non-persistent) database.
+# urlid mem
+# url jdbc:hsqldb:mem:memdbid
+# username sa
+# password
+
+# A personal, local, persistent database.
+# urlid personal
+# url jdbc:hsqldb:file:${user.home}/db/personal;shutdown=true
+# username sa
+# password
+# When connecting directly to a file database like this, you should 
+# use the shutdown connection property like this to shut down the DB
+# properly when you exit the JVM.
+
+urlid cTakesUmls
+url jdbc:hsqldb:file:resource/memdbtemplate/ctakesumls;shutdown=true
+username sa
+password
+
+
+
+# This is for a hsqldb Server running with default settings on your local
+# computer (and for which you have not changed the password for "sa").
+urlid localhost-sa
+url jdbc:hsqldb:hsql://localhost
+# At this time, sa is the default cTakes-GUI user
+username sa
+password
+
+
+
+# Template for a urlid for an Oracle database.
+# You will need to put the oracle.jdbc.OracleDriver class into your 
+# classpath.
+# In the great majority of cases, you want to use the file classes12.zip
+# (which you can get from the directory $ORACLE_HOME/jdbc/lib of any
+# Oracle installation compatible with your server).
+# Since you need to add to the classpath, you can't invoke SqlTool with
+# the jar switch, like "java -jar .../hsqldb.jar..." or 
+# "java -jar .../hsqlsqltool.jar...".
+# Put both the HSQLDB jar and classes12.zip in your classpath (and export!)
+# and run something like "java org.hsqldb.util.SqlTool...".
+
+#urlid cardiff2
+#url jdbc:oracle:thin:@aegir.admc.com:1522:TRAFFIC_SID
+#username blaine
+#password secretpassword
+#driver oracle.jdbc.OracleDriver
+
+
+
+# Template for a TLS-encrypted HSQLDB Server.
+# Remember that the hostname in hsqls (and https) JDBC URLs must match the
+# CN of the server certificate (the port and instance alias that follows 
+# are not part of the certificate at all).
+# You only need to set "truststore" if the server cert is not approved by
+# your system default truststore (which a commercial certificate probably
+# would be).
+
+#urlid tls
+#url jdbc:hsqldb:hsqls://db.admc.com:9001/lm2
+#username blaine
+#password asecret
+#truststore /home/blaine/ca/db/db-trust.store
+
+
+# Template for a Postgresql database
+#urlid blainedb
+#url jdbc:postgresql://idun.africawork.org/blainedb
+#username blaine
+#password losung1
+#driver org.postgresql.Driver
+
+# Template for a MySQL database.  MySQL has poor JDBC support.
+#urlid mysql-testdb
+#url jdbc:mysql://hostname:3306/dbname
+#username root
+#username blaine
+#password hiddenpwd
+#driver com.mysql.jdbc.Driver
+
+# Note that "databases" in SQL Server and Sybase are traditionally used for
+# the same purpose as "schemas" with more SQL-compliant databases.
+
+# Template for a Microsoft SQL Server database
+#urlid msprojsvr
+#url jdbc:microsoft:sqlserver://hostname;DatabaseName=DbName;SelectMethod=Cursor
+# The SelectMethod setting is required to do more than one thing on a JDBC
+# session (I guess Microsoft thought nobody would really use Java for 
+# anything other than a "hello world" program).
+# This is for Microsoft's SQL Server 2000 driver (requires mssqlserver.jar
+# and msutil.jar).
+#driver com.microsoft.jdbc.sqlserver.SQLServerDriver
+#username myuser
+#password hiddenpwd
+
+# Template for a Sybase database
+#urlid sybase
+#url jdbc:sybase:Tds:hostname:4100/dbname
+#username blaine
+#password hiddenpwd
+# This is for the jConnect driver (requires jconn3.jar).
+#driver com.sybase.jdbc3.jdbc.SybDriver
+
+# Template for Embedded Derby / Java DB.
+#urlid derby1
+#url jdbc:derby:path/to/derby/directory;create=true
+#username ${user.name}
+#password any_noauthbydefault
+#driver org.apache.derby.jdbc.EmbeddedDriver
+# The embedded Derby driver requires derby.jar.
+# There's also the org.apache.derby.jdbc.ClientDriver driver with URL
+# like jdbc:derby://<server>[:<port>]/databaseName, which requires
+# derbyclient.jar.
+# You can use \= to commit, since the Derby team decided (why???)
+# not to implement the SQL standard statement "commit"!!
+# Note that SqlTool can not shut down an embedded Derby database properly,
+# since that requires an additional SQL connection just for that purpose.
+# However, I've never lost data by not shutting it down properly.
+# Other than not supporting this quirk of Derby, SqlTool is miles ahead of ij.
\ No newline at end of file

Propchange: ctakes/sandbox/dictionarytool/resource/memdbtemplate/sqltool.rc
------------------------------------------------------------------------------
    svn:eol-style = native


