ctakes-dev mailing list archives

From William Karl Thompson <...@northwestern.edu>
Subject RE: InputSteam instead of java.io.File
Date Tue, 11 Jun 2013 18:07:42 GMT
Issue (1) is something I've encountered too, in the SuffixMaxentModelResourceImpl class. There
is a call to DataResource.getUri() whose result is passed to new File(...), which doesn't work
if the resource is located in a jar file. Replacing this with the following code (starting on
line 55) fixed the problem:

        //File modelFile = new File(dr.getUri());
        InputStream is = dr.getInputStream();
        DataReader dataReader = new PlainTextFileDataReader(is);
        GISModelReader modelReader = new GISModelReader(dataReader);
        iv_maxentModel = modelReader.getModel();
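
For anyone who wants to reproduce the failure outside cTAKES, here is a minimal sketch of why
the File-based approach breaks. The class name JarUriDemo and the helper resolvesToFile are
made up for illustration and are not part of cTAKES or UIMA; the jar path is just an example.
As far as I can tell, the File(URI) constructor rejects any URI that is not a hierarchical
"file:" URI, which is exactly what a resource inside a jar looks like.

```java
import java.io.File;
import java.net.URI;

// Hypothetical demo class, not part of cTAKES or UIMA.
final class JarUriDemo {

    // Returns true when the URI can be converted to a java.io.File,
    // false when the File(URI) constructor rejects it.
    static boolean resolvesToFile(URI uri) {
        try {
            new File(uri);
            return true;
        } catch (IllegalArgumentException e) {
            // File(URI) throws this for opaque URIs and for schemes other
            // than "file", e.g. jar:file:/path/to/some.jar!/entry
            return false;
        }
    }

    public static void main(String[] args) {
        // A resource on disk: converts to a File without trouble.
        URI onDisk = new File("model.txt").toURI();
        // A resource packaged in a jar (example path only): cannot become a File.
        URI inJar = URI.create("jar:file:/tmp/models.jar!/sd.mod");

        System.out.println("on disk resolves: " + resolvesToFile(onDisk));
        System.out.println("in jar resolves:  " + resolvesToFile(inJar));
    }
}
```

Reading through an InputStream avoids the conversion entirely, which is why the change above
works for both cases.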

-----Original Message-----
From: Chen, Pei [mailto:Pei.Chen@childrens.harvard.edu] 
Sent: Tuesday, June 11, 2013 12:50 PM
To: dev@ctakes.apache.org
Subject: InputSteam instead of java.io.File

While working on the test cases in cTAKES, I've encountered a couple of issues and have some suggestions:

1)      Using File or Url.getRawPath() becomes problematic when resources are read from jars on
the classpath, because they cannot be resolved to a physical File.

a.       Suggestion: Wherever possible, replace loading of resources via java.io.File with
InputStream instead. We can add a new method in the FileLocator util and deprecate the
old File method.

2)      Sentence Detector is still using the OpenNLP 1.4 mechanism of loading its model.

a.       Suggestion: Let's update it to use the new 1.5 way, similar to POSTagger.  (Remove
classes that are no longer required: SuffixMaxentModelResourceImpl, MaxentModelResource,
SuffixSensitiveGISModelReader, etc.)

Certain unit tests fail because their resources can't be resolved from jars on the classpath:
the code explicitly looks for a File on disk instead of an input stream.  Solving this
properly had a cascading effect and required a lot more changes, but it's probably a good
time to update those projects anyhow.
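
To make suggestion 1a concrete, here is a rough sketch of what a stream-based lookup could
look like. This is not the existing FileLocator API: the class name StreamLocator and the
method getAsStream are placeholders, and the lookup order (classpath first, then the file
system) is an assumption about what we would want.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;

// Hypothetical helper; the names are placeholders, not existing cTAKES code.
final class StreamLocator {

    // Opens the resource as a stream. The classpath is tried first, so
    // resources packaged inside jars work; a plain file on disk is the fallback.
    static InputStream getAsStream(String location) throws FileNotFoundException {
        InputStream in = StreamLocator.class.getClassLoader().getResourceAsStream(location);
        if (in != null) {
            return in;
        }
        File file = new File(location);
        if (file.isFile()) {
            return new FileInputStream(file);
        }
        throw new FileNotFoundException(
                "Not found on the classpath or the file system: " + location);
    }
}
```

Callers would then hand the returned stream to whatever reader they use and close it when
done, without caring whether the resource lives in a jar or in a directory.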

