uima-user mailing list archives

From Michael Tanenblatt <sloth...@park-slope.net>
Subject Re: ConceptMapper: Performance Matrix
Date Mon, 23 Jun 2008 18:37:39 GMT
The short answer is "no". Not yet, anyway.

But, here are some things that might help. First, if dictionary  
loading times are long, you can use the command line tool supplied in  
the package to compile the dictionary, and use the compiled  
dictionary. If you do this, remember that you will need to change the  
AE descriptors to use the correct implementation of the dictionary  
loader, e.g.:

<externalResource>
	...
	<implementationName>org.apache.uima.conceptMapper.support.dictionaryResource.CompiledDictionaryResource_impl</implementationName>
	...
</externalResource>

That said, if you are using 13 dictionaries, that means you are
running 13 copies of ConceptMapper in your pipeline, which means that
you are traversing each file's text 13 times just for your
ConceptMapper invocations. If you could merge the dictionaries into
one, you should see a marked speedup. Clearly, a near-term
enhancement of ConceptMapper would be to enable the loading of
multiple dictionaries, which get merged at initialization time.
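
For anyone who wants to try the merge by hand in the meantime, here is a rough sketch of one way to do it. It is not part of ConceptMapper: the class name DictionaryMerger and its methods are made up for illustration. It assumes each dictionary is a well-formed XML file with a single root element whose child elements are the entries, and that all files share the same root element name; please check that against your own dictionary files. Attributes on the root element are not copied, and nothing records which file an entry came from, so if your 13 annotators currently produce different annotation types or feature values, you would need to encode that distinction in the entries themselves.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not shipped with ConceptMapper.
public class DictionaryMerger {

    // Copies every element child of each input's root under one shared root.
    public static Document merge(List<File> inputs) throws Exception {
        if (inputs.isEmpty()) {
            throw new IllegalArgumentException("need at least one dictionary file");
        }
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document merged = builder.newDocument();
        Element root = null;
        for (File input : inputs) {
            Element sourceRoot = builder.parse(input).getDocumentElement();
            if (root == null) {
                // Assumption: reuse the first file's root element name.
                root = merged.createElement(sourceRoot.getTagName());
                merged.appendChild(root);
            }
            NodeList children = sourceRoot.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip whitespace text nodes and comments; keep entries only.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    root.appendChild(merged.importNode(child, true));
                }
            }
        }
        return merged;
    }

    // Serializes the merged document to a file.
    public static void write(Document doc, File output) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.transform(new DOMSource(doc), new StreamResult(output));
    }

    // Usage: java DictionaryMerger merged.xml dict1.xml dict2.xml ...
    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: DictionaryMerger <output> <input>...");
            return;
        }
        List<File> inputs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            inputs.add(new File(args[i]));
        }
        write(merge(inputs), new File(args[0]));
    }
}
```

You would then point a single ConceptMapper instance at the merged file (optionally compiling it first, as described above), so each document's text is traversed once rather than 13 times.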

One side note: I am going to be on vacation starting on June 25 and  
will only have occasional access to email until I return on July 12. I  
will try to answer questions during that time when I do have access,  
but I really have no idea how often that will be.


On Jun 23, 2008, at 2:19 PM, Ahmed Abdeen Hamed wrote:

> Hello UIMA members, I am using the document analyzer example to
> analyze large files from multiple dictionaries. One of the raw files
> is 7.5MB. The number of dictionaries is 13, 1MB is the size of each.
> Is there some sort of a matrix that you can use to predict the
> execution time? Has anyone written a paper on the performance
> analysis of ConceptMapper?
> Please let me know if you can.
> Best wishes,
> --------------------------------------------------------
> Ahmed Abdeen Hamed
> Scientific Informatics Project Leader
> MBLWHOI Library
> Marine Biological Laboratory
> 7 MBL Street Woods Hole, MA 02543 USA
> +1 508 289 7676
> --
> email: abdeen@mbl.edu
> --

