uima-user mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: UIMA Addons ConceptMapper - where to start
Date Thu, 18 Sep 2014 15:24:44 GMT
Hi Debbie,

Thanks for sharing with the community :-)

On 9/18/2014 5:56 AM, Debbie Zhang wrote:
> Marshall, you are right. It is the import problem.
> It took me a little while to figure this out. So I think I should share with
> others so new users won't have this problem again. 
> The main example descriptor in ConceptMapper is: 
> desc/analysis_engine/primitive/ConceptMapperOffsetTokenizer.xml
> To make it work, apart from copying the descriptor xml files to your desc
> folder, you need to do the following:
> 1. Open the following files using a text editor and delete all elements
> inside the import tags
> 	OffsetTokenizerMatcher.xml
> 	ConceptMapperOffsetTokenizer.xml
> 2. Copy TokenAnnotation.xml from the ConceptMapper package in the source
> release to your desc folder
> 3. Open the above xml files using Component Descriptor Editor and manually
> add the deleted items back using the import tag
> Regards,
> Debbie
>> -----Original Message-----
>> From: Marshall Schor [mailto:msa@schor.com]
>> Sent: Wednesday, 12 March 2014 6:08 AM
>> To: user@uima.apache.org
>> Subject: Re: UIMA Addons ConceptMapper - where to start
>> On 3/4/2014 5:01 AM, Debbie Zhang wrote:
>>> Thanks Marshall!
>>>> Well, adding the UIMA nature to your project surprised me a bit.
>>>> You would do this if you wanted to package the results of your work
>>>> as a PEAR package, after you've finished creating an annotator
>>>> pipeline you wish to deliver to others to use.
>>> Yes, after creating annotators, we will use IBM Content Analytics to
>>> display the results. Therefore, we need to package them as PEAR
>>> packages.
>>>>> I can use "Component Descriptor Editor" to open DictTerm.xml and
>>>>> OffsetTokenizer.xml. However, when I use "Component Descriptor
>>>>> Editor" to open ConceptMapperOffsetTokenizer.xml and
>>>>> OffsetTokenizerMatcher.xml, I have the following error:
>>>>> The descriptor has one or more errors. Please fix in the source
>>>>> editor.
>>>>> ResourceInitializationException: An import could not be resolved.
>>>>> No file with name "org/apache/uima/conceptMapper/DictTerm.xml" was
>>>>> found in the class path or data path.
>>>>> (Descriptor:file:/C/Work/Java_Workspace/ConceptMapperTest/desc/analysis_engine/primitive/ConceptMapperOffsetTokenizer.xml)
>>>> The Component Descriptor Editor needs to be able to find descriptors
>>>> that are referenced.  Descriptors are referenced in two ways: by
>>>> name and by location.
>>>> By location is a relative reference; by name looks things up in the
>>>> classpath (in this case in the classpath Eclipse uses for the
>>>> project containing the descriptor being edited).  See:
>>>> http://uima.apache.org/d/uimaj-2.5.0/references.html#ugr.ref.xml.component_descriptor.imports
>>>> The simplest thing to do to correct this kind of error is to put the
>>>> directory containing the referenced descriptor on Eclipse's source
>>>> class path.  (You do this by some menu action, like right-clicking
>>>> the directory containing the descriptors you want to be able to find,
>>>> in the Package Explorer view of Eclipse, and selecting "Build Path" ->
>>>> "add to Build Path".)
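To make the by-name versus by-location difference concrete, here is a minimal sketch in plain Java. It does not use the UIMA API; it only imitates the lookup rule described in the imports documentation: a by-name import has its dots turned into slashes and ".xml" appended before being looked up, while a by-location import is resolved relative to the descriptor that contains it. The class and method names, and the file:/work/... path, are made up for illustration, and the sketch ignores the UIMA data path entirely.

```java
import java.net.URI;
import java.net.URL;

/** Illustration only: imitates how a descriptor import is turned into something to look up. */
class ImportLookupSketch {

    /** By-name import: dots become slashes and ".xml" is appended. */
    static String nameToResourcePath(String importName) {
        return importName.replace('.', '/') + ".xml";
    }

    /** By-location import: resolved relative to the descriptor that contains the import. */
    static URI resolveLocation(URI importingDescriptor, String location) {
        return importingDescriptor.resolve(location);
    }

    /** Classpath lookup for a by-name import; returns null when nothing matches. */
    static URL findOnClasspath(String importName) {
        ClassLoader loader = ImportLookupSketch.class.getClassLoader();
        return loader.getResource(nameToResourcePath(importName));
    }

    public static void main(String[] args) {
        String name = "org.apache.uima.conceptMapper.DictTerm";

        // The resource path that has to exist somewhere on the classpath.
        System.out.println(nameToResourcePath(name));

        // null here means the by-name import cannot be resolved from this classpath.
        System.out.println(findOnClasspath(name));

        // Hypothetical descriptor location, used only to show relative resolution.
        URI descriptor = URI.create(
                "file:/work/desc/analysis_engine/primitive/ConceptMapperOffsetTokenizer.xml");
        System.out.println(resolveLocation(descriptor, "DictTerm.xml"));
    }
}
```

If the second line printed is null, the directory (or jar) that holds org/apache/uima/conceptMapper/DictTerm.xml is not on the classpath being searched, which is the situation the error message above describes.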
>>> As the uima-an-conceptMapper.jar is in the class path, DictTerm.xml
>>> and other descriptors can be found by name. However, due to
>>> the file structure in the jar file, the file is listed under by-name xml
>>> resource as "analysis_engine/primitive/DictTerm.xml" instead of
>>> "org/apache/uima/conceptMapper/DictTerm.xml". I think that is where
>>> that error came from.
>>> The reason I wanted to open these descriptors was that I wanted to see
>>> how they can be used. So I also created a new descriptor,
>>> "TestConceptMapperDescriptor.xml", and tried to use the ConceptMapper
>>> annotator there. I am able to import
>>> analysis_engine.primitive.DictTerm at the "Type System" tab. Under the
>>> "Capabilities" tab, I set DictTerm as Input and
>>> TestConceptMapperDescriptor as Output. This is as far as I can go. I
>>> am not sure how to set the dictionary to be used by DictTerm or how
>>> to set the configuration parameters.
>>> I am not able to import other descriptors such as OffsetTokenizer.xml;
>>> I get the following error:
>>> An error was caused by adding Import(s); operation cancelled. Please
>>> correct the Error and retry.
>>> ResourceInitializationException: An object of class
>>> org.apache.uima.resource.metadata.TypeSystemDescription was requested,
>>> but the XML input contained an object of class
>>> org.apache.uima.analysis_engine.impl.TaeDescription_impl.
>> This error probably means the descriptor has an import statement that
>> points to (I'm guessing) OffsetTokenizer, which is not a type system
>> descriptor.
>> In the main descriptor (the one with import statements), there are
>> multiple xml elements which can contain imports.  Is the import in the
>> right spot?  If you can't figure this out, can you post the relevant
>> descriptors?
>> -Marshall
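One quick way to check whether a file can be imported as a type system is to look at its root element. The sketch below is plain Java with no UIMA dependency and simply reports the root element's local name. It assumes, as I understand the descriptor format, that type system descriptors have a typeSystemDescription root, while analysis engine descriptors use analysisEngineDescription or the older taeDescription. The class name, method names, and the inline sample XML are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Illustration only: reports what kind of descriptor an XML document appears to be. */
class DescriptorKindSketch {

    /** Returns the local name of the root element of the given XML text. */
    static String rootElementOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so getLocalName() is populated
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getLocalName();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse descriptor XML", e);
        }
    }

    /** Assumption: only this root element marks a type system descriptor. */
    static boolean looksLikeTypeSystem(String xml) {
        return "typeSystemDescription".equals(rootElementOf(xml));
    }

    public static void main(String[] args) {
        // Made-up, empty descriptors; real ones carry much more content.
        String ns = "http://uima.apache.org/resourceSpecifier";
        String engine = "<taeDescription xmlns=\"" + ns + "\"/>";
        String types = "<typeSystemDescription xmlns=\"" + ns + "\"/>";

        System.out.println(rootElementOf(engine) + " importable as type system: "
                + looksLikeTypeSystem(engine));
        System.out.println(rootElementOf(types) + " importable as type system: "
                + looksLikeTypeSystem(types));
    }
}
```

A file whose root is not typeSystemDescription belongs in a different import spot in the main descriptor (for example as a delegate analysis engine), not under the type system imports.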
>>> Thanks Marshall for your help again - much appreciated!
>>> Regards,
>>> Debbie
