uima-user mailing list archives

From lokesh chanana <lokesh.chan...@orkash.com>
Subject Re: Data read problem in UIMA-AS
Date Fri, 20 Aug 2010 13:21:26 GMT
Hello Eddie,

My situation is quite similar to Figure 3, but I am using only one queue, one
service, and one runRemoteAsyncAE instance.

My deployment descriptor is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDeploymentDescription 
xmlns="http://uima.apache.org/resourceSpecifier">
<name>deploymentDescriptor</name>
<description/>
<version>1.0</version>
<vendor/>
<deployment protocol="jms" provider="activemq">
<casPool numberOfCASes="1" initialFsHeapSize="2000000"/>
<service>
<inputQueue endpoint="sentenceQueue" brokerURL="${defaultBrokerURL}" 
prefetch="0"/>
<topDescriptor>
<import location="flowControllerDescriptor.xml"/>
</topDescriptor>
<analysisEngine async="false">
<scaleout numberOfInstances="1"/>
<asyncPrimitiveErrorConfiguration>
<processCasErrors thresholdCount="0" thresholdWindow="0" 
thresholdAction="terminate"/>
<collectionProcessCompleteErrors timeout="0" 
additionalErrorAction="terminate"/>
</asyncPrimitiveErrorConfiguration>
</analysisEngine>
</service>
</deployment>
</analysisEngineDeploymentDescription>

And my flow controller descriptor is:

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
<frameworkImplementation>org.apache.uima.java</frameworkImplementation>
<primitive>false</primitive>
<delegateAnalysisEngineSpecifiers>
<delegateAnalysisEngine key="CollectionReader">
<import 
location="../../examples/descriptors/collection_reader/FileSystemCollectionReader.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="Sentence">
<import location="OpenNLPSentenceDetector.xml"/>
</delegateAnalysisEngine>
<delegateAnalysisEngine key="CasConsumer">
<import 
location="../../examples/descriptors/cas_consumer/XmiWriterCasConsumer.xml"/>
</delegateAnalysisEngine>
</delegateAnalysisEngineSpecifiers>
<analysisEngineMetaData>
<name>sentence</name>
<description>Implements a collection processing engine including collection reader,
analysis engines and cas consumer.</description>
<configurationParameters/>
<configurationParameterSettings/>
<flowConstraints>
<fixedFlow>
<node>CollectionReader</node>
<node>Sentence</node>
<node>CasConsumer</node>
</fixedFlow>
</flowConstraints>
<capabilities>
<capability>
<inputs/>
<outputs>
<type 
allAnnotatorFeatures="true">org.apache.uima.examples.tokenizer.Sentence</type>
</outputs>
<languagesSupported>
<language>en</language>
</languagesSupported>
</capability>
</capabilities>
<operationalProperties>
<modifiesCas>true</modifiesCas>
<multipleDeploymentAllowed>false</multipleDeploymentAllowed>
<outputsNewCASes>false</outputsNewCASes>
</operationalProperties>
</analysisEngineMetaData>
</analysisEngineDescription>

As I am quite new to this, kindly guide me on where I am going wrong.

Regards
Lokesh



On Friday 20 August 2010 06:33 PM, Eddie Epstein wrote:
> Sorry, it is a bit hard to understand your scenario. Please clarify using
> the examples in http://uima.apache.org/doc-uimaas-what.html and identify
> which best fits your situation: Figure 3, 4 or 5.
>
> Thanks,
> Eddie
>
> On Thu, Aug 19, 2010 at 6:12 AM, lokesh chanana
> <lokesh.chanana@orkash.com>  wrote:
>    
>> Hello,
>>
>> I am deploying UIMA-AS for testing purposes for now.
>>
>> My configuration includes:
>>
>> 1. Broker on one system <say Broker>
>> 2. Service on one system <say Service>
>> 3. Client on one system <say Client>
>>
>> My configuration:
>> =>  My deployment descriptor calls the flowControlAggregate.
>> =>  The flow control defines 3 delegates:
>>     1. A collection reader that reads data from /opt/apache-uima/examples/data
>>     2. My core analysis engine that processes the data
>>     3. XmiWriterCasConsumer.xml to generate output.
>>
>> I deployed them as:
>> <On Broker>  startBroker.sh
>>
>> <On Service>  deployAsyncService.sh /Path/to/deployment/descriptor.xml
>> -brokerURL tcp://<BROKER>:61616
>>
>> <On Client>  runRemoteAsyncAE.sh tcp://<BROKER>:61616 MeetingFinderQueue -c
>> /Path/to/collection/reader.xml -o result.xmi
>>
>>
>> Now the problems I am concerned about are:
>>
>> If I don't define a collection reader then I get no error, but it says only
>> one file is processed. (I know that this is because the client sends an empty
>> CAS.) This means that the client is sending the data in the form of a CAS. If
>> so, then why is a CAS collector compulsory in my flowControlAggregate? I tried
>> removing it, but the service didn't start.
>>
>>     I found one answer to this: that the CAS at the client is just sending a
>> reference to the CAS at the SERVICE.
>>     To test this I deleted all the data at
>> SERVICE (/opt/apache-uima/examples/data).
>>     Unexpectedly, my processing time dropped to about 20 sec. (it was
>> earlier 3200 sec. for 500 documents).
>>
>>     Now I have two cases.
>> Case 1: when the data at the client and the service is the same.
>>
>>> Processing is too slow.
>>> The XMI files generated are shorter.
>>> The XMI files are different from those created by the basic UIMA
>>> application.
>>> Processing at SERVICE is too high.
>>> Negligible processing at the client.
>>>
>> Case 2: when there is no data at the SERVICE end.
>>
>>> Processing is fast.
>>> The XMI files are the same as those created by local UIMA.
>>> Not much processing on either the client or the SERVICE end.
>>>        
>> Now, as things are not as I expected, I am sure I am wrong somewhere
>> conceptually. As I am quite new to UIMA-AS, any help is appreciated.
>>
>> Regards
>> Lokesh
>>
>>
>>      

