uima-user mailing list archives

From lokesh chanana <lokesh.chan...@orkash.com>
Subject Re: Data read problem in UIMA-AS
Date Thu, 26 Aug 2010 05:13:55 GMT
Hello Eddie,

Once again, thanks for replying.
I tried removing the collection reader from the service. It now works the
same way it worked in my case 2.
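
For reference, here is a minimal sketch of how the delegate and flow sections of the flow controller descriptor look with the collection reader removed. It assumes the rest of the descriptor quoted further down is unchanged and that the CasConsumer delegate stays in the service; both are assumptions, not a confirmed configuration.

```xml
<!-- Sketch only: delegates of the aggregate without the CollectionReader entry -->
<delegateAnalysisEngineSpecifiers>
  <delegateAnalysisEngine key="Sentence">
    <import location="OpenNLPSentenceDetector.xml"/>
  </delegateAnalysisEngine>
  <delegateAnalysisEngine key="CasConsumer">
    <import location="../../examples/descriptors/cas_consumer/XmiWriterCasConsumer.xml"/>
  </delegateAnalysisEngine>
</delegateAnalysisEngineSpecifiers>

<!-- Sketch only: the fixed flow no longer lists the CollectionReader node -->
<flowConstraints>
  <fixedFlow>
    <node>Sentence</node>
    <node>CasConsumer</node>
  </fixedFlow>
</flowConstraints>
```

With this layout the documents are read by the collection reader passed to the client with -c, and each CAS travels to the service through the queue.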

My main concern is that in this case I need to send loads of data over
the network. Is there a way to keep my data on the machine hosting the
remote service and call my client without a collection reader, or to
provide only a reference or something similar, so that network traffic
is lower and processing is a bit faster? In practice I would never want
to download the data from my database to the client first and then send
it to the remote machine.

Secondly, when I define the collection reader within the remote
service, and the data is even on the local system, what makes the
processing go extremely slow? I was expecting it to be faster.
Am I wrong somewhere?

Regards,
Lokesh

On Saturday 21 August 2010 03:50 AM, Eddie Epstein wrote:
> Lokesh,
>
> The scenario in figure 3 uses a collection reader at the driver, and
> no collection reader in the remote service. Please try that.
>
> regards,
> Eddie
>
>
> On Fri, Aug 20, 2010 at 9:21 AM, lokesh chanana
> <lokesh.chanana@orkash.com>  wrote:
>    
>> Hello Eddie,
>>
>> My situation is quite similar to fig 3, but I am using only one queue, one
>> service and one runRemoteAsyncAE instance.
>>
>> My deployment descriptor is as follows:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <analysisEngineDeploymentDescription
>> xmlns="http://uima.apache.org/resourceSpecifier">
>> <name>deploymentDescriptor</name>
>> <description/>
>> <version>1.0</version>
>> <vendor/>
>> <deployment protocol="jms" provider="activemq">
>> <casPool numberOfCASes="1" initialFsHeapSize="2000000"/>
>> <service>
>> <inputQueue endpoint="sentenceQueue" brokerURL="${defaultBrokerURL}"
>> prefetch="0"/>
>> <topDescriptor>
>> <import location="flowControllerDescriptor.xml"/>
>> </topDescriptor>
>> <analysisEngine async="false">
>> <scaleout numberOfInstances="1"/>
>> <asyncPrimitiveErrorConfiguration>
>> <processCasErrors thresholdCount="0" thresholdWindow="0"
>> thresholdAction="terminate"/>
>> <collectionProcessCompleteErrors timeout="0"
>> additionalErrorAction="terminate"/>
>> </asyncPrimitiveErrorConfiguration>
>> </analysisEngine>
>> </service>
>> </deployment>
>> </analysisEngineDeploymentDescription>
>>
>> And my flow controller descriptor is:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
>> <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
>> <primitive>false</primitive>
>> <delegateAnalysisEngineSpecifiers>
>> <delegateAnalysisEngine key="CollectionReader">
>> <import
>> location="../../examples/descriptors/collection_reader/FileSystemCollectionReader.xml"/>
>> </delegateAnalysisEngine>
>> <delegateAnalysisEngine key="Sentence">
>> <import location="OpenNLPSentenceDetector.xml"/>
>> </delegateAnalysisEngine>
>> <delegateAnalysisEngine key="CasConsumer">
>> <import
>> location="../../examples/descriptors/cas_consumer/XmiWriterCasConsumer.xml"/>
>> </delegateAnalysisEngine>
>> </delegateAnalysisEngineSpecifiers>
>> <analysisEngineMetaData>
>> <name>sentence</name>
>> <description>Implements a collection processing engine including collection
>> reader,
>>                  analysis engines and cas consumer.</description>
>> <configurationParameters/>
>> <configurationParameterSettings/>
>> <flowConstraints>
>> <fixedFlow>
>> <node>CollectionReader</node>
>> <node>Sentence</node>
>> <node>CasConsumer</node>
>> </fixedFlow>
>> </flowConstraints>
>> <capabilities>
>> <capability>
>> <inputs/>
>> <outputs>
>> <type
>> allAnnotatorFeatures="true">org.apache.uima.examples.tokenizer.Sentence</type>
>> </outputs>
>> <languagesSupported>
>> <language>en</language>
>> </languagesSupported>
>> </capability>
>> </capabilities>
>> <operationalProperties>
>> <modifiesCas>true</modifiesCas>
>> <multipleDeploymentAllowed>false</multipleDeploymentAllowed>
>> <outputsNewCASes>false</outputsNewCASes>
>> </operationalProperties>
>> </analysisEngineMetaData>
>> </analysisEngineDescription>
>>
>> As I am quite new, kindly guide me on where I am going wrong.
>>
>> Regards
>> Lokesh
>>
>>
>>
>> On Friday 20 August 2010 06:33 PM, Eddie Epstein wrote:
>>      
>>> Sorry, it is a bit hard to understand your scenario. Please clarify
>>> using the examples in http://uima.apache.org/doc-uimaas-what.html and
>>> identify which best fits your situation: Figure 3, 4 or 5.
>>>
>>> Thanks,
>>> Eddie
>>>
>>> On Thu, Aug 19, 2010 at 6:12 AM, lokesh chanana
>>> <lokesh.chanana@orkash.com>    wrote:
>>>
>>>        
>>>> Hello,
>>>>
>>>> I am deploying UIMA-AS for testing purposes for now.
>>>>
>>>> My configuration includes
>>>>
>>>> 1. Broker on one system <say Broker>
>>>> 2. Service on one system <say Service>
>>>> 3. Client on one system <say Client>
>>>>
>>>> My configuration
>>>> =>    my deployment descriptor calls the flowControlAggregate.
>>>> =>    Flow control defines 3 delegates
>>>>     1. collection reader that reads data from /opt/apache-uima/examples/data
>>>>     2. my core analysis engine that processes the data
>>>>     3. XmiWriterCasConsumer.xml to generate output.
>>>>
>>>> i deployed them as
>>>> <On Broker>    startBroker.sh
>>>>
>>>> <On Service>    deployAsyncService.sh /Path/to/deployment/descriptor.xml -brokerURL tcp://<BROKER>:61616
>>>>
>>>> <On Client>    runRemoteAsyncAE.sh tcp://<BROKER>:61616 MeetingFinderQueue -c /Path/to/collection/reader.xml -o result.xmi
>>>>
>>>>
>>>> Now the problems I am concerned about are:
>>>>
>>>> If I don't define collection_Reader, I get no error, but it says only one
>>>> file is processed. (I know this is because the client sends an empty
>>>> CAS.) This means that the client is sending the data in the form of a CAS.
>>>> If so, why is a CAS collector compulsory in my flowControlAggregator? I
>>>> tried removing it, but the service didn't start.
>>>>
>>>>     I found one answer to this: that the CAS at the client is just sending
>>>> a reference to the CAS at SERVICE.
>>>>     To test this I deleted all the data at
>>>> SERVICE (/opt/apache-uima/examples/data).
>>>>     Unexpectedly, my processing time dropped to about 20 sec (it was
>>>> earlier 3200 sec for 500 documents).
>>>>
>>>>     Now I have two cases.
>>>>
>>>> Case 1: when the data at client and service is the same.
>>>> =>    processing is too slow
>>>> =>    the XMI files generated are shorter
>>>> =>    the XMI files are different from those created by the basic UIMA
>>>>       application
>>>> =>    processing at SERVICE is too high
>>>> =>    negligible processing at client
>>>>
>>>> Case 2: when there is no data at the SERVICE end.
>>>> =>    processing is fast
>>>> =>    the XMI files are the same as those created by local UIMA
>>>> =>    not much processing on either the client or the SERVICE end
>>>>
>>>>>            
>>>> Now, as things are not as I expected, I am sure I am conceptually wrong
>>>> somewhere. As I am quite new to UIMA-AS, any help is appreciated.
>>>>
>>>> Regards
>>>> Lokesh
>>>>
>>>>
>>>>
>>>>          
>>      

