uima-user mailing list archives

From "reshu.agarwal" <reshu.agar...@orkash.com>
Subject Re: status Lost=1 in DUCC
Date Thu, 27 Mar 2014 05:31:48 GMT
On 03/26/2014 10:06 PM, Lou DeGenaro wrote:
> Hi Reshu,
>
> re: your answers to 5 & 6
>
> 6a. Is the data that populates the CAS the "name" of a document or the
> document itself?  (The expected use of DUCC is to *not* pass the
> document contents which may, for example, be very large)
>
> 6b.  If it is a "name" or the like, is that something you can share so I
> can try to reproduce here?
>
> Lou.
>
>
> On Wed, Mar 26, 2014 at 9:20 AM, reshu.agarwal <reshu.agarwal@orkash.com> wrote:
>
>> Hi Lou,
>>
>>
>> On 03/26/2014 04:27 PM, Lou DeGenaro wrote:
>>
>>> Hi Reshu,
>>>
>>> The good news is that DUCC is functional since 1.job works.  So we need to
>>> find out why your particular job fails.
>>>
>>> A few more questions:
>>>
>>> 5. Does your job consist of multiple work items (CASes), and do any of
>>> them succeed?
>>>
>> My job consists of multiple work items, and I have also tried a job with a
>> single document. Both types of job have succeeded many times, but I hit
>> this problem on one particular document within a job. If I exclude this
>> document, the job succeeds.
>>
>>
>>> 6. DUCC has a Job Driver (JD) that employs your CollectionReader (CR) to
>>> fetch CASes that are sent via a broker for processing by one of the
>>> distributed Job Processes (JPs) that each run a copy of your
>>> AnalysisEngine (AE).  Normally, as Eddie points out, these CASes comprise
>>> some index that's interpreted by the assigned JP to know which data is to
>>> be worked on.  For example, say you have 100 documents, each 5GB in size,
>>> named doc.1, doc.2,...doc.100.  Your CR should not pass the actual 5GB
>>> document, but rather "doc.1".  Is that the kind of scheme you are
>>> employing?
>>>
>> Lou, I am fetching batch data from the database and sending a reference
>> from the result set to the CAS. I am not using file processing.
>>
>>> 7. Do you have a small test case that you can share that reliably
>>> demonstrates the problem?
>>>
>> Test Case:
>>
>> I have two systems within the DUCC cluster, with 20 GB RAM each.
>> I have defined the job with this configuration:
>>
>> classpath_order         ducc-before-user
>> driver_descriptor_CR    ../collection_reader/DBCollectionReader.xml
>> process_deployments_max         6
>> process_descriptor_AE   ../aeAggregate
>> process_descriptor_CC   ../cas_consumer/CASConsumer
>> process_failures_limit  50
>> process_memory_size     4
>> process_per_item_time_max       3
>> process_thread_count    3
>> specification   22.job
>> working_directory       ../ducc/Uima_ducc
>>
>>
>>
>> I am fetching data from the database in the CR. After the CR's getNext()
>> method executes for this particular document, a warning message like this
>> is printed in JD.log:
>>
>> Mar 26, 2014 9:40:25 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl sendAndReceiveCAS
>> WARNING:
>>
>> The document remains in the queue for 5 minutes, i.e. for as long as the
>> queue waiting time.
>>
>> Then, if the batch size is 100, the job shows lost=1; if it is 200, the
>> document remains in the queue until I forcefully terminate the job.
>>
>>
>>
>>> Lou.
>>>
>>>
>>>
>>>
>>> On Wed, Mar 26, 2014 at 5:31 AM, reshu.agarwal <reshu.agarwal@orkash.com> wrote:
>>>   On 03/20/2014 06:35 PM, Lou DeGenaro wrote:
>>>>   Where does the warning appear, in a log file in the job's log
>>>>> directory?  Is there any other information related to that warning?
>>>>>
>>>>>   Hi Lou,
>>>> Answers to your questions are given below. I hope they help:
>>>>
>>>>
>>>> 1. Are you able to run a simple job, such as 1.job from the examples
>>>> directory successfully?
>>>>
>>>> Yes, I am able to run that simple job successfully.
>>>>
>>>>
>>>> 2. Where does the warning appear, in a log file in the job's log
>>>> directory?  Is there any other information related to that warning?
>>>>
>>>> This warning appears in JD.log file.
>>>>
>>>> After all the initialization messages, these messages appear:
>>>>
>>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngine_impl setupConnection
>>>> INFO: UIMA AS Client Created Shared Connection To Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false
>>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngine_impl initializeProducer
>>>> INFO: Initializing JMS Message Producer. Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false Queue Name: ducc.jd.queue.1317
>>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngine_impl initializeConsumer
>>>> INFO: Initializing JMS Message Consumer. Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false Queue Name: ID:S144-36678-1395807465286-7:1:1
>>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngine_impl initialize
>>>> INFO: Asynchronous Client Has Been Initialized. Serialization Strategy: [SerializationStrategy] Ready To Process.
>>>>
>>>> and then only this warning message appears:
>>>>
>>>> Mar 26, 2014 9:49:27 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl sendAndReceiveCAS
>>>> WARNING:
>>>>
>>>> followed by these messages:
>>>>
>>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl stop
>>>> INFO: Stopping Asynchronous Client.
>>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl stop
>>>> INFO: Asynchronous Client Has Stopped.
>>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl$SharedConnection destroy
>>>> INFO: UIMA AS Client Shared Connection Has Been Closed
>>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngine_impl stop
>>>>
>>>>
>>>>
>>>> 3. Are there any exceptions in any of the logs in the job's log
>>>> directory?
>>>>
>>>> Yes. After this warning appears, all documents from the DB collection
>>>> reader are processed successfully except this particular document.
>>>> Then this message shows in the log file of one of the processes:
>>>>
>>>> Mar 26, 2014 9:54:04 AM org.apache.uima.adapter.jms.activemq.JmsOutputChannel$ConnectionTimer startSessionReaperTimer.run
>>>> INFO: Thread: 210 Component: CorefernceAggDescriptor Jms Session Inactivity Timeout: 5 Minutes on Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&closeAsync=false
>>>>
>>>> I think this is due to that warning.
>>>>
>>>>
>>>> 4. Does your job use a version of UIMA/UIMA-AS that is different than the
>>>> one used by DUCC?
>>>>
>>>> I am using DUCC version 1.0.0 and UIMA version 2.4.2. I have not been
>>>> able to find out which UIMA version DUCC itself uses.
>>>>
>>>>
>>>> --
>>>> Thanks and Regards,
>>>> Reshu Agarwal
>>>> Software Engineer
>>>> Orkash Services Pvt Ltd
>>>>
>>>>
>>>>
>> Reshu.
>>
Hi Lou,

I am sending the document reference as in the code given below:

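// v_result is the ResultSet returned by the database query.
// getString() already returns a String, so no extra toString() is needed.
String originalText = v_result.getString("content");

JCas jcas;
try {
    jcas = aCAS.getJCas();
} catch (CASException e) {
    throw new CollectionException(e);
}

// Sets the CAS document text to the value of the "content" column.
jcas.setDocumentText(originalText);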
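
For comparison, the by-name scheme Lou describes in question 6 might look roughly like the sketch below. It is only a sketch under assumptions: the WorkItemRef class, the "documents" table name, and the "table:id" text format are hypothetical and are not part of DUCC or UIMA. The idea is that the CR would pass the result of encode() to jcas.setDocumentText(...) instead of the content, and each JP would call decode() and fetch that row from the database itself.

```java
import java.util.Objects;

/** Hypothetical helper: a small reference to one database row (not a DUCC/UIMA class). */
final class WorkItemRef {
    private final String table;
    private final long id;

    WorkItemRef(String table, long id) {
        this.table = Objects.requireNonNull(table, "table");
        this.id = id;
    }

    String getTable() { return table; }

    long getId() { return id; }

    /** Short text the CR could place in the CAS instead of the row content. */
    String encode() {
        return table + ":" + id;
    }

    /** Inverse of encode(); a JP could call this and then load the row itself. */
    static WorkItemRef decode(String text) {
        int sep = text.lastIndexOf(':');
        if (sep < 1) {
            throw new IllegalArgumentException("not a work-item reference: " + text);
        }
        String table = text.substring(0, sep);
        long id = Long.parseLong(text.substring(sep + 1));
        return new WorkItemRef(table, id);
    }
}
```

With something like this, the CAS sent through the broker would carry only a few bytes (for example new WorkItemRef("documents", 42L).encode()), whatever the size of the content column.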

-- 
Thanks,
Reshu Agarwal

