uima-user mailing list archives

From Lou DeGenaro <lou.degen...@gmail.com>
Subject Re: status Lost=1 in DUCC
Date Thu, 27 Mar 2014 14:43:32 GMT
Hi Reshu,

It looks like the data being sent in your CAS are "values" (the document
contents) rather than "keys" (references to the documents)?  If so, this is
not really a "best practice" for DUCC use.

Is there an example of the failing data that you can share?

Also, could you please make available, in their entirety, all of the logs in
the user's log directory, for example jd.out.log, *JD*.log...

Lou.


On Thu, Mar 27, 2014 at 1:31 AM, reshu.agarwal <reshu.agarwal@orkash.com> wrote:

> On 03/26/2014 10:06 PM, Lou DeGenaro wrote:
>
>> Hi Reshu,
>>
>> re: your answers to 5 & 6
>>
>> 6a. Is the data that populates the CAS the "name" of a document or the
>> document itself?  (The expected use of DUCC is to *not* pass the
>> document contents which may, for example, be very large)
>>
>> 6b.  If it is a "name" or the like, is that something you can share so I
>> can try to reproduce here?
>>
>> Lou.
>>
>>
>> On Wed, Mar 26, 2014 at 9:20 AM, reshu.agarwal <reshu.agarwal@orkash.com>
>> wrote:
>>
>>  Hi Lou,
>>>
>>>
>>> On 03/26/2014 04:27 PM, Lou DeGenaro wrote:
>>>
>>>  Hi Reshu,
>>>>
>>>> The good news is that DUCC is functional since 1.job works.  So we need
>>>> to
>>>> find out why your particular job fails.
>>>>
>>>> A few more questions:
>>>>
>>>> 5. Does your job consist of multiple work items (CASes), and do any of
>>>> them
>>>> succeed?
>>>>
>>> My job consists of multiple work items, and I have also tried a job with
>>> a single document. Both types of job have succeeded many times, but the
>>> problem occurs on one particular document within a job. If I exclude
>>> that document, the job succeeds.
>>>
>>>
>>>   6. DUCC has Job Driver (JD) that employs your CollectionReader (CR) to
>>>
>>>> fetch CASes that are sent via a broker for processing by one of the
>>>> distributed Job Processes (JPs) that each run a copy of your
>>>> AnalysisEngine
>>>> (AE).  Normally, as Eddie points out, these CASes comprise some index
>>>> that's interpreted by the assigned JP to know which data is to be worked
>>>> on.  For example, say you have 100 documents, each 5GB in size named
>>>> doc.1,
>>>> doc.2,...doc.100.  Your CR should not pass the actual 5GB document, but
>>>> rather "doc.1".  Is that the kind of scheme you are employing?
>>>>
>>>>  Lou, I am fetching Batch data from Database and sending reference from
>>> the
>>> result set to Cas. I am not using File Processing.
>>>
>>>   7. Do you have a small test case that you can share that reliably
>>>
>>>> demonstrates the problem?
>>>>
>>>>  Test Case:
>>>
>>> I have two systems with in DUCC cluster with 20 GB RAM each.
>>> I have defined job with these configurations:
>>>
>>> classpath_order         ducc-before-user
>>> driver_descriptor_CR    ../collection_reader/DBCollectionReader.xml
>>> process_deployments_max         6
>>> process_descriptor_AE   ../aeAggregate
>>> process_descriptor_CC   ../cas_consumer/CASConsumer
>>> process_failures_limit  50
>>> process_memory_size     4
>>> process_per_item_time_max       3
>>> process_thread_count    3
>>> specification   22.job
>>> working_directory       ../ducc/Uima_ducc
>>>
>>>
>>>
>>> I am fetching data from the database in the CR. After the CR's getNext()
>>> method executes for the particular document, a warning message like this
>>> is printed in JD.log:
>>>
>>> Mar 26, 2014 9:40:25 AM org.apache.uima.adapter.jms.client.
>>> BaseUIMAAsynchronousEngineCommon_impl sendAndReceiveCAS
>>> WARNING:
>>>
>>> The document remains in the queue for 5 minutes, i.e. the queue
>>> waiting time.
>>>
>>> Then, if the batch size is 100, it shows lost=1; if it is 200, the
>>> document remains in the queue until I forcibly terminate the job.
>>>
>>>
>>>
>>>  Lou.
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Mar 26, 2014 at 5:31 AM, reshu.agarwal <reshu.agarwal@orkash.com>
>>>> wrote:
>>>>>
>>>>   On 03/20/2014 06:35 PM, Lou DeGenaro wrote:
>>>>
>>>>>   Where does the warning appear, in a log file in the job's log
>>>>>
>>>>>> directory?  Is there any other information related to that warning?
>>>>>>
>>>>>>   Hi Lou,
>>>>>>
>>>>> Answers of your questions are given below. Hope it will help:
>>>>>
>>>>>
>>>>> 1. Are you able to run a simple job, such as 1.job from the examples
>>>>> directory successfully?
>>>>>
>>>>> Yes, I am able to run that simple job successfully.
>>>>>
>>>>>
>>>>> 2. Where does the warning appear, in a log file in the job's log
>>>>> directory?  Is there any other information related to that warning?
>>>>>
>>>>> This warning appears in JD.log file.
>>>>>
>>>>> After all the initialization messages, these messages come:
>>>>>
>>>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.
>>>>> BaseUIMAAsynchronousEngine_impl setupConnection
>>>>> INFO: UIMA AS Client Created Shared Connection To Broker:
>>>>> tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.
>>>>> useCompression=true&
>>>>> closeAsync=false
>>>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.
>>>>> BaseUIMAAsynchronousEngine_impl initializeProducer
>>>>> INFO: Initializing JMS Message Producer. Broker:
>>>>> tcp://S1:61616?wireFormat.
>>>>> maxInactivityDuration=0&jms.useCompression=true&closeAsync=false
>>>>> Queue Name: ducc.jd.queue.1317
>>>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.
>>>>> BaseUIMAAsynchronousEngine_impl initializeConsumer
>>>>> INFO: Initializing JMS Message Consumer. Broker:
>>>>> tcp://S1:61616?wireFormat.
>>>>> maxInactivityDuration=0&jms.useCompression=true&closeAsync=false
>>>>> Queue Name: ID:S144-36678-1395807465286-7:1:1
>>>>> Mar 26, 2014 9:49:04 AM org.apache.uima.adapter.jms.client.
>>>>> BaseUIMAAsynchronousEngine_impl initialize
>>>>> INFO: Asynchronous Client Has Been Initialized. Serialization Strategy:
>>>>> [SerializationStrategy] Ready To Process.
>>>>>
>>>>> and then only this warning message comes:
>>>>>
>>>>> Mar 26, 2014 9:49:27 AM org.apache.uima.adapter.jms.client.
>>>>> BaseUIMAAsynchronousEngineCommon_impl sendAndReceiveCAS
>>>>> WARNING:
>>>>>
>>>>> then these messages come:
>>>>>
>>>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.
>>>>> BaseUIMAAsynchronousEngineCommon_impl stop
>>>>> INFO: Stopping Asynchronous Client.
>>>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.
>>>>> BaseUIMAAsynchronousEngineCommon_impl stop
>>>>> INFO: Asynchronous Client Has Stopped.
>>>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.
>>>>> BaseUIMAAsynchronousEngineCommon_impl$SharedConnection destroy
>>>>> INFO: UIMA AS Client Shared Connection Has Been Closed
>>>>> Mar 26, 2014 9:59:45 AM org.apache.uima.adapter.jms.client.
>>>>> BaseUIMAAsynchronousEngine_impl stop
>>>>>
>>>>>
>>>>>
>>>>> 3. Are there any exceptions in any of the logs in the job's log
>>>>> directory?
>>>>>
>>>>> Yes. After this warning appears, and once all documents from the DB
>>>>> collection reader other than this particular document have been
>>>>> processed successfully, this message shows up in one of the process
>>>>> log files:
>>>>>
>>>>> Mar 26, 2014 9:54:04 AM org.apache.uima.adapter.jms.
>>>>> activemq.JmsOutputChannel$ConnectionTimer startSessionReaperTimer.run
>>>>> INFO: Thread: 210 Component: CorefernceAggDescriptor Jms Session
>>>>> Inactivity Timeout: 5 Minutes on Broker: tcp://S1:61616?wireFormat.
>>>>> maxInactivityDuration=0&closeAsync=false
>>>>>
>>>>> I think this is due to that warning.
>>>>>
>>>>>
>>>>> 4. Does your job use a version of UIMA/UIMA-AS that is different than
>>>>> the
>>>>> one used by DUCC?
>>>>>
>>>>> I am using DUCC version 1.0.0 and UIMA version 2.4.2. I am not able to
>>>>> get
>>>>> DUCC UIMA version.
>>>>>
>>>>>
>>>>> --
>>>>> Thanks and Regards,
>>>>> Reshu Agarwal
>>>>> Software Engineer
>>>>> Orkash Services Pvt Ltd
>>>>>
>>>>>
>>>>>
>>>>>  Reshu.
>>>
>>>  Hi Lou,
>
> I am sending the reference of document like the code given below:
>
> String originalText =  v_result.getString("content").toString();
> //v_result is the object of ResultSet of Database
>
> JCas jcas;
> try {
>     jcas = aCAS.getJCas();
> } catch (CASException e) {
>     throw new CollectionException(e);
> }
>
> jcas.setDocumentText(originalText);
>
> --
> Thanks,
> Reshu Agarwal
>
>
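
For comparison with the snippet quoted above, which places the "content" column itself in the CAS, here is a minimal plain-Java sketch of the key-based scheme described earlier in the thread (pass "doc.1", not the 5GB document). It deliberately uses no UIMA or DUCC classes so it can run on its own. The names KeyVsValueSketch, FAKE_DB, workItemKey and resolve are hypothetical, and the in-memory map merely stands in for the database. In an actual job, the driver side would presumably live in the CollectionReader's getNext() (for example by calling jcas.setDocumentText(key)) and the worker side in the AnalysisEngine; that mapping is an assumption, not something confirmed in this thread.

```java
import java.util.HashMap;
import java.util.Map;

class KeyVsValueSketch {
    // Hypothetical stand-in for the database table holding document contents.
    static final Map<Integer, String> FAKE_DB = new HashMap<>();

    // Driver side: build the small payload that travels through the broker.
    // Only a short key is produced; the document content is never touched here.
    static String workItemKey(int rowId) {
        return "doc." + rowId;
    }

    // Worker side: turn the key back into the document content by
    // looking it up in the (stand-in) database.
    static String resolve(String key) {
        int rowId = Integer.parseInt(key.substring("doc.".length()));
        String content = FAKE_DB.get(rowId);
        if (content == null) {
            throw new IllegalArgumentException("No row for key " + key);
        }
        return content;
    }

    public static void main(String[] args) {
        FAKE_DB.put(1, "a very large document body ...");
        String key = workItemKey(1);
        System.out.println("sent through broker: " + key);
        System.out.println("resolved in worker: " + resolve(key).length() + " chars");
    }
}
```

With this split, the size of each work item sent through the broker is independent of the size of the document it refers to; each worker needs its own access to the database to resolve the key.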
