uima-user mailing list archives

From Steve Suppe <ssu...@llnl.gov>
Subject Re: Server Socket Timeout Woes
Date Wed, 23 Apr 2008 16:59:16 GMT
Marshall,

Thanks for the response.  We're running 2.2.1.  I had this nagging 
suspicion that maybe I was treating the symptoms, not the problem, but at 
this point I'm out of ideas :)  My next attempt will be to bring the 
threads and pool size down to something more manageable.  This is supposed 
to be 'rev2' of our design, hardware-wise.  I'll go back to my original CPE 
settings (lower count, etc) from rev1, and hope for the best.

Thanks again, I will keep you posted.
Steve

At 05:00 PM 4/22/2008, you wrote:
>Hi Steve -
>
>I'm no expert in these matters, but I wonder if changing the timeouts is 
>the right approach.  Have you isolated the problem to something wrong with 
>the timeouts?  Could it be something else (some rare race condition 
>causing a hang at some point, for instance)?
>What level of UIMA are you running?
>
>-Marshall
>
>Steve Suppe wrote:
>>Hi all,
>>
>>Thanks so much for this list - I'm constantly lurking and learning things :)
>>
>>I'm having trouble with our distributed cluster - our setup is as follows:
>>
>>We have a 'reader' node reading from the local FS, and 15 'worker' nodes, 
>>each running identical aggregates of analysis engines and consumers that 
>>connect to an Oracle DB for final storage of the results.  On each worker 
>>I have multiple instances running, typically 32, so I have 15x32 
>>connections to Oracle.  I have about 20,000,000 documents to process.
>>
>>After a certain amount of time, I start to get Broken Pipe server socket 
>>exceptions such as the following:
>>
>>4/21/08 5:40:24 PM - 11: org.apache.uima.adapter.vinci.CASTransportable.toStream(288): WARNING: Broken pipe
>>java.net.SocketException: Broken pipe
>>         at java.net.SocketOutputStream.socketWrite0(Native Method)
>>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>>         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:78)
>>         at org.apache.vinci.transport.XTalkTransporter.writeInt(XTalkTransporter.java:508)
>>         at org.apache.vinci.transport.XTalkTransporter.stringToBin(XTalkTransporter.java:446)
>>         at org.apache.uima.adapter.vinci.CASTransportable$XTalkSerializer.startElement(CASTransportable.java:219)
>>         at org.apache.uima.cas.impl.XCASSerializer$XCASDocSerializer.startElement(XCASSerializer.java:327)
>>         at org.apache.uima.cas.impl.XCASSerializer$XCASDocSerializer.encodeFS(XCASSerializer.java:466)
>>         at org.apache.uima.cas.impl.XCASSerializer$XCASDocSerializer.encodeIndexed(XCASSerializer.java:347)
>>         at org.apache.uima.cas.impl.XCASSerializer$XCASDocSerializer.serialize(XCASSerializer.java:271)
>>         at org.apache.uima.cas.impl.XCASSerializer$XCASDocSerializer.access$600(XCASSerializer.java:62)
>>         at org.apache.uima.cas.impl.XCASSerializer.serialize(XCASSerializer.java:919)
>>         at org.apache.uima.adapter.vinci.CASTransportable.toStream(CASTransportable.java:279)
>>         at org.apache.vinci.transport.BaseServerRunnable.run(BaseServerRunnable.java:90)
>>         at org.apache.vinci.transport.BaseServer$PooledThread.run(BaseServer.java:101)
>>
>>and
>>
>>org.apache.uima.collection.impl.base_cpm.container.ServiceConnectionException: The service did not complete a call within the specified time. (Thread Name: [Procesing Pipeline#172 Thread]::) Host: 192.168.3.52 Port: 11000 Exceeded Timeout Value: 600000
>>         at org.apache.uima.collection.impl.cpm.container.deployer.VinciTAP.sendAndReceive(VinciTAP.java:533)
>>         at org.apache.uima.collection.impl.cpm.container.deployer.VinciTAP.analyze(VinciTAP.java:927)
>>         at org.apache.uima.collection.impl.cpm.container.NetworkCasProcessorImpl.process(NetworkCasProcessorImpl.java:198)
>>         at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:1071)
>>         at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:668)
>>org.apache.uima.resource.ResourceProcessException
>>         at org.apache.uima.collection.impl.cpm.container.NetworkCasProcessorImpl.process(NetworkCasProcessorImpl.java:200)
>>         at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:1071)
>>         at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:668)
>>
>>I've found that if I set the <timeout> for each casProcessor (and also 
>>my maxConsecutiveRestarts) too low, I only get about 70,000 documents in 
>>before the whole thing goes sour.  If I raise everything to something 
>>obscenely high (say, 1,000,000,000 ms), then I get about 400,000 in.
>>Then the whole system freezes, and nothing gets into Oracle.  I keep my 
>>Vinci descriptor for the VNS server at unlimited, and the 
>>serverSocketTimeout for the Vinci descriptor for the CPE obscenely high 
>>as well.
>>
>>I don't know if I'm adequately explaining my problem, but I'm trying to 
>>figure out the best way to set my timeouts on the CPE and the Vinci 
>>descriptors as well.
>>
>>My next attempt is to keep the timeouts from the CPE side very high, the 
>>Vinci VNS descriptor unlimited, and the serverSocketTimeout at 30000ms.
>>
>>I guess, overall, I would like to give ample time to let an AE work, but 
>>not so long it never returns.  This includes the fact that since I have 
>>32x15 processingUnitThreadCounts, I need the timeout to be large enough 
>>at initialization.
>>
>>Sorry for the rambling.  Does anyone have any general 
>>guidelines/experiences for this kind of setup?
>>
>>Thanks!
>>
>>Steve
>>
>>
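
The goal stated in the original message, giving an AE ample time to work but not so long that a call never returns, is the usual bounded-wait pattern. The sketch below shows that pattern in plain Java with java.util.concurrent. It is not the UIMA CPE or Vinci API, and the names BoundedCall and callWithTimeout are hypothetical, chosen only for illustration. The 1000 ms and 100 ms limits are arbitrary demo values, not recommended settings.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of UIMA: illustrates a bounded wait on a worker call.
public class BoundedCall {

    // Submits the task to the pool and waits at most timeoutMs for its result.
    // On timeout the task is cancelled (its thread is interrupted) and the
    // fallback value is returned, so the caller never blocks indefinitely.
    public static <T> T callWithTimeout(ExecutorService pool, Callable<T> task,
                                        long timeoutMs, T fallback) throws Exception {
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) throws Exception {
        // A small fixed pool stands in for a bounded number of processing threads.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Finishes well inside its limit.
            String fast = callWithTimeout(pool, () -> "done", 1000L, "timed out");
            // Sleeps far longer than its limit, so the fallback is returned.
            String slow = callWithTimeout(pool, () -> {
                Thread.sleep(5000);
                return "done";
            }, 100L, "timed out");
            System.out.println(fast + " / " + slow);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A limit like this only bounds how long the caller waits. If the worker itself is hung, for example on a race condition as Marshall suggests, raising the limit delays the failure rather than fixing it.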