zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Patrick Hunt <ph...@apache.org>
Subject Re: flood of "outstanding proposal" messages
Date Thu, 07 Jul 2011 21:34:16 GMT
When you are stressing the system outstanding proposals can backup,
the entirety of which will be logged at TRACE level each time an ack
is processed. Alone this should not cause any problems other than the
large logs you highlighted in your original post.

Patrick

On Thu, Jul 7, 2011 at 2:22 PM, Jeremy Stribling <strib@nicira.com> wrote:
> Yes, thanks, we definitely will.  During this test we were trying to track
> down some other problem we were having, which is why the debugging level is
> turned up.  Are you saying that whatever underlying issue caused these
> hundreds of megabytes of messages would only happen in TRACE mode?
>
> On 07/07/2011 02:12 PM, Patrick Hunt wrote:
>>
>> Hi Jeremy, you're running with TRACE logging turned on which can be
>> very verbose, I'd suggest you run with INFO level in normal
>> circumstances.
>>
>> Patrick
>>
>> On Thu, Jul 7, 2011 at 10:40 AM, Jeremy Stribling<strib@nicira.com>
>>  wrote:
>>
>>>
>>> Has anyone ever seen Zookeeper explode with a flood of "outstanding
>>> proposal" messages?  I'm running 3.3.3* with three nodes (under fairly
>>> stressful client load), and all of the sudden one of the nodes' logs
>>> starts
>>> filling up with these messages:
>>>
>>> 2011-07-06 18:37:51,161 628684 [ProcessThread:-1] TRACE
>>> org.apache.zookeeper.server.PrepRequestProcessor  -
>>> :Psessionid:0x5d310234eaf10002 type:delete cxid:0x4e15574f
>>> zxid:0xfffffffffffffffe txntype:unknown
>>>
>>> reqpath:/zkrsm/cpt-0000000000000009-000000000000afc8/0000000000000009_record0000003293
>>> 2011-07-06 18:37:51,162 628685 [ProcessThread:-1] DEBUG
>>> org.apache.zookeeper.server.quorum.CommitProcessor  - Processing
>>> request::
>>> sessionid:0x5d310234eaf10002 type:delete cxid:0x4e15574f zxid:0x40000121c
>>> txntype:2 reqpath:n/a
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - Ack zxid: 0x4000011fb
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001204
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x40000120d
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001207
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001215
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x4000011fe
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001208
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001214
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001219
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001206
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001201
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001213
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x4000011ff
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x40000121a
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001200
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x40000121b
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001209
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001212
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x4000011fc
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001211
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001203
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x40000120a
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x4000011fd
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001210
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001218
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001202
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x40000120b
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001216
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x40000120f
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x400001217
>>> 2011-07-06 18:37:51,162 628685 [SyncThread:177] TRACE
>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>> 0x40000120c
>>>
>>> It goes on like that for hundreds of megabytes over a few minutes, until
>>> we
>>> killed the test.  As far as I can tell, there were no node churn events
>>> that
>>> sparked this.
>>>
>>> I can provide the full set of logs and/or file a JIRA, but I thought I'd
>>> do
>>> a quick check first to see if this sparked anyone's memory.
>>>
>>> Thanks,
>>>
>>> Jeremy
>>>
>>> * I have applied a few patches on top of 3.3.3, namely a homegrown,
>>> temporary patch for ZOOKEEPER-1046 and one for ZOOKEEPER-1060.
>>>
>>>
>>>
>

Mime
View raw message