hadoop-zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jeremy Stribling <st...@nicira.com>
Subject Re: Getting a "node exists" code on a sequence create
Date Mon, 01 Nov 2010 21:56:08 GMT
Yes, every znode in /zkrsm was created with the sequence flag.

We bring up a cluster of three nodes, though we do it in a slightly odd 
manner to support dynamism: each node starts up as a single-node 
instance knowing only itself, and then each node is contacted by a 
coordinator that kills the ZooKeeperServer object and starts a new 
QuorumPeer object using the full list of three servers.  I know this is 
weird; perhaps this has something to do with it.

Other than the weird setup behavior, we are just writing a few 
sequential records into the system (which all seems to work fine), 
killing one of the nodes (one that has been elected leader via the 
standard recommended ZK leader election algorithm), restarting it, and 
then trying to create more sequential znodes.  I'm guessing this is 
pretty well-tested behavior, so there must be something weird or wrong 
about the way I have stuff setup.

I'm happy to provide whatever logs or snapshots might help someone track 
this down.  Thanks,

Jeremy

On 11/01/2010 02:42 PM, Benjamin Reed wrote:
> how were you able to reproduce it?
>
> all the znodes in /zkrsm were created with the sequence flag. right?
>
> ben
>
> On 11/01/2010 02:28 PM, Jeremy Stribling wrote:
>> We were able to reproduce it.  A "stat" on all three servers looks
>> identical:
>>
>> [zk:<ip:port>(CONNECTED) 0] stat /zkrsm
>> cZxid = 9
>> ctime = Mon Nov 01 13:01:57 PDT 2010
>> mZxid = 9
>> mtime = Mon Nov 01 13:01:57 PDT 2010
>> pZxid = 12884902218
>> cversion = 177
>> dataVersion = 0
>> aclVersion = 0
>> ephemeralOwner = 0
>> dataLength = 0
>> numChildren = 177
>>
>> Creating a sequential node through the command line also fails:
>>
>> [zk:<ip:port>(CONNECTED) 1] create -s /zkrsm/0000000000000000_record
>> testdata
>> Node already exists: /zkrsm/0000000000000000_record
>>
>> One potentially interesting thing is that numChildren above is 177,
>> though I have sequence numbers on that record prefix up to 214 or so.
>> There seem to be some gaps though -- I thin "ls /zkrsm" only shows about
>> 177.  Not sure if that's relevant or not.
>>
>> Thanks,
>>
>> Jeremy
>>
>> On 11/01/2010 12:06 PM, Jeremy Stribling wrote:
>>> Thanks for the reply.  It happened every time we called create, not
>>> just once.  More than that, we tried restarting each of the nodes in
>>> the system (one-by-one), including the new master, and the problem
>>> continued.
>>>
>>> Unfortunately we cleaned everything up, and it's not in that state
>>> anymore.  We haven't yet tried to reproduce, but I will try and report
>>> back if I can get any "cversion" info.
>>>
>>> Jeremy
>>>
>>> On 11/01/2010 11:33 AM, Patrick Hunt wrote:
>>>> Hi Jeremy, this sounds like a bug to me, I don't think you should be
>>>> getting the nodeexists when the sequence flag is set.
>>>>
>>>> Looking at the code briefly we use the parent's "cversion"
>>>> (incremented each time the child list is changed, added/removed).
>>>>
>>>> Did you see this error each time you called create, or just once? If
>>>> you look at the cversion in the Stat of the znode "/zkrsm" on each of
>>>> the servers what does it show? You can use the java CLI to connect to
>>>> each of your servers and access this information. It would be
>>>> interesting to see if the data was out of sync only for a short period
>>>> of time, or forever. Is this repeatable?
>>>>
>>>> Ben/Flavio do you see anything here?
>>>>
>>>> Patrick
>>>>
>>>> On Thu, Oct 28, 2010 at 6:06 PM, Jeremy Stribling<strib@nicira.com>
>>>> wrote:
>>>>> HI everyone,
>>>>>
>>>>> Is there any situation in which creating a new ZK node with the
>>>>> SEQUENCE
>>>>> flag should result in a "node exists" error?  I'm seeing this 
>>>>> happening
>>>>> after a failure of a ZK node that appeared to have been the master;
>>>>> when the
>>>>> new master takes over, my app is unable to create a new SEQUENCE
>>>>> node under
>>>>> an existing parent node.  I'm using Zookeeper 3.2.2.
>>>>>
>>>>> Here's a representative log snippet:
>>>>>
>>>>> --------------------------
>>>>> 3050756 [ProcessThread:-1] TRACE
>>>>> org.apache.zookeeper.server.PrepRequestProcessor  -
>>>>> :Psessionid:0x12bf518350f0001 type:create cxid:0x4cca0691
>>>>> zxid:0xfffffffffffffffe txntype:unknown 
>>>>> /zkrsm/0000000000000000_record
>>>>> 3050756 [ProcessThread:-1] WARN
>>>>> org.apache.zookeeper.server.PrepRequestProcessor  - Got exception 
>>>>> when
>>>>> processing sessionid:0x12bf518350f0001 type:create cxid:0x4cca0691
>>>>> zxid:0xfffffffffffffffe txntype:unknown n/a
>>>>> org.apache.zookeeper.KeeperException$NodeExistsException:
>>>>> KeeperErrorCode =
>>>>> NodeExists
>>>>>          at
>>>>> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:245)

>>>>>
>>>>>
>>>>>          at
>>>>> org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)

>>>>>
>>>>>
>>>>> 3050756 [ProcessThread:-1] DEBUG
>>>>> org.apache.zookeeper.server.quorum.CommitProcessor  - Processing
>>>>> request::
>>>>> sessionid:0x12bf518350f0001 type:create cxid:0x4cca0691
>>>>> zxid:0x50000027e
>>>>> txntype:-1 n/a
>>>>> 3050756 [ProcessThread:-1] DEBUG
>>>>> org.apache.zookeeper.server.quorum.Leader
>>>>>    - Proposing:: sessionid:0x12bf518350f0001 type:create 
>>>>> cxid:0x4cca0691
>>>>> zxid:0x50000027e txntype:-1 n/a
>>>>> 3050756 [SyncThread:0] TRACE
>>>>> org.apache.zookeeper.server.quorum.Leader  -
>>>>> Ack zxid: 0x50000027e
>>>>> 3050757 [SyncThread:0] TRACE
>>>>> org.apache.zookeeper.server.quorum.Leader  -
>>>>> outstanding proposal: 0x50000027e
>>>>> 3050757 [SyncThread:0] TRACE
>>>>> org.apache.zookeeper.server.quorum.Leader  -
>>>>> outstanding proposals all
>>>>> 3050757 [SyncThread:0] DEBUG
>>>>> org.apache.zookeeper.server.quorum.Leader  -
>>>>> Count for zxid: 0x50000027e is 1
>>>>> 3050757 [FollowerHandler-/172.16.0.28:48776] TRACE
>>>>> org.apache.zookeeper.server.quorum.Leader  - Ack zxid: 0x50000027e
>>>>> 3050757 [FollowerHandler-/172.16.0.28:48776] TRACE
>>>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposal:
>>>>> 0x50000027e
>>>>> 3050757 [FollowerHandler-/172.16.0.28:48776] TRACE
>>>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposals 
>>>>> all
>>>>> 3050757 [FollowerHandler-/172.16.0.28:48776] DEBUG
>>>>> org.apache.zookeeper.server.quorum.Leader  - Count for zxid:
>>>>> 0x50000027e is
>>>>> 2
>>>>> 3050757 [FollowerHandler-/172.16.0.28:48776] DEBUG
>>>>> org.apache.zookeeper.server.quorum.CommitProcessor  - Committing
>>>>> request::
>>>>> sessionid:0x12bf518350f0001 type:create cxid:0x4cca0691
>>>>> zxid:0x50000027e
>>>>> txntype:-1 n/a
>>>>> 3050757 [CommitProcessor:0] DEBUG
>>>>> org.apache.zookeeper.server.FinalRequestProcessor  - Processing
>>>>> request::
>>>>> sessionid:0x12bf518350f0001 type:create cxid:0x4cca0691
>>>>> zxid:0x50000027e
>>>>> txntype:-1 n/a
>>>>> 3050757 [CommitProcessor:0] TRACE
>>>>> org.apache.zookeeper.server.FinalRequestProcessor  -
>>>>> :Esessionid:0x12bf518350f0001 type:create cxid:0x4cca0691
>>>>> zxid:0x50000027e
>>>>> txntype:-1 n/a
>>>>> 3050757 [FollowerHandler-/172.16.0.28:41062] TRACE
>>>>> org.apache.zookeeper.server.quorum.Leader  - Ack zxid: 0x50000027e
>>>>> 3050757 [FollowerHandler-/172.16.0.28:41062] TRACE
>>>>> org.apache.zookeeper.server.quorum.Leader  - outstanding proposals 
>>>>> all
>>>>> 3050757 [FollowerHandler-/172.16.0.28:41062] DEBUG
>>>>> org.apache.zookeeper.server.quorum.Leader  - outstanding is 0
>>>>> --------------------------
>>>>>
>>>>> I'm still a n00b at understanding ZK log messages, so maybe there's
>>>>> something obvious going on.  I looked in the JIRA and did my best to
>>>>> search
>>>>> the mailing list archives, but couldn't find anything related to
>>>>> this.  Any
>>>>> ideas?  Thanks very much,
>>>>>
>>>>> Jeremy
>>>>>
>

Mime
View raw message