zookeeper-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-3113) EphemeralType.get() fails to verify ephemeralOwner when currentElapsedTime() is small enough
Date Thu, 18 Oct 2018 16:13:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655504#comment-16655504 ]

Hudson commented on ZOOKEEPER-3113:

FAILURE: Integrated in Jenkins build ZooKeeper-trunk #238 (See [https://builds.apache.org/job/ZooKeeper-trunk/238/])
ZOOKEEPER-3113: EphemeralType.get() fails to verify ephemeralOwner when (andor: rev 5c85a236c9eb0805ea8389a52dab3b1bc0efadac)
* (edit) zookeeper-server/src/test/java/org/apache/zookeeper/server/EphemeralTypeTest.java
* (edit) zookeeper-server/src/test/java/org/apache/zookeeper/server/Emulate353TTLTest.java

> EphemeralType.get() fails to verify ephemeralOwner when currentElapsedTime() is small
> --------------------------------------------------------------------------------------------
>                 Key: ZOOKEEPER-3113
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3113
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.4, 3.6.0
>            Reporter: Andor Molnar
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.6.0, 3.5.5
>          Time Spent: 4h
>  Remaining Estimate: 0h
> The EphemeralTypeTest.testServerIds() unit test fails on some systems where System.nanoTime()
is smaller than a certain value.
> The test generates ephemeralOwner in the old way (pre ZOOKEEPER-2901) without enabling
the emulation flag and asserts that an exception is thrown when serverId == 255. This is right:
ZooKeeper should fail in this case, because serverId cannot be larger than 254 if extended
types are enabled. In this case an ephemeralOwner with 0xff in the most significant byte indicates
an extended type.
> The logic which does the validation is in EphemeralType.get().
> It checks 2 things:
>  * the extended type byte is set: 0xff,
>  * reserved bits (next 2 bytes) correspond to a valid extended type.
> Here is the problem: currently we only have 1 extended type: TTL with value of 0x0000
in the reserved bits.
> The logic expects that if the reserved bits contain anything other than this value, the ephemeralOwner
is invalid and an exception should be thrown. That's what the test asserts, and it works on
most systems because the timestamp part of the sessionId usually spills into the reserved
bits as well, making them non-zero, so the value is treated as an unsupported extended type.
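The two checks described above can be sketched roughly as follows. This is a simplified illustration built only from the description in this issue, not the actual EphemeralType source: the class name, the constant names and the string return values are assumptions made for the example.

```java
class EphemeralOwnerCheckSketch {
    // 0xff in the most significant byte marks an extended type (per the description).
    static final long EXTENDED_MASK = 0xff00000000000000L;
    // The next 2 bytes are the reserved bits that select which extended type it is.
    static final long RESERVED_BITS_MASK = 0x00ffff0000000000L;
    static final int RESERVED_BITS_SHIFT = 40;
    // The only extended type defined so far: TTL, encoded as 0x0000.
    static final long EXTENDED_BIT_TTL = 0x0000L;

    /** Returns "TTL" or "NORMAL" (hypothetical labels), or throws for an unknown extended type. */
    static String classify(long ephemeralOwner) {
        if ((ephemeralOwner & EXTENDED_MASK) == EXTENDED_MASK) {
            long reserved = (ephemeralOwner & RESERVED_BITS_MASK) >>> RESERVED_BITS_SHIFT;
            if (reserved == EXTENDED_BIT_TTL) {
                return "TTL";
            }
            throw new IllegalArgumentException(
                "Unsupported extended type: 0x" + Long.toHexString(reserved));
        }
        return "NORMAL";
    }

    public static void main(String[] args) {
        // Reserved bits are zero, so this value is accepted as a TTL owner.
        System.out.println(classify(0xff00000000001234L));
        // Top byte is not 0xff, so the reserved bits are never inspected.
        System.out.println(classify(0x0100010000000000L));
        try {
            // Reserved bits are non-zero: no such extended type, so this is rejected.
            classify(0xff00010000000000L);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that in this sketch a value is rejected only when the reserved bits happen to be non-zero; nothing else distinguishes a legacy session id with serverId 255 from a genuine TTL owner.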
> I think the problem is twofold:
>  * If we add more extended types, we'll increase the possibility that this logic
will accept invalid sessionIds (as long as the reserved bits indicate a valid extended type).
>  * If the currentElapsedTime (timestamp part of sessionId) is small enough that it doesn't
occupy the reserved bits (which happens on some systems), this logic will accept the invalid sessionId.
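The second case can be illustrated with a little bit arithmetic. The sketch below assumes a legacy id layout in which the server id occupies the top byte and the elapsed time in milliseconds is shifted into the 40 bits directly below it. That layout, the shift amounts and the helper names are assumptions made for illustration and should be checked against the session tracker code. Under that assumption the reserved bits stay zero until the elapsed time reaches 2^24 ms (roughly 4.7 hours).

```java
class SmallElapsedTimeSketch {
    // The 2 reserved bytes directly below the most significant byte.
    static final long RESERVED_BITS_MASK = 0x00ffff0000000000L;

    // Hypothetical pre-ZOOKEEPER-2901 style id: server id in the top byte,
    // elapsed milliseconds in the 40 bits below it, low 16 bits left for a counter.
    static long legacyOwner(long serverId, long elapsedMillis) {
        long id = (elapsedMillis << 24) >>> 8;
        return id | (serverId << 56);
    }

    static boolean reservedBitsAreZero(long owner) {
        return (owner & RESERVED_BITS_MASK) == 0;
    }

    public static void main(String[] args) {
        long recentlyStarted = 60_000L;                  // one minute of elapsed time
        long longRunning = 30L * 24 * 3600 * 1000;       // thirty days of elapsed time

        long small = legacyOwner(255, recentlyStarted);
        long large = legacyOwner(255, longRunning);

        // With a small timestamp the reserved bits are 0x0000, which is
        // indistinguishable from the TTL extended type, so no exception is raised.
        System.out.println(Long.toHexString(small) + " reserved bits zero: " + reservedBitsAreZero(small));
        // With a large timestamp the reserved bits are non-zero and the id is rejected.
        System.out.println(Long.toHexString(large) + " reserved bits zero: " + reservedBitsAreZero(large));
    }
}
```

If the layout assumption holds, this would explain why a machine whose System.nanoTime() origin is recent behaves differently from one with a large nanoTime() value, even with the same distro and JDK.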
> Unfortunately I cannot reproduce the problem yet: it happens consistently on a specific Jenkins
slave, but even with the same distro and same JDK version I cannot reproduce the same nanoTime()

This message was sent by Atlassian JIRA
