zookeeper-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alexander A. Strelets (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ZOOKEEPER-2891) SIGABRT from assert during fake completion on zookeeper_close.
Date Sun, 23 Jun 2019 17:28:00 GMT

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Alexander A. Strelets updated ZOOKEEPER-2891:
---------------------------------------------
    Description: 
When I call `zookeeper_close()` while there is a pending `multi` request, I expect the request
completes with `ZCLOSING (-116)` status.

But with the existing code I actually get the following:
- the program exits with `SIGABRT` from `assert(entry)` in `deserialize_multi()`
- and even if I remove this assertion and just break the enclosing loop, the returned status
is `ZOK` but not `ZCLOSING`

So, there are two defects with processing calls to `zookeeper_close()` for pending `multi`
requests: improper assertion in implementation and invalid status in confirmation.

**I propose two changes in the code:**

1. Screen `assert(entry)` in `deserialize_multi()` when it's called due to `zookeeper_close()`
(note that `entry` may normally be equal to `NULL` in this case), and as soon as `entry ==
NULL`, break the loop.
To provide this, `deserialize_multi()` must be informed by the caller weather it's currently
the 'normal' or the 'special' case.
So, I've added a new parameter `rc_hint` (return code hint) to `deserialize_multi()`.
When `deserialize_multi()` is called in 'normal' case, `rc_hint` is preset with `ZOK (0)`,
and the behavior is absolutely the same as with the existing code.
But when it's called due to `zookeeper_close()`, the `rc_hint` is automatically preset with
`ZCLOSING (-116)`, and this changes the behavior of `deserialize_multi()`.

How it works:
Let `zookeeper_close()` is called while there is a pending `multi` request.
Then function `deserialize_multi()` is called for the so-called "Fake response" on `multi`
request which is fabricated by the function `free_completions()`.
Such fake response includes only the header but zero bytes for the body.
Due to this `deserialize_MultiHeader(ia, "multiheader", &mhdr)`, which is called repeatedly
for each `completion_list_t *entry = dequeue_completion(clist)`, does not assign the `mhdr`
and keeps `mhdr.done == 0` as it was originally initialized.
Consequently the `while (!mhdr.done)` loop does not ever end, and finally falls into the `assert(entry)`
with `entry == NULL` when all sub-requests are "completed".
But if the caller made a hint to `deserialize_multi()` that it's actually the 'special' case
(that it processes the fake response indeed, for example), with the proposed changes it would
omit improper assertion and break the loop on the first `entry == NULL`.
Now at least `deserialize_multi()` exits and does not emit SIGABRT.

2. Passthrough the 'return code hint', as specified by the caller, to the `deserialize_multi()`
return code, if the hint is not `ZOK (0)`.

How it works:
With the existing code `deserialize_multi()` returns unsuccessful `rc`-code only if there
is an error in processing of some of subrequests.
And if there are no errors, it returns `ZOK (0)` which is assigned as the default value to
`rc` at the very beginning of the function.
Indeed, in the case of fake multi-response there are no errors in subresponses. So, `deserialize_multi()`
returns `ZOK (0)`.
With `rc = deserialize_multi(xid, cptr, ia)` in `deserialize_response()` it overrides the
true `ZCLOSING` status.
But if the true status (for example, `ZCLOSING`) is initially hinted to `deserialize_multi()`,
as I propose, it would reproduce it back instead of irrelevant `ZOK (0)`.





ZooKeeper C Client *+single thread+* build

Function *_deserialize_multi()_* hits *_assert(entry)_* when called for the so called "Fake
response" which is fabricated by the function _free_completions()_ for example when _zookeeper_close()_
is called while there is a pending _multi_ request.

Such fake response includes only the header but zero bytes for the body. Due to this {{deserialize_MultiHeader(ia,
"multiheader", &mhdr)}}, which is called repeatedly for each {{completion_list_t *entry
= dequeue_completion(clist)}}, does not assign the _mhdr_ and keeps _mhdr.done == 0_ as it
was originally initialized. Consequently the _while (!mhdr.done)_ does not ever end, and finally
falls into the _assert(entry)_ with _entry == NULL_ when all sub-requests are "completed".
~// Normally on my platform assert raises SIGABRT.~

I propose to instruct the _deserialize_multi()_ function to break the loop on _entry == NULL_
if it was called for an unsuccessfull overal status of the multi response, and in particular
for the fake response having _ZCLOSING_ (-116) status. I have introduced the _rc0_ parameter
for this.


*Another issue* with this function is that even if the while-loop exited properly, this function
returns _rc == 0_, and this return code +overrides+ the true status value with {{rc = deserialize_multi(xid,
cptr, ia, rc)}} in the _deserialize_response()_ function. So, the _multi_ response callback
+handler would be called with _rc == ZOK_ instead of _rc == ZCLOSING_+ which is strictly wrong.

To fix this I propose initializing _rc_ with the introduced _rc0_ instead of zero (which is
_ZOK_ indeed).

This is a proposed fix: https://github.com/apache/zookeeper/pull/360

[upd]
It looks like about the same problem is described in ZOOKEEPER-1636
However, the patch proposed in this ticket also remedies the second linked problem: reporting
ZCLOSING status (as required) to the multi-request completion handler.

  was:
ZooKeeper C Client *+single thread+* build

Function *_deserialize_multi()_* hits *_assert(entry)_* when called for the so called "Fake
response" which is fabricated by the function _free_completions()_ for example when _zookeeper_close()_
is called while there is a pending _multi_ request.

Such fake response includes only the header but zero bytes for the body. Due to this {{deserialize_MultiHeader(ia,
"multiheader", &mhdr)}}, which is called repeatedly for each {{completion_list_t *entry
= dequeue_completion(clist)}}, does not assign the _mhdr_ and keeps _mhdr.done == 0_ as it
was originally initialized. Consequently the _while (!mhdr.done)_ does not ever end, and finally
falls into the _assert(entry)_ with _entry == NULL_ when all sub-requests are "completed".
~// Normally on my platform assert raises SIGABRT.~

I propose to instruct the _deserialize_multi()_ function to break the loop on _entry == NULL_
if it was called for an unsuccessfull overal status of the multi response, and in particular
for the fake response having _ZCLOSING_ (-116) status. I have introduced the _rc0_ parameter
for this.


*Another issue* with this function is that even if the while-loop exited properly, this function
returns _rc == 0_, and this return code +overrides+ the true status value with {{rc = deserialize_multi(xid,
cptr, ia, rc)}} in the _deserialize_response()_ function. So, the _multi_ response callback
+handler would be called with _rc == ZOK_ instead of _rc == ZCLOSING_+ which is strictly wrong.

To fix this I propose initializing _rc_ with the introduced _rc0_ instead of zero (which is
_ZOK_ indeed).

This is a proposed fix: https://github.com/apache/zookeeper/pull/360

[upd]
It looks like about the same problem is described in ZOOKEEPER-1636
However, the patch proposed in this ticket also remedies the second linked problem: reporting
ZCLOSING status (as required) to the multi-request completion handler.


> SIGABRT from assert during fake completion on zookeeper_close.
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2891
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2891
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.4.10
>         Environment: Linux ubuntu 4.4.0-87-generic
> gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
> https://github.com/apache/zookeeper.git
> branch-3.4
>            Reporter: Alexander A. Strelets
>            Assignee: Alexander A. Strelets
>            Priority: Critical
>              Labels: easyfix, pull-request-available
>             Fix For: 3.4.10
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> When I call `zookeeper_close()` while there is a pending `multi` request, I expect the
request completes with `ZCLOSING (-116)` status.
> But with the existing code I actually get the following:
> - the program exits with `SIGABRT` from `assert(entry)` in `deserialize_multi()`
> - and even if I remove this assertion and just break the enclosing loop, the returned
status is `ZOK` but not `ZCLOSING`
> So, there are two defects with processing calls to `zookeeper_close()` for pending `multi`
requests: improper assertion in implementation and invalid status in confirmation.
> **I propose two changes in the code:**
> 1. Screen `assert(entry)` in `deserialize_multi()` when it's called due to `zookeeper_close()`
(note that `entry` may normally be equal to `NULL` in this case), and as soon as `entry ==
NULL`, break the loop.
> To provide this, `deserialize_multi()` must be informed by the caller weather it's currently
the 'normal' or the 'special' case.
> So, I've added a new parameter `rc_hint` (return code hint) to `deserialize_multi()`.
> When `deserialize_multi()` is called in 'normal' case, `rc_hint` is preset with `ZOK
(0)`, and the behavior is absolutely the same as with the existing code.
> But when it's called due to `zookeeper_close()`, the `rc_hint` is automatically preset
with `ZCLOSING (-116)`, and this changes the behavior of `deserialize_multi()`.
> How it works:
> Let `zookeeper_close()` is called while there is a pending `multi` request.
> Then function `deserialize_multi()` is called for the so-called "Fake response" on `multi`
request which is fabricated by the function `free_completions()`.
> Such fake response includes only the header but zero bytes for the body.
> Due to this `deserialize_MultiHeader(ia, "multiheader", &mhdr)`, which is called
repeatedly for each `completion_list_t *entry = dequeue_completion(clist)`, does not assign
the `mhdr` and keeps `mhdr.done == 0` as it was originally initialized.
> Consequently the `while (!mhdr.done)` loop does not ever end, and finally falls into
the `assert(entry)` with `entry == NULL` when all sub-requests are "completed".
> But if the caller made a hint to `deserialize_multi()` that it's actually the 'special'
case (that it processes the fake response indeed, for example), with the proposed changes
it would omit improper assertion and break the loop on the first `entry == NULL`.
> Now at least `deserialize_multi()` exits and does not emit SIGABRT.
> 2. Passthrough the 'return code hint', as specified by the caller, to the `deserialize_multi()`
return code, if the hint is not `ZOK (0)`.
> How it works:
> With the existing code `deserialize_multi()` returns unsuccessful `rc`-code only if there
is an error in processing of some of subrequests.
> And if there are no errors, it returns `ZOK (0)` which is assigned as the default value
to `rc` at the very beginning of the function.
> Indeed, in the case of fake multi-response there are no errors in subresponses. So, `deserialize_multi()`
returns `ZOK (0)`.
> With `rc = deserialize_multi(xid, cptr, ia)` in `deserialize_response()` it overrides
the true `ZCLOSING` status.
> But if the true status (for example, `ZCLOSING`) is initially hinted to `deserialize_multi()`,
as I propose, it would reproduce it back instead of irrelevant `ZOK (0)`.
> ZooKeeper C Client *+single thread+* build
> Function *_deserialize_multi()_* hits *_assert(entry)_* when called for the so called
"Fake response" which is fabricated by the function _free_completions()_ for example when
_zookeeper_close()_ is called while there is a pending _multi_ request.
> Such fake response includes only the header but zero bytes for the body. Due to this
{{deserialize_MultiHeader(ia, "multiheader", &mhdr)}}, which is called repeatedly for
each {{completion_list_t *entry = dequeue_completion(clist)}}, does not assign the _mhdr_
and keeps _mhdr.done == 0_ as it was originally initialized. Consequently the _while (!mhdr.done)_
does not ever end, and finally falls into the _assert(entry)_ with _entry == NULL_ when all
sub-requests are "completed". ~// Normally on my platform assert raises SIGABRT.~
> I propose to instruct the _deserialize_multi()_ function to break the loop on _entry
== NULL_ if it was called for an unsuccessfull overal status of the multi response, and in
particular for the fake response having _ZCLOSING_ (-116) status. I have introduced the _rc0_
parameter for this.
> *Another issue* with this function is that even if the while-loop exited properly, this
function returns _rc == 0_, and this return code +overrides+ the true status value with {{rc
= deserialize_multi(xid, cptr, ia, rc)}} in the _deserialize_response()_ function. So, the
_multi_ response callback +handler would be called with _rc == ZOK_ instead of _rc == ZCLOSING_+
which is strictly wrong.
> To fix this I propose initializing _rc_ with the introduced _rc0_ instead of zero (which
is _ZOK_ indeed).
> This is a proposed fix: https://github.com/apache/zookeeper/pull/360
> [upd]
> It looks like about the same problem is described in ZOOKEEPER-1636
> However, the patch proposed in this ticket also remedies the second linked problem: reporting
ZCLOSING status (as required) to the multi-request completion handler.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message