accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-2140) Race conditions between client operations and upgrade
Date Fri, 31 Jan 2014 19:24:11 GMT


Christopher Tubbs commented on ACCUMULO-2140:

I did see a new trace table get created on upgrade, because the existing one wasn't in the
default namespace and was therefore not visible to the client. I have since fixed the bug in
the code that wasn't putting it in the correct namespace.

This was certainly caused by one kind of race condition: while waiting for the upgrade to
occur, the tracer client had a different view of ZooKeeper, concluded that the table didn't
exist, and sent a request to create it.

However, I now realize that the table was not re-created *during* the upgrade, but after it,
because the RPC probably waited for the master's client service to become available. It's
not clear to me now why the master would've allowed this request to complete, but I don't
think it's possible anymore, as I haven't seen it since.

> Race conditions between client operations and upgrade
> -----------------------------------------------------
>                 Key: ACCUMULO-2140
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>            Reporter: Christopher Tubbs
>            Assignee: Christopher Tubbs
>            Priority: Blocker
>             Fix For: 1.6.0
>
> While the master is upgrading, it also has a thread that is responding to client requests.
> Since the upgrade renames tables and puts them in namespaces, there is a short period of time
> where table existence checks that rely on the new ZooKeeper schema for tables fail to
> provide the correct answer.
>
> Example: when the tracer starts, it tries to create a "trace" table if it doesn't exist.
> The existence check returns false, so it creates a new trace table in the default namespace,
> even though there exists an old one that has not yet been moved into the default namespace
> during the upgrade. This results in two tables with the same name.
>
> An easy solution would be to not respond to client requests until after the upgrade
> is complete (e.g., wait to start up the MasterClientServiceHandler thread).
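
The "easy solution" in the description above could be sketched roughly as follows. This is
only an illustration of gating client requests on upgrade completion, not Accumulo's actual
code: the class UpgradeGate, its methods, and the in-memory table set are all hypothetical
stand-ins for the master, the upgrade thread, and the table metadata in ZooKeeper.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the master; none of these names come from Accumulo.
class UpgradeGate {
    // Opens exactly once, when the upgrade has finished.
    private final CountDownLatch upgradeDone = new CountDownLatch(1);
    // Hypothetical in-memory substitute for the table names stored in ZooKeeper.
    private final Set<String> tables = ConcurrentHashMap.newKeySet();

    // Called by the (hypothetical) upgrade thread after the old tables
    // have been moved into their namespaces.
    void finishUpgrade(String... migratedTables) {
        for (String t : migratedTables) {
            tables.add(t);
        }
        upgradeDone.countDown();
    }

    // Client-facing operation: blocks until the upgrade is complete, so the
    // existence check always sees the post-upgrade view of the tables.
    // Returns true only if a new table was actually created.
    boolean createTableIfAbsent(String name) {
        try {
            upgradeDone.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for upgrade", e);
        }
        // Set.add is an atomic check-and-insert: false means the table already existed.
        return tables.add(name);
    }
}
```

With this gate, a tracer that calls createTableIfAbsent("trace") before the upgrade finishes
simply waits, and once finishUpgrade("trace") has run the call returns false instead of
creating a second "trace" table.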
