accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <>
Subject [jira] [Reopened] (ACCUMULO-4533) TraceServer should not abort if trace table exists
Date Thu, 15 Dec 2016 17:58:58 GMT


Christopher Tubbs reopened ACCUMULO-4533:

Re-opening because the change seems to cause the tracer service to die regularly on a normal

This appears to be because the tracer service starts faster than the master, and that prevents
the tracer service from creating a trace table until the master is fully started.

Once the master is up, calling again to start the tracer works fine.

We should be able to tell whether the failure is temporary, in which case we should retry
(for example, when the master is not yet running), or whether it is permanent and there's no
chance of success, as when the tracer credentials are insufficient to create a table.
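
A minimal sketch of that distinction, assuming failures can be classified up front. The
class name TraceTableRetry, the exception types TransientFailure and PermanentFailure, and
the retryTransient helper are all hypothetical names made up for illustration; none of them
exist in Accumulo, and this is not the fix that was applied.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch: retry only failures that may clear up on their own.
class TraceTableRetry {

  /** Hypothetical: a failure that may go away, e.g. the master is not running yet. */
  static class TransientFailure extends Exception {
    TransientFailure(String message) {
      super(message);
    }
  }

  /** Hypothetical: a failure that retrying cannot fix, e.g. insufficient credentials. */
  static class PermanentFailure extends Exception {
    PermanentFailure(String message) {
      super(message);
    }
  }

  /**
   * Runs the action, retrying only on TransientFailure, up to maxAttempts tries.
   * Any other exception, including PermanentFailure, propagates immediately.
   */
  static <T> T retryTransient(Callable<T> action, int maxAttempts, long sleepMillis)
      throws Exception {
    for (int attempt = 1;; attempt++) {
      try {
        return action.call();
      } catch (TransientFailure e) {
        if (attempt >= maxAttempts) {
          throw e; // out of attempts, surface the last transient failure
        }
        Thread.sleep(sleepMillis); // fixed pause before the next attempt
      }
    }
  }
}
```

With something like this, the table-creation step would be wrapped in a Callable that maps
"master not reachable" conditions to TransientFailure and permission problems to
PermanentFailure. How to detect each case from the real client exceptions is left open here.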

> TraceServer should not abort if trace table exists
> --------------------------------------------------
>                 Key: ACCUMULO-4533
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: trace
>    Affects Versions: 1.7.1, 1.7.2, 1.8.0
>         Environment: impacts 1.7.0-1.7.2, 1.8.0
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>             Fix For: 1.7.3, 1.8.1, 2.0.0
>         Attachments: ACCUMULO-4533-1.7.v1.patch, ACCUMULO-4533-1.7.v2.patch
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
> h3. description
> On start up, the TraceServer attempts to ensure that the trace table exists.
> {code}
>         connector = serverConfiguration.getInstance().getConnector(principal, at);
>         if (!connector.tableOperations().exists(tableName)) {
>           connector.tableOperations().create(tableName);
>           IteratorSetting setting = new IteratorSetting(10, "ageoff", AgeOffFilter.class.getName());
>           AgeOffFilter.setTTL(setting, 7 * 24 * 60 * 60 * 1000L);
>           connector.tableOperations().attachIterator(tableName, setting);
>         }
> {code}
> The race condition between checking existence and creating the table ought not matter,
> since we're in a big loop that is supposed to retry on any problems.
> However, that loop expressly catches {{RuntimeException}} and {{TableExistsException}}
> is not a {{RuntimeException}} so currently the exception propagates and kills the server.
> h3. workaround
> restart any failed trace servers, since the one that won the race condition should have
> finished set up properly.
> alternatively, manually create the trace table prior to starting any trace servers.
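
The check-then-create race in the quoted description can be tolerated by treating "table
already exists" as success rather than as an error. The sketch below is self-contained and
uses stand-ins: its TableExistsException and TableOps types only mimic the checked exception
and the exists/create calls shown in the snippet above, and EnsureTraceTable, ensureTable and
InMemoryTableOps are hypothetical names. It is an illustration, not the attached patch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: losing the create race is not a failure.
class EnsureTraceTable {

  /** Stand-in for the checked exception named in the description. */
  static class TableExistsException extends Exception {
    TableExistsException(String tableName) {
      super("table exists: " + tableName);
    }
  }

  /** Stand-in for the two table operations used by the quoted snippet. */
  interface TableOps {
    boolean exists(String tableName);

    void create(String tableName) throws TableExistsException;
  }

  /**
   * Ensures the table exists. Returns true if this caller created it,
   * false if it was already there or another caller created it first.
   */
  static boolean ensureTable(TableOps ops, String tableName) {
    if (ops.exists(tableName)) {
      return false;
    }
    try {
      ops.create(tableName);
      return true;
    } catch (TableExistsException e) {
      // Another server won the race; the table exists, which is all we need.
      return false;
    }
  }

  /** In-memory stand-in used to exercise ensureTable without a cluster. */
  static class InMemoryTableOps implements TableOps {
    private final Set<String> tables = new HashSet<>();

    @Override
    public synchronized boolean exists(String tableName) {
      return tables.contains(tableName);
    }

    @Override
    public synchronized void create(String tableName) throws TableExistsException {
      if (!tables.add(tableName)) {
        throw new TableExistsException(tableName);
      }
    }
  }
}
```

The boolean result lets only the creator go on to do one-time setup such as attaching the
age-off iterator, which matches the quoted snippet, where that setup sits inside the
"does not exist" branch.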

This message was sent by Atlassian JIRA
