cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13037) DropKeyspaceCommitLogRecycleTest.testRecycle times out in 2.1 and 2.2
Date Fri, 16 Dec 2016 08:48:58 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753849#comment-15753849 ]

Stefania commented on CASSANDRA-13037:
--------------------------------------

The changes that I made in order to print debug information in awaitDiskSync() have fixed
the failures, but [some tests|http://cassci.datastax.com/job/stef1927-testall-multiplex/59/testReport/org.apache.cassandra.cql3/]
now take 1 minute longer to complete, and in these tests we can see the following errors:

{code}
ERROR 08:28:16 CL disk sync still waiting after 1 minute: segment.lastSyncedOffset 5242880, position  121
{code}

See, for example, [this test|http://cassci.datastax.com/job/stef1927-testall-multiplex/59/testReport/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest_48/testRecycle/].

These errors are generated by this modified code:

{code}
        void awaitDiskSync()
        {
            while (segment.lastSyncedOffset < position)
            {
                WaitQueue.Signal signal = segment.syncComplete.register(CommitLog.instance.metrics.waitingOnCommit.time());
                if (segment.lastSyncedOffset < position)
                {
                    do
                    {
                        try
                        {
                            if (signal.awaitUntil(System.nanoTime() + TimeUnit.MINUTES.toNanos(1)))
                                break;

                            logger.error("CL disk sync still waiting after 1 minute: segment.lastSyncedOffset {}, position  {}",
                                         segment.lastSyncedOffset, position);
                        }
                        catch (InterruptedException t)
                        {
                            logger.error("CL disk sync wait was interrupted: segment.lastSyncedOffset {}, position  {}",
                                         segment.lastSyncedOffset, position);
                        }

                    } while (segment.lastSyncedOffset < position);

                }
                else
                {
                    signal.cancel();
                }
            }
        }
{code}

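Instead of waiting forever in a {{signal.awaitUninterruptibly()}}, it waits for 1 minute and
then checks whether {{segment.lastSyncedOffset < position}} still holds. In the error above,
{{segment.lastSyncedOffset}} (5242880) is already past {{position}} (121), which suggests that
the sync had completed but the waiter was never woken. I'm guessing that a race is
preventing the non-periodic-task thread from being signaled. However, I still don't
understand where the race is.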
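
To illustrate why the timed wait masks the problem, here is a minimal, self-contained sketch of the same "wait with a timeout, then re-check the condition" pattern. It uses {{java.util.concurrent}} locks rather than Cassandra's {{WaitQueue}}, and the class {{TimedRecheckWait}} and its methods are hypothetical names made up for this example, not Cassandra code. The lost wakeup is simulated by deliberately skipping the signal; it does not reproduce the actual race.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the commit log segment state; not Cassandra code.
final class TimedRecheckWait
{
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition syncComplete = lock.newCondition();
    private long lastSyncedOffset = 0; // guarded by lock

    /** Records a completed sync; pass signal=false to simulate a lost wakeup. */
    public void markSynced(long offset, boolean signal)
    {
        lock.lock();
        try
        {
            lastSyncedOffset = offset;
            if (signal)
                syncComplete.signalAll();
        }
        finally
        {
            lock.unlock();
        }
    }

    public long lastSyncedOffset()
    {
        lock.lock();
        try
        {
            return lastSyncedOffset;
        }
        finally
        {
            lock.unlock();
        }
    }

    /**
     * Blocks until lastSyncedOffset >= position, re-checking the condition
     * whenever the timed wait expires. Returns the number of expired waits.
     */
    public int awaitDiskSync(long position, long timeoutMillis)
    {
        int timeouts = 0;
        lock.lock();
        try
        {
            while (lastSyncedOffset < position)
            {
                // await() returns false if the timeout elapsed without a signal
                if (!syncComplete.await(timeoutMillis, TimeUnit.MILLISECONDS))
                    timeouts++;
            }
            return timeouts;
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sync", e);
        }
        finally
        {
            lock.unlock();
        }
    }

    public static void main(String[] args)
    {
        TimedRecheckWait segment = new TimedRecheckWait();
        // The "syncer" advances the offset but never signals the waiter.
        new Thread(() -> segment.markSynced(5242880L, false)).start();
        int timeouts = segment.awaitDiskSync(121L, 100L);
        System.out.println("returned after " + timeouts + " expired wait(s), lastSyncedOffset "
                           + segment.lastSyncedOffset());
    }
}
```

With an untimed wait the waiter in {{main}} would block forever once the signal is skipped; with the timed wait it returns after the timeout expires, which matches the "1 minute longer" behaviour seen in the tests.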

> DropKeyspaceCommitLogRecycleTest.testRecycle times out in 2.1 and 2.2
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-13037
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13037
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Testing
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 2.1.x, 2.2.x
>
>
> DropKeyspaceCommitLogRecycleTest.testRecycle times out in 2.1 and 2.2:
> http://cassci.datastax.com/job/cassandra-2.2_testall/589/testReport/junit/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest/testRecycle/
> http://cassci.datastax.com/job/cassandra-2.1_testall/399/testReport/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest/testRecycle/
> {code}
> Error Message
> Timeout occurred. Please note the time in the report does not reflect the time until the timeout.
> Stacktrace
> junit.framework.AssertionFailedError: Timeout occurred. Please note the time in the report does not reflect the time until the timeout.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
