incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: C* 2.1-rc2 gets unstable after a 'DROP KEYSPACE' command ?
Date Thu, 07 Aug 2014 09:44:50 GMT
It would be nice if you could try with 2.1.0-rc5 (there have been quite a few
bug fixes since rc2). If you can still reproduce that NPE there, please
open a JIRA ticket with the reproduction steps.


On Thu, Aug 7, 2014 at 11:29 AM, Fabrice Larcher <fabrice.larcher@level5.fr>
wrote:

> Hello,
>
> After a 'DROP TABLE' command that returned "errors={}, last_host=127.0.0.1"
> (like most DROP commands do) from CQLSH with C* 2.1.0-rc2, I stopped C*.
> Now I cannot start one node. It says:
> ERROR 09:18:34 Exception encountered during startup
> java.lang.NullPointerException: null
>     at org.apache.cassandra.db.Directories.<init>(Directories.java:191)
> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>     at
> org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:553)
> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>     at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:245)
> [apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>     at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:455)
> [apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>     at
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:544)
> [apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
> java.lang.NullPointerException
>     at org.apache.cassandra.db.Directories.<init>(Directories.java:191)
>     at
> org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:553)
>     at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:245)
>     at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:455)
>     at
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:544)
> Exception encountered during startup: null
>
> I do not know if it can help.
>
>
> Fabrice LARCHER
>
>
> 2014-07-18 7:23 GMT+02:00 Fabrice Larcher <fabrice.larcher@level5.fr>:
>
>> Hello,
>>
>> I still experience a similar issue after a 'DROP KEYSPACE' command with
>> C* 2.1-rc3. Connection to the node may fail after a 'DROP'.
>>
>> But I did not see this issue with 2.1-rc1, so it seems to be a
>> regression introduced in 2.1-rc2.
>>
>> Fabrice LARCHER
>>
>>
>> 2014-07-17 9:19 GMT+02:00 Benedict Elliott Smith <
>> belliottsmith@datastax.com>:
>>
>>> Also https://issues.apache.org/jira/browse/CASSANDRA-7437 and
>>> https://issues.apache.org/jira/browse/CASSANDRA-7465 for rc3, although
>>> the CounterCacheKey assertion looks like an independent (though
>>> comparatively benign) bug I will file a ticket for.
>>>
>>> Can you try this against rc3 to see if the problem persists? You may see
>>> the last exception, but it shouldn't affect the stability of the cluster.
>>> If either of the other exceptions persist, please file a ticket.
>>>
>>>
>>> On Thu, Jul 17, 2014 at 1:41 AM, Tyler Hobbs <tyler@datastax.com> wrote:
>>>
>>>> This looks like https://issues.apache.org/jira/browse/CASSANDRA-6959,
>>>> but that was fixed for 2.1.0-rc1.
>>>>
>>>> Is there any chance you can put together a script to reproduce the
>>>> issue?
>>>>
>>>>
>>>> On Thu, Jul 10, 2014 at 8:51 AM, Pavel Kogan <pavel.kogan@cortica.com>
>>>> wrote:
>>>>
>>>>> It seems that the memtable tries to flush itself to an SSTable of a
>>>>> keyspace that no longer exists. I don't know why this happens, but
>>>>> running nodetool flush before the drop should probably prevent this issue.
>>>>>
>>>>> Pavel
>>>>>
>>>>>
>>>>> On Thu, Jul 10, 2014 at 4:09 AM, Fabrice Larcher <
>>>>> fabrice.larcher@level5.fr> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I am using the 'development' version 2.1-rc2.
>>>>>>
>>>>>> With one node (=localhost), I get timeouts trying to connect to C*
>>>>>> after running a 'DROP KEYSPACE' command. I have the following error
>>>>>> messages in system.log:
>>>>>>
>>>>>> INFO  [SharedPool-Worker-3] 2014-07-09 16:29:36,578
>>>>>> MigrationManager.java:319 - Drop Keyspace 'test_main'
>>>>>> (...)
>>>>>> ERROR [MemtableFlushWriter:6] 2014-07-09 16:29:37,178
>>>>>> CassandraDaemon.java:166 - Exception in thread
>>>>>> Thread[MemtableFlushWriter:6,5,main]
>>>>>> java.lang.RuntimeException: Last written key
>>>>>> DecoratedKey(91e7f660-076f-11e4-a36d-28d2444c0b1b,
>>>>>> 52446dde90244ca49789b41671e4ca7c) >= current key
>>>>>> DecoratedKey(91e7f660-076f-11e4-a36d-28d2444c0b1b,
>>>>>> 52446dde90244ca49789b41671e4ca7c) writing into
>>>>>> ./../data/data/test_main/user-911d5360076f11e4812d3d4ba97474ac/test_main-user.user_account-tmp-ka-1-Data.db
>>>>>>         at
>>>>>> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:172)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:215)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:351)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:314)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>>>>>> ~[guava-16.0.jar:na]
>>>>>>         at
>>>>>> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1054)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>> ~[na:1.7.0_55]
>>>>>>         at
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>> ~[na:1.7.0_55]
>>>>>>         at java.lang.Thread.run(Thread.java:744) ~[na:1.7.0_55]
>>>>>>
>>>>>> After that, I cannot connect to the cluster anymore from my app (Java
>>>>>> Driver 2.1-SNAPSHOT), and I get this in the application logs:
>>>>>>
>>>>>> com.datastax.driver.core.exceptions.NoHostAvailableException: All
>>>>>> host(s) tried for query failed (tried: /127.0.0.1:9042
>>>>>> (com.datastax.driver.core.exceptions.DriverException: Timeout during
>>>>>> read))
>>>>>>     at
>>>>>> com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
>>>>>>     at
>>>>>> com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:258)
>>>>>>     at
>>>>>> com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:174)
>>>>>>     at
>>>>>> com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
>>>>>>     at
>>>>>> com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:36)
>>>>>>     (...)
>>>>>> Caused by:
>>>>>> com.datastax.driver.core.exceptions.NoHostAvailableException: All
>>>>>> host(s) tried for query failed (tried: /127.0.0.1:9042
>>>>>> (com.datastax.driver.core.exceptions.DriverException: Timeout during
>>>>>> read))
>>>>>>     at
>>>>>> com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:103)
>>>>>>     at
>>>>>> com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:175)
>>>>>>     at
>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>     at
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>
>>>>>> I can still connect through CQLSH, but if I run a "DROP KEYSPACE"
>>>>>> command again from CQLSH, I get the following error:
>>>>>> errors={}, last_host=127.0.0.1
>>>>>>
>>>>>> On a 2-node cluster I have a similar issue, but the error's
>>>>>> stack trace is different:
>>>>>>
>>>>>> From application logs :
>>>>>>
>>>>>> 17971 [Cassandra Java Driver worker-2] WARN
>>>>>> com.datastax.driver.core.Cluster  - No schema agreement from live
>>>>>> replicas after 10000 ms. The schema may not be up to date on some nodes.
>>>>>>
>>>>>> From system.log :
>>>>>>
>>>>>> INFO  [SharedPool-Worker-2] 2014-07-10 09:04:53,434
>>>>>> MigrationManager.java:319 - Drop Keyspace 'test_main'
>>>>>> (...)
>>>>>> ERROR [MigrationStage:1] 2014-07-10 09:04:56,553
>>>>>> CommitLogSegmentManager.java:304 - Failed waiting for a forced recycle
>>>>>> of in-use commit log segments
>>>>>> java.lang.AssertionError: null
>>>>>>         at
>>>>>> org.apache.cassandra.db.commitlog.CommitLogSegmentManager.forceRecycleAll(CommitLogSegmentManager.java:299)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:160)
>>>>>> [apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.db.DefsTables.dropColumnFamily(DefsTables.java:516)
>>>>>> [apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.db.DefsTables.mergeColumnFamilies(DefsTables.java:300)
>>>>>> [apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.db.DefsTables.mergeSchemaInternal(DefsTables.java:187)
>>>>>> [apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:164)
>>>>>> [apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.service.MigrationManager$2.runMayThrow(MigrationManager.java:393)
>>>>>> [apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>>>>>> [apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>> [na:1.7.0_45]
>>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>> [na:1.7.0_45]
>>>>>>         at
>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>> [na:1.7.0_45]
>>>>>>         at
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>> [na:1.7.0_45]
>>>>>>         at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
>>>>>>
>>>>>>
>>>>>> In both cases, I have to restart the C* server so that I can connect
>>>>>> again from my application. Connecting from CQLSH is still possible, but
>>>>>> some keyspace-related commands fail.
>>>>>>
>>>>>> NB: I also sometimes get the following error on my nodes (I do not
>>>>>> know whether it is related to this issue):
>>>>>> ERROR [CompactionExecutor:62] 2014-07-10 08:49:02,059
>>>>>> CassandraDaemon.java:166 - Exception in thread
>>>>>> Thread[CompactionExecutor:62,1,main]
>>>>>> java.lang.NullPointerException: null
>>>>>>         at
>>>>>> org.apache.cassandra.cache.CounterCacheKey.getPathInfo(CounterCacheKey.java:53)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.cache.AutoSavingCache$Writer.saveCache(AutoSavingCache.java:215)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> org.apache.cassandra.db.compaction.CompactionManager$11.run(CompactionManager.java:1052)
>>>>>> ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
>>>>>>         at
>>>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>> ~[na:1.7.0_55]
>>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>>> ~[na:1.7.0_55]
>>>>>>         at
>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>> ~[na:1.7.0_55]
>>>>>>         at
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>> [na:1.7.0_55]
>>>>>>         at java.lang.Thread.run(Thread.java:744) [na:1.7.0_55]
>>>>>>
>>>>>> It seems to be a bug (is it a known one?). Or do you have any clue
>>>>>> how I can solve it?
>>>>>>
>>>>>> Thanks for your help
>>>>>>
>>>>>> Fabrice
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Tyler Hobbs
>>>> DataStax <http://datastax.com/>
>>>>
>>>
>>>
>>
>
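
The workaround suggested earlier in the thread (running nodetool flush before the drop) could be scripted roughly as follows. This is only a sketch: the thread does not confirm that flushing first avoids the errors, the function names are made up for illustration, and it assumes `nodetool` and `cqlsh` are on the PATH and that the installed `cqlsh` accepts `-e` to execute a single statement. Note that `nodetool flush` only flushes the node it is pointed at.

```python
import re
import shlex
import subprocess
from typing import List

# Conservative pattern for an unquoted keyspace name (an assumption made here
# to keep the generated CQL statement safe, not a rule taken from the thread).
_KEYSPACE_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def flush_then_drop_commands(keyspace: str, host: str = "127.0.0.1") -> List[List[str]]:
    """Build the two commands: flush the keyspace's memtables, then drop it."""
    if not _KEYSPACE_RE.match(keyspace):
        raise ValueError("unexpected keyspace name: %r" % keyspace)
    return [
        ["nodetool", "-h", host, "flush", keyspace],
        ["cqlsh", host, "-e", "DROP KEYSPACE %s;" % keyspace],
    ]


def flush_then_drop(keyspace: str, host: str = "127.0.0.1", dry_run: bool = True) -> List[List[str]]:
    """Print the commands (dry run, the default) or run them in order."""
    commands = flush_then_drop_commands(keyspace, host)
    for argv in commands:
        if dry_run:
            print(shlex.join(argv))
        else:
            # check=True stops before the DROP if the flush fails.
            subprocess.run(argv, check=True)
    return commands
```

Calling `flush_then_drop("test_main")` only prints the two commands; pass `dry_run=False` to actually run them against the node.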
