cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10122) AssertionError after upgrade to 3.0
Date Fri, 04 Dec 2015 14:16:11 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041593#comment-15041593 ]

Sylvain Lebresne commented on CASSANDRA-10122:
----------------------------------------------

What the trace (which is btw different from the initial trace the ticket was created for, but
let's ignore that) is telling us is that {{ReadCommand.LegacyReadCommandSerializer.serializedSize}}
is called with a {{version < MessagingService.VERSION_30}}. Except that the only time a
{{ReadCommand.LegacyReadCommandSerializer}} is even created is in {{SinglePartitionReadCommand.createMessage}}
with this:
{noformat}
return new MessageOut<>(MessagingService.Verb.READ, this,
                        version < MessagingService.VERSION_30 ? legacyReadCommandSerializer : serializer);
{noformat}
so we shouldn't get that error. This suggests that the code somehow doesn't always pass the
same version when creating a message and later serializing it, and indeed, in {{AbstractReadExecutor.makeRequests}},
we pass the version for the message creation, but we reuse the message for all endpoints without
checking that the version is the same for all of them, which is obviously wrong. So I've pushed
a simple fix for that [here|https://github.com/pcmanus/cassandra/commits/10122] (we potentially
re-create the same message a few times, but it's pretty cheap and the number of replicas is
very small, so it's not worth micro-optimizing imo).
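
To illustrate the mismatch, here is a minimal Python model (not Cassandra code). The names
{{create_message}}, {{serializer_matches}}, {{make_requests}} and the version numbers are
hypothetical stand-ins. The only thing taken from the above is that the serializer is chosen
from the version at message creation time, so a message shared across endpoints can end up
being serialized at a version it was not created for:

```python
from collections import namedtuple

# Hypothetical version numbers; only their ordering matters for this sketch.
VERSION_22 = 9
VERSION_30 = 10

Message = namedtuple('Message', ['payload', 'serializer'])


def pick_serializer(version):
    """Mirror the ternary quoted above: legacy serializer for pre-3.0 peers."""
    return 'legacy' if version < VERSION_30 else 'current'


def create_message(payload, version):
    """The serializer is fixed at creation time, based on the target version."""
    return Message(payload, pick_serializer(version))


def serializer_matches(message, version):
    """True if the message was created for the version it is serialized at."""
    return message.serializer == pick_serializer(version)


def make_requests(payload, endpoint_versions):
    """Create one message per endpoint instead of sharing a single message."""
    return {endpoint: create_message(payload, version)
            for endpoint, version in endpoint_versions.items()}


# Mixed-version cluster: one upgraded node, two nodes still on the old version.
versions = {'node1': VERSION_30, 'node2': VERSION_22, 'node3': VERSION_22}

# Sharing one message: it only matches the endpoints on the version it was built for.
shared = create_message('read', VERSION_30)
mismatched = [ep for ep, v in versions.items() if not serializer_matches(shared, v)]

# One message per endpoint: every message matches its endpoint's version.
per_endpoint = make_requests('read', versions)
all_match = all(serializer_matches(per_endpoint[ep], v) for ep, v in versions.items())
```

Re-creating the message per endpoint costs a few extra allocations, which matches the
trade-off described above.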

It's pretty obvious from the stack that this is the problem but I haven't validated that it
fixes the {{upgrade_through_versions_test}} because running that test locally is ... (deep
breathing) ... problematic. Because:
* you can't pass the local branch you want to the test, it expects a version number and builds
the branch name from that. That means I'd presumably have to commit work-in-progress stuff
to my local {{cassandra-3.0}} branch to make the test happy, but that's quite error-prone.
* it then errors out basically because I haven't exported {{CASSANDRA_DIR}}. Easily fixed,
but it would be great if the test could reuse the general dtest mechanism for that (which
checks {{~/.cassandra_dtest}} for that).
* it then errors out with
{noformat}
ERROR: Failure: ValueError (too many values to unpack)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/loader.py", line 420, in loadTestsFromName
    addr.filename, addr.module)
  File "/usr/lib/python2.7/dist-packages/nose/importer.py", line 47, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/usr/lib/python2.7/dist-packages/nose/importer.py", line 94, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "/home/pcmanus/Git/cassandra-dtest/upgrade_through_versions_test.py", line 63, in <module>
    _, ref_type, ref = _fullref.split('/')
ValueError: too many values to unpack
{noformat}
That's because it iterates over all the repo git refs and expects all of them to be of the
form {{x/y/z}}. Sadly I had some that are {{x/y/z/w}}.
* once you've fixed that, it errors out with:
{noformat}
ERROR: rolling_upgrade_test (upgrade_through_versions_test.TestUpgrade_from_cassandra_2_1_HEAD_to_cassandra_3_0_HEAD)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/pcmanus/Git/cassandra-dtest/upgrade_through_versions_test.py", line 904, in setUp
    switch_jdks(os.environ['CASSANDRA_VERSION'])
  File "/usr/lib/python2.7/UserDict.py", line 23, in __getitem__
    raise KeyError(key)
KeyError: 'CASSANDRA_VERSION'
-------------------- >> begin captured logging << --------------------
{noformat}
that is, it complains you haven't exported {{CASSANDRA_VERSION}}, which is ironic cause the
test has at the beginning the following lines:
{noformat}
if os.environ.get('CASSANDRA_VERSION'):
    debug('CASSANDRA_VERSION is not used by upgrade tests!')
{noformat}
which actually makes sense: if you upgrade through versions, what version are you supposed to
set? I ended up commenting out all calls to {{switch_jdks}} as it's not needed for upgrading from
2.1, but it seems that method would also require you to manually set {{JAVA7_HOME}} and {{JAVA8_HOME}}.
It would be great if it wasn't necessary to set up tons of environment variables to have the
test running.
* with that fixed, the test appears to actually start running, but it quickly errors out with
a bunch of error messages like
{noformat}
ServerError: <ErrorMessage code=0000 [Server error] message="java.lang.NoClassDefFoundError: org/apache/cassandra/service/RowDataResolver">
{noformat}
which suggests something is wrong with how branches are compiled.
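
For the {{too many values to unpack}} failure in the list above, one possible way to tolerate
refs with more than three components is to cap the number of splits. This is only a sketch:
{{parse_ref}} is a hypothetical helper, not the test's actual code, and the sample ref names
are made up for illustration:

```python
def parse_ref(fullref):
    """Split a full git ref into (ref_type, ref_name).

    Returns None for refs with fewer than three components. Limiting the
    split to two separators keeps any extra '/' inside the ref name, so
    'x/y/z/w' yields ('y', 'z/w') instead of raising ValueError.
    """
    parts = fullref.split('/', 2)
    if len(parts) < 3:
        return None
    _, ref_type, ref = parts
    return ref_type, ref


# Illustrative ref names only.
sample_refs = [
    'refs/heads/cassandra-3.0',
    'refs/remotes/origin/cassandra-2.1',
    'HEAD',
]
parsed = [parse_ref(r) for r in sample_refs]
```

Whether the extra components should be kept in the name or the ref skipped entirely depends
on what the test does with the result afterwards.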

I gave up at that point, and maybe I was particularly unlucky/messed up something, but it
would be great if we could make it more friendly to run this test locally (and have it work)
([~rhatch]).
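
As one possible direction for the environment-variable part, the {{KeyError}} on
{{CASSANDRA_VERSION}} could be avoided by treating the variable as optional. The sketch below
is an assumption about how that might look: {{optional_env}} and {{maybe_switch_jdks}} are
hypothetical helpers, and {{switch_fn}} stands in for the test's {{switch_jdks}}, whose exact
behaviour is not assumed here:

```python
import os


def optional_env(name, environ=None):
    """Return the value of an environment variable, or None if unset or empty."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    return value if value else None


def maybe_switch_jdks(switch_fn, environ=None):
    """Call switch_fn(version) only when CASSANDRA_VERSION is actually set.

    Returns True if the switch was attempted, False if it was skipped.
    """
    version = optional_env('CASSANDRA_VERSION', environ)
    if version is None:
        return False
    switch_fn(version)
    return True


# Usage with a stand-in for switch_jdks that just records its argument.
calls = []
skipped = maybe_switch_jdks(calls.append, {})
switched = maybe_switch_jdks(calls.append, {'CASSANDRA_VERSION': '2.1'})
```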

> AssertionError after upgrade to 3.0
> -----------------------------------
>
>                 Key: CASSANDRA-10122
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10122
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Russ Hatch
>            Assignee: Sylvain Lebresne
>             Fix For: 3.0.1, 3.1
>
>         Attachments: node1.log, node2.log, node3.log
>
>
> Upgrade tests are encountering this exception after upgrade from 2.2 HEAD to 3.0 HEAD:
> {noformat}
> ERROR [SharedPool-Worker-4] 2015-08-18 12:33:57,858 Message.java:611 - Unexpected exception during request; channel = [id: 0xa5ba2c7a, /127.0.0.1:55048 => /127.0.0.1:9042]
> java.lang.AssertionError: null
>         at org.apache.cassandra.db.ReadCommand$Serializer.serializedSize(ReadCommand.java:520) ~[main/:na]
>         at org.apache.cassandra.db.ReadCommand$Serializer.serializedSize(ReadCommand.java:461) ~[main/:na]
>         at org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:166) ~[main/:na]
>         at org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:72) ~[main/:na]
>         at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:583) ~[main/:na]
>         at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:733) ~[main/:na]
>         at org.apache.cassandra.net.MessagingService.sendRR(MessagingService.java:676) ~[main/:na]
>         at org.apache.cassandra.net.MessagingService.sendRRWithFailure(MessagingService.java:659) ~[main/:na]
>         at org.apache.cassandra.service.AbstractReadExecutor.makeRequests(AbstractReadExecutor.java:103) ~[main/:na]
>         at org.apache.cassandra.service.AbstractReadExecutor.makeDataRequests(AbstractReadExecutor.java:76) ~[main/:na]
>         at org.apache.cassandra.service.AbstractReadExecutor$AlwaysSpeculatingReadExecutor.executeAsync(AbstractReadExecutor.java:323) ~[main/:na]
>         at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.doInitialQueries(StorageProxy.java:1599) ~[main/:na]
>         at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1554) ~[main/:na]
>         at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1501) ~[main/:na]
>         at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1420) ~[main/:na]
>         at org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:457) ~[main/:na]
>         at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:232) ~[main/:na]
>         at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:202) ~[main/:na]
>         at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:72) ~[main/:na]
>         at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:204) ~[main/:na]
>         at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:470) ~[main/:na]
>         at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:447) ~[main/:na]
>         at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:139) ~[main/:na]
>         at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507) [main/:na]
>         at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401) [main/:na]
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
>         at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [main/:na]
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {noformat}
> This occurs while the cluster is in a mixed version state, with the first node upgraded to 3.0, and the remaining two nodes still on 2.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
