cassandra-commits mailing list archives

From "graham sanderson (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-7734) Schema pushes (seemingly) randomly not happening
Date Sun, 10 Aug 2014 19:56:11 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

graham sanderson updated CASSANDRA-7734:
----------------------------------------

    Description: 
We have been seeing problems since upgrading from 2.0.5 to 2.0.9.

After a while, new schema changes (we periodically add tables) start propagating
very slowly to some nodes and quickly to others. From the logs and trace, it looks as though
in this case the "push" of the schema from the originating node to some of the other nodes
never happens (once a node has decided not to push to another node, it doesn't seem to start
again). We do, however, see the other node end up pulling the schema some time later, when
it notices its schema is out of date.

Here is code from 2.0.9 MigrationManager.announce

{code}
        for (InetAddress endpoint : Gossiper.instance.getLiveMembers())
        {
            // only push schema to nodes with known and equal versions
            if (!endpoint.equals(FBUtilities.getBroadcastAddress()) &&
                    MessagingService.instance().knowsVersion(endpoint) &&
                    MessagingService.instance().getRawVersion(endpoint) == MessagingService.current_version)
                pushSchemaMutation(endpoint, schema);
        }
{code}

and from 2.0.5

{code}
        for (InetAddress endpoint : Gossiper.instance.getLiveMembers())
        {
            if (endpoint.equals(FBUtilities.getBroadcastAddress()))
                continue; // we've dealt with localhost already

            // don't send schema to the nodes with the versions older than current major
            if (MessagingService.instance().getVersion(endpoint) < MessagingService.current_version)
                continue;

            pushSchemaMutation(endpoint, schema);
        }
{code}

The old getVersion() call would return MessagingService.current_version if the version was
unknown, so the push would still occur in that case. I don't have logging to prove this, but I
strongly suspect that the version may end up unknown (null) in some cases, which would have
allowed schema propagation in 2.0.5 but blocks it as of some release after that and no later
than 2.0.9.
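
To make the difference between the two checks concrete, here is a minimal, self-contained sketch. It does not use Cassandra's actual classes: the versions map, the string endpoints, the CURRENT_VERSION value and the pushes205/pushes209 helpers are all hypothetical stand-ins, and the fallback in getVersion() is modeled only as described above.

```java
import java.util.HashMap;
import java.util.Map;

class SchemaPushSketch {
    // Hypothetical placeholder; the real value of MessagingService.current_version is not assumed.
    static final int CURRENT_VERSION = 7;

    // Toy stand-in for the per-endpoint version table (an assumption for illustration).
    static final Map<String, Integer> versions = new HashMap<>();

    // 2.0.5-style lookup as described above: an unknown endpoint is treated as current.
    static int getVersion(String endpoint) {
        Integer v = versions.get(endpoint);
        return v == null ? CURRENT_VERSION : v;
    }

    static boolean knowsVersion(String endpoint) {
        return versions.containsKey(endpoint);
    }

    // Only safe to call after knowsVersion(endpoint) has returned true.
    static int getRawVersion(String endpoint) {
        return versions.get(endpoint);
    }

    // Mirrors the 2.0.5 loop: skip only endpoints whose version is older than current.
    static boolean pushes205(String endpoint) {
        return getVersion(endpoint) >= CURRENT_VERSION;
    }

    // Mirrors the 2.0.9 condition: push only when the version is known and equal to current.
    static boolean pushes209(String endpoint) {
        return knowsVersion(endpoint) && getRawVersion(endpoint) == CURRENT_VERSION;
    }

    public static void main(String[] args) {
        versions.put("known-current", CURRENT_VERSION);
        versions.put("known-older", CURRENT_VERSION - 1);
        // "unknown" is deliberately left out of the map.
        for (String endpoint : new String[] {"known-current", "known-older", "unknown"}) {
            System.out.println(endpoint + ": 2.0.5 pushes=" + pushes205(endpoint)
                    + ", 2.0.9 pushes=" + pushes209(endpoint));
        }
    }
}
```

In this model the two predicates agree for endpoints with a known version and disagree only for the endpoint whose version is unknown, which is the case suspected above.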



  was:
We have been seeing problems since upgrading from 2.0.5 to 2.0.9.

After a while, schema changes start propagating slowly from some nodes to others.
From the logs and trace, it looks as though in this case the "push" of the schema never happens
(once a node has decided not to push to another node, it doesn't seem to start again).
We do, however, see the other node end up pulling the request some time later, when
it notices its schema is out of date.

Here is code from 2.0.9 MigrationManager.announce

{code}
        for (InetAddress endpoint : Gossiper.instance.getLiveMembers())
        {
            // only push schema to nodes with known and equal versions
            if (!endpoint.equals(FBUtilities.getBroadcastAddress()) &&
                    MessagingService.instance().knowsVersion(endpoint) &&
                    MessagingService.instance().getRawVersion(endpoint) == MessagingService.current_version)
                pushSchemaMutation(endpoint, schema);
        }
{code}

and from 2.0.5

{code}
        for (InetAddress endpoint : Gossiper.instance.getLiveMembers())
        {
            if (endpoint.equals(FBUtilities.getBroadcastAddress()))
                continue; // we've dealt with localhost already

            // don't send schema to the nodes with the versions older than current major
            if (MessagingService.instance().getVersion(endpoint) < MessagingService.current_version)
                continue;

            pushSchemaMutation(endpoint, schema);
        }
{code}

The old getVersion() call would return MessagingService.current_version if the version was
unknown, so the push would still occur in that case. I don't have logging to prove this, but I
strongly suspect that the version may end up unknown (null) in some cases, which would have
allowed schema propagation in 2.0.5 but blocks it as of some release after that and no later
than 2.0.9.




> Schema pushes (seemingly) randomly not happening
> ------------------------------------------------
>
>                 Key: CASSANDRA-7734
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7734
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: graham sanderson



--
This message was sent by Atlassian JIRA
(v6.2#6252)
