Thanks Gary. I'll keep an eye on things and see if it happens again. 

From reading the code I'm wondering if there is a small chance of a race condition in HintedHandoffManager.waitForSchemaAgreement() .

Could the following happen? I'm a little unsure on exactly how the endpoint state is removed from the map in Gossiper.

1) node 1 starts
2) Gossiper calls StorageService.onAlive() when the endpoints are detected as alive.
3) HintedHandoffManager.deliverHints() adds a runnable to the HintedHandoff TP
4) This happens several times, and node 1 gets busy delivering hints but there is only 1 thread in the thread pool.
5) Node n is removed from the cluster and the endpoint state is deleted in the Gossiper on node 1 
6) Node 1 gets around to processing the hints for node n and Gossiper.getEndpointStateForEndpoint() returns null for node n


Thanks
Aaron

On 10 Feb, 2011,at 03:03 AM, Gary Dusbabek <gdusbabek@gmail.com> wrote:

Aaron,

It looks like you're experiencing a side-effect of CASSANDRA-2083.
There was at least one place (when node B received updated schema from
node A) where gossip was not being updated with the correct schema
even though DatabaseDescriptor had the right version. I'm pretty sure
this is what you're seeing.

Gary.


On Wed, Feb 9, 2011 at 00:08, Aaron Morton <aaron@thelastpickle.com> wrote:
> I noticed this after I upgraded one node in a 0.7 cluster of 5 to the latest
> stable 0.7 build "2011-02-08_20-41-25" (upgraded  node was jb-cass1 below).
> This is a long email, you can jump to the end and help me out by checking
> something on your  07 cluster.
> This is the value from o.a.c.gms.FailureDetector.AllEndpointStates on
> jb-cass05 9114.67)
> /192.168.114.63   X3:2011-02-08_20-41-25
> SCHEMA:2f555eb0-3332-11e0-9e8d-c4f8bbf76455   LOAD:2.84182972E8
> STATUS:NORMAL,0
> /192.168.114.64   SCHEMA:2f555eb0-3332-11e0-9e8d-c4f8bbf76455
> LOAD:2.84354156E8   STATUS:NORMAL,34028236692093846346337460743176821145
> /192.168.114.66   SCHEMA:075cbd1f-3316-11e0-9e8d-c4f8bbf76455
> LOAD:2.59171601E8   STATUS:NORMAL,102084710076281539039012382229530463435
> /192.168.114.65   SCHEMA:075cbd1f-3316-11e0-9e8d-c4f8bbf76455
> LOAD:2.70907168E8   STATUS:NORMAL,68056473384187692692674921486353642290
> jb08.wetafx.co.nz/192.168.114.67
> SCHEMA:075cbd1f-3316-11e0-9e8d-c4f8bbf76455   LOAD:1.155260665E9
> STATUS:NORMAL,136112946768375385385349842972707284580
> Notice the schema for nodes 63 and 64 starts with 2f55 and for 65, 66 and 67
> it starts with 075.
> This is the output from pycassa calling describe_versions when connected to
> both the 63 (jb-cass1) and 67 (jb-cass5) nodes
> In [34]: sys.describe_schema_versions()
> Out[34]:
> {'2f555eb0-3332-11e0-9e8d-c4f8bbf76455': ['192.168.114.63',
>                                           '192.168.114.64',
>                                           '192.168.114.65',
>                                           '192.168.114.66',
>                                           '192.168.114.67']}
> It's reporting all nodes on the 2f55 schema. The SchemaCheckVerbHandler is
> getting the value from DatabaseDescriptor. FailureDetector MBean is getting
> them from Gossiper.endpointStateMap . Requests are working though, so the
> CFid's must be matching up.
> Commit https://github.com/apache/cassandra/commit/ecbd71f6b4bb004d26e585ca8a7e642436a5c1a4 added
> code to the 0.7 branch in the HintedHandOffManager to check the schema
> versions of nodes it has hints for. This is now failing on the new node as
> follows...
> ERROR [HintedHandoff:1] 2011-02-09 16:11:23,559 AbstractCassandraDaemon.java
> (line
> org.apache.cassandra.service.AbstractCassandraDaemon$1.uncaughtException(AbstractCassandraDaemon.java:114))
> Fatal exception in thread Thread[HintedHandoff:1,1,main]
> java.lang.RuntimeException: java.lang.RuntimeException: Could not reach
> schema agreement with /192.168.114.64 in 60000ms
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.RuntimeException: Could not reach schema agreement with
> /192.168114.64 in 60000ms
>         at
> org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:256)
>         at
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:267)
>         at
> org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:88)
>         at
> org.apachecassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:391)
>         at
> org.apachecassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 3 more
> (the nodes can all see each other, checked with notetool during the 60
> seconds)
> If I restart one of the nodes with the 075 schema (without upgrading it) it
> reads the schema from the system tables and goes back to the 2f55 schema.
> e.g. the 64 node was also on the 075 schema, I restarted and it moved to the
> 2f55 and logged appropriately. While writing this email I checked again with
> the 65 node, and the schema if was reporting to other nodes changed after a
> restart from 075 to 2f55
> INFO [main] 2011-02-09 17:17:11,457 DatabaseDescriptor.java (line
> org.apache.cassandra.config.DatabaseDescriptor) Loading schema version
> 2f555eb0-3332-11e0-9e8d-c4f8bbf76455
> I've been reading the code for migrations and gossip don't have a theory as
> to what is going on.
>
> REQUEST FOR HELP:
> If you have a 0.7 cluster can you please check if this has happened so I can
> know this is a real problem or just an Aaron problem. You can check by...
> - getting the values from the o.a.c.gms.FailureDetector.AllEndPointStates
> - running describe_schema_versions via the API, here is how to do it via
> pycassa http://pycassa.github.com/pycassa/api/pycassa/system_manager.html?highlight=describe_schema_versions
> - checking at the schema ids' from the failure detector match the result
> from describe_schema_versions()
> - if they do not match can you also include some info on what sort of schema
> changes have happened on the box.
> Thanks
> Aaron
>