cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-981) Dynamic endpoint snitch
Date Wed, 05 May 2010 22:04:07 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864550#action_12864550 ]

Stu Hood commented on CASSANDRA-981:
------------------------------------

> but that is actually solving a different problem.
I think the key takeaway from Vivaldi is using a coordinate system, so that you don't have
to store latency information for every endpoint you've ever communicated with.
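
To make the coordinate idea concrete, here is a minimal sketch of a Vivaldi-style update in Java.
It is an illustration only, not Cassandra code: the class name VivaldiCoordinate, the 2-D space,
and the fixed step size DELTA are all assumptions (the Vivaldi paper adapts the step size using
per-node error estimates). Each node keeps just its own position and nudges it after every
latency sample, so no per-endpoint latency table is needed.

```java
import java.util.Random;

/** Hypothetical 2-D Vivaldi-style coordinate; not part of Cassandra. */
final class VivaldiCoordinate {
    // Fixed step size (an assumption; the original algorithm adapts it per sample).
    private static final double DELTA = 0.25;

    private final double[] pos = new double[2];
    private final Random random;

    VivaldiCoordinate(long seed) {
        this.random = new Random(seed);
    }

    /** Predicted latency to another node: Euclidean distance between coordinates. */
    double distanceTo(VivaldiCoordinate other) {
        double dx = pos[0] - other.pos[0];
        double dy = pos[1] - other.pos[1];
        return Math.sqrt(dx * dx + dy * dy);
    }

    /** Move this node so its distance to the remote node better matches the measured latency. */
    void observe(VivaldiCoordinate remote, double measuredLatencyMillis) {
        double dx = pos[0] - remote.pos[0];
        double dy = pos[1] - remote.pos[1];
        double dist = Math.sqrt(dx * dx + dy * dy);
        if (dist == 0.0) {
            // Coincident nodes have no defined direction, so pick a random one.
            double angle = random.nextDouble() * 2.0 * Math.PI;
            dx = Math.cos(angle);
            dy = Math.sin(angle);
        } else {
            dx /= dist;
            dy /= dist;
        }
        // Positive error: we are predicted too close, so move away; negative: move closer.
        double error = measuredLatencyMillis - dist;
        pos[0] += DELTA * error * dx;
        pos[1] += DELTA * error * dy;
    }
}
```

With two nodes that repeatedly observe a 10 ms latency to each other, the prediction error shrinks
by a constant factor on every sample, so distanceTo() converges to roughly 10.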

> I'm sure you thought about this, but in cases where you're using the RackAwareStrategy,
> the dynamic snitch will need to yield consistent results
Would it be possible to have the snitch store its coordinates in the system table, so that
during bootstrapping, it looks at gossip latency, tunes its coordinates, and then persists
them forever? It rules out using endpoint snitches to dynamically adjust position based on
load on a machine, but I think that might be better solved by load balancing anyway.


> Dynamic endpoint snitch
> -----------------------
>
>                 Key: CASSANDRA-981
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-981
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>             Fix For: 0.7
>
>
> An endpoint snitch that automatically and dynamically infers "distance" to other machines
> without having to explicitly configure rack and datacenter positions solves two problems:
>
> 1. The killer feature here is adapting to things like compaction or a failing-but-not-yet-dead
>    disk.  This is important, since when we are doing reads we pick the "closest" replica for
>    actually reading data from (and only read md5s from other replicas).  This means that if the
>    closest replica by network topology is temporarily slow due to compaction (for instance),
>    we'll have to block for its reply even if we get the other replies much, much faster.
> 2. Not having to manually re-sync your configuration with your network topology when changes
>    (adding machines) are made is a nice bonus.
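
The read-path problem in the description can be sketched as ranking replicas by recently observed
latency rather than by static topology. The Java below is a rough illustration under stated
assumptions, not the design proposed in this ticket: LatencyRanker, record(), rankByLatency(),
and the smoothing factor ALPHA are hypothetical names and values. It keeps an exponentially
weighted moving average per endpoint, so a replica that slows down (during compaction, say)
drops in the ordering after a few samples.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical latency-based replica ranking; not Cassandra's snitch API. */
final class LatencyRanker {
    // Weight given to each new sample (an assumed value).
    private static final double ALPHA = 0.2;

    private final Map<String, Double> ewmaMillis = new HashMap<>();

    /** Fold a new latency sample into the endpoint's moving average. */
    void record(String endpoint, double latencyMillis) {
        ewmaMillis.merge(endpoint, latencyMillis,
                (old, sample) -> (1.0 - ALPHA) * old + ALPHA * sample);
    }

    /** Return a copy of the replicas ordered fastest first; unmeasured endpoints sort last. */
    List<String> rankByLatency(List<String> replicas) {
        List<String> sorted = new ArrayList<>(replicas);
        sorted.sort(Comparator.comparingDouble(
                (String endpoint) -> ewmaMillis.getOrDefault(endpoint, Double.MAX_VALUE)));
        return sorted;
    }
}
```

A coordinator would then send the data read to the first entry and digest reads to the rest.
Because the sort is stable, endpoints with no samples keep their original relative order.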

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

