incubator-cassandra-dev mailing list archives

From Masood Mortazavi <masoodmortaz...@gmail.com>
Subject Re: Reviewing . . . RackAwareStrategy.java . . . ( rev 954657 )
Date Mon, 21 Jun 2010 22:17:56 GMT
Here's a summary of my earlier comments:

In a more flexible architecture, one should be able to supply configuration
elements together with one's replica placement strategy plug-in, so that the
availability semantics can be extended or complemented along with the plug-in
code.

As the base for the plug-in, Cassandra can provide the iterators of
tokens/nodes that the strategy must walk to find the "place" of replicas for
a given token/node. Cassandra will also need to give the strategy code a
handle to its "private" configuration files.
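As a purely illustrative sketch of that contract (the interface name, method signature, and property key below are my assumptions, not anything in the Cassandra tree):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Properties;

// Hypothetical plug-in contract: Cassandra supplies the ring walk
// (an iterator of nodes starting at the primary's token) and a handle
// to the strategy's own "private" configuration.
interface ReplicaPlacementStrategy {
    List<String> placeReplicas(String primary,
                               Iterator<String> ringWalk,
                               Properties privateConfig);
}

// Trivial strategy for illustration: take the first RF distinct
// nodes encountered on the walk, skipping the primary itself.
class FirstFitStrategy implements ReplicaPlacementStrategy {
    public List<String> placeReplicas(String primary,
                                      Iterator<String> ringWalk,
                                      Properties privateConfig) {
        int rf = Integer.parseInt(
                privateConfig.getProperty("replication_factor", "3"));
        List<String> placed = new ArrayList<>();
        while (ringWalk.hasNext() && placed.size() < rf) {
            String node = ringWalk.next();
            if (!node.equals(primary) && !placed.contains(node)) {
                placed.add(node);
            }
        }
        return placed;
    }
}
```

The point is only the division of labor: the iteration and the configuration handle come from Cassandra; the plug-in just decides placement.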

- m.



On Tue, Jun 15, 2010 at 12:47 PM, Masood Mortazavi <
masoodmortazavi@gmail.com> wrote:

>> On Tue, Jun 15, 2010 at 10:55 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>
>> On Mon, Jun 14, 2010 at 11:03 PM, Masood Mortazavi
>> <masoodmortazavi@gmail.com> wrote:
>> > The comment on the top of RackAwareStrategy says:
>>
>> You are correct.  RAS sort of works under other conditions but it is
>> primarily intended for 2 DCs and RF=3.  I will update the comment in
>> question.
>>
>
>
> An orthogonal but related problem is the following . . .
>
> Currently, each replica placement strategy involves its own configuration
> extensions, along with a great deal of repeated and intertwined code among
> the strategies. (For example, all "strategies" currently need to iterate
> through nodes. This is common functionality.)
>
> The current approach not only affects construction of replica placement
> strategies but also complicates their semantics.
>
> It may be possible to refactor the code as follows:
>
> (1) Each node has a set of properties assigned to it through the
> configuration. (Right now, in the trunk, those properties are the "rack" and
> "DC" position of a node, but it should be possible to add any number of
> other properties, and they should really all live in one configuration
> file, not split, as they are today, across two or more separate files.)
>
> (2) Once these physical properties are assigned/defined for each node, a
> pluggability architecture would allow whoever extends the node properties
> to plug in a node "Examiner" as a complement to any additional properties.
>
> (3) In the iteration that's common to all replica placement search logic,
> the "Examiner" will either "pass" or "fail" an (iterated) node as a replica
> place for a given primary based on the properties of that node.
>
> Although such refactoring is not entirely trivial, I believe it will lead
> to less repetition across "strategies", better separation of concerns, and
> more reliable code.
>
> It will also make maintenance and extension of strategies much easier . . .
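A minimal sketch of steps (1)-(3) above, i.e. one common iteration plus a pluggable "Examiner" predicate. Every name here (NodeExaminer, OtherRackExaminer, PlacementWalk, the property keys) is made up for illustration and is not Cassandra code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical pluggable predicate: passes or fails a candidate node
// as a replica place for a given primary, based on the node properties
// assigned through configuration.
interface NodeExaminer {
    boolean pass(Map<String, String> primaryProps,
                 Map<String, String> candidateProps);
}

// Example examiner: accept only candidates on a different rack than
// the primary, mirroring rack-aware behavior.
class OtherRackExaminer implements NodeExaminer {
    public boolean pass(Map<String, String> primary,
                        Map<String, String> candidate) {
        return !primary.get("rack").equals(candidate.get("rack"));
    }
}

// The iteration common to all strategies: walk the ring once and let
// the examiner decide which nodes become replicas.
class PlacementWalk {
    static List<String> place(String primary,
                              List<String> ring,
                              Map<String, Map<String, String>> nodeProps,
                              NodeExaminer examiner,
                              int replicaCount) {
        List<String> replicas = new ArrayList<>();
        for (String node : ring) {
            if (replicas.size() == replicaCount) break;
            if (node.equals(primary)) continue;
            if (examiner.pass(nodeProps.get(primary), nodeProps.get(node))) {
                replicas.add(node);
            }
        }
        return replicas;
    }
}
```

Each strategy then shrinks to an Examiner plus whatever properties it reads; the walk itself is written once.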
>
>
>
>
>>
>> > There are other issues to think about. For example, for quorum write
>> > (consistency.quorum) to work faster, shouldn't the first replicas be as
>> > close as possible (i.e. on the same rack)?  The whole point of choosing
>> this
>> > level of consistency is to improve performance. Right?
>>
>> No, the point is to improve reliability (there are a number of failure
>> scenarios that will result in losing an entire rack at once).
>>
>
>
> Yes, I understand that.
>
> What I was trying to say is that, if we agree to the above, we should
> select the other-DC and other-Rack replica after we have selected all "near"
> replicas.
>
> (I imagine that, during actual replication, the replica placement list is
> iterated sequentially, and that the first replica will have to be the
> nearest, with farther and farther replicas chosen and put on the list
> after it.)
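That nearest-first ordering could be sketched as follows (again with illustrative names and property keys only, not Cassandra code):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical ordering of an already-chosen replica list: same rack
// first, then same DC, then other DC, so that sequential iteration
// reaches the nearest replicas first.
class ProximityOrder {
    static int distance(Map<String, String> a, Map<String, String> b) {
        if (!a.get("dc").equals(b.get("dc"))) return 2;      // other data center
        return a.get("rack").equals(b.get("rack")) ? 0 : 1;  // same rack vs. same DC
    }

    static List<String> nearestFirst(String primary,
                                     List<String> replicas,
                                     Map<String, Map<String, String>> nodeProps) {
        List<String> ordered = new ArrayList<>(replicas);
        ordered.sort(Comparator.comparingInt(
                n -> distance(nodeProps.get(primary), nodeProps.get(n))));
        return ordered;
    }
}
```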
>
> Thanks,
> - m.
>
>
>
>>
>> --
>>
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>>
>
>
