Option #3, since it depends on the replica placement strategy and not the partitioner.
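
To make that concrete, here is a rough sketch of how the two common placement strategies would pick replicas on the four-node ring from the question. It is a toy model, not Cassandra's actual code: the function names (`ring_walk`, `simple_strategy`, `network_topology_strategy`) are made up for illustration, racks are ignored, and the 0 to 100 token space is the hypothetical one from the question.

```python
from bisect import bisect_left

# Ring from the question: (token, data center, node), sorted by token.
RING = [(0, "DC1", "N1"), (25, "DC2", "N1"), (50, "DC1", "N2"), (75, "DC2", "N2")]
TOKENS = [token for token, _, _ in RING]


def ring_walk(key_token):
    """Yield every node clockwise, starting at the primary owner of key_token.

    A node with token T owns the range (previous token, T].
    """
    start = bisect_left(TOKENS, key_token) % len(RING)
    for i in range(len(RING)):
        yield RING[(start + i) % len(RING)]


def simple_strategy(key_token, rf):
    """Toy SimpleStrategy: first rf nodes clockwise, data centers ignored."""
    return [f"{dc}:{node}" for _, dc, node in list(ring_walk(key_token))[:rf]]


def network_topology_strategy(key_token, rf_per_dc):
    """Toy NetworkTopologyStrategy: walk clockwise and take nodes until each
    data center has the replica count requested for it (racks ignored)."""
    needed = dict(rf_per_dc)
    replicas = []
    for _, dc, node in ring_walk(key_token):
        if needed.get(dc, 0) > 0:
            replicas.append(f"{dc}:{node}")
            needed[dc] -= 1
    return replicas


if __name__ == "__main__":
    for key in (10, 40, 60, 90):
        print(key,
              simple_strategy(key, rf=1),
              network_topology_strategy(key, {"DC1": 1, "DC2": 1}))
```

For a key with token 10, the SimpleStrategy sketch with one replica returns only DC2:N1, which is the Option #2 layout. The NetworkTopologyStrategy sketch with one replica per data center returns DC2:N1 and DC1:N2, so each data center ends up with a full copy, as in Option #1.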

-Bryan



On Mon, May 20, 2013 at 6:24 AM, Pinak Pani <nishant.has.a.question@gmail.com> wrote:
I just wanted to verify: if I set up a multi-data-center Cassandra cluster, will each data center hold the complete data set?

Say I have two data centers, each with two nodes, and a partitioner that ranges from 0 to 100. The initial tokens are assigned this way:

DC1:N1 = 00
DC2:N1 = 25
DC1:N2 = 50
DC2:N2 = 75

where DCX is data center X and NX is node X. Which one of the following options is true?

Option #1: DC1 and DC2 will each hold the complete data set, with keys bucketed as follows:
DC1:N1 = (50, 00] => 50 keys
DC1:N2 = (00, 50] => 50 keys
----
Complete data set mirrored at DC1

DC2:N1 = (75, 25] => 50 keys
DC2:N2 = (25, 75] => 50 keys
----
Complete data set mirrored at DC2

Option #2: DC1 and DC2 will each hold 50% of the data, with keys bucketed as follows (much the same way as in a single-data-center Cassandra setup):
DC1:N1 = (75, 00] => 25 keys
DC2:N1 = (00, 25] => 25 keys
DC1:N2 = (25, 50] => 25 keys
DC2:N2 = (50, 75] => 25 keys
----
Data is divided between the two data centers.

Thanks,
PP