cassandra-user mailing list archives

From Ben Slater <ben.sla...@instaclustr.com>
Subject Re: Question about replica and replication factor
Date Tue, 20 Sep 2016 04:57:43 GMT
If your read operation requires data from multiple partitions and the
partitions are spread across multiple nodes, then the coordinator has the
job of contacting those nodes to get the data and return it to the
client. So, in your scenario, if you did a select * from table (with no
where clause), the coordinator would need to contact and execute a read on
at least one other node to satisfy the query.
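
As a rough illustration of the numbers in this thread (6 nodes, RF = 3), here
is a minimal Python sketch. It is not Cassandra code: it assumes a simplified
ring with one token per node (no vnodes) and SimpleStrategy-like placement,
and the names replicas() and min_nodes_for_full_scan() are hypothetical
helpers made up for this example.

```python
import hashlib

NODES = 6  # cluster size from the thread
RF = 3     # replication factor from the thread


def replicas(partition_key, nodes=NODES, rf=RF):
    """Return the node indices that hold a partition (simplified ring)."""
    # Assumption: an MD5 hash stands in for the real partitioner.
    digest = hashlib.md5(partition_key.encode()).digest()
    primary = int.from_bytes(digest[:8], "big") % nodes
    # SimpleStrategy-like placement: primary plus the next rf-1 nodes clockwise.
    return [(primary + i) % nodes for i in range(rf)]


def min_nodes_for_full_scan(nodes=NODES, rf=RF):
    """Fewest nodes whose local data together covers every token range."""
    # Each node stores rf of the `nodes` ranges, so ceil(nodes / rf) are needed.
    return -(-nodes // rf)


keys = [f"key-{i}" for i in range(10_000)]

# Count how many partitions each node stores (as primary or as replica).
per_node = [0] * NODES
for key in keys:
    for node in replicas(key):
        per_node[node] += 1

# Fraction of all partitions stored on each node; expected to be near RF / NODES.
share = [count / len(keys) for count in per_node]

# In this model nodes 0 and 3 hold disjoint halves of the ring between them.
covered_by_two = all(0 in replicas(k) or 3 in replicas(k) for k in keys)

print("share per node:", [round(s, 3) for s in share])
print("min nodes for a full scan:", min_nodes_for_full_scan())
print("nodes 0 and 3 cover every key:", covered_by_two)
```

Each single-partition read can be served by one replica, while a full scan
has to touch enough nodes to cover every token range, which in this model is
at least two. A real cluster with vnodes and a token-aware driver will spread
the work differently.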

Cheers
Ben

On Tue, 20 Sep 2016 at 14:50 Jun Wu <wuxiaomin98@hotmail.com> wrote:

> Hi Ben,
>
>     Thanks for the quick response.
>
>     The example for a single row/partition is clear. However,
> normally the data is not a single row, and for that case I'm still confused.
> http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/architectureClientRequestsRead_c.html
>
>     The link above gives an example of a 10-node cluster with RF = 3. But
> the figure and the text in the post show that the coordinator only
> contacts/reads data from one replica, and performs read repair on the
> remaining replicas.
>
>     Also, how could a read go across all nodes in the cluster?
>
>     Thanks!
>
> Jun
>
>
> From: ben.slater@instaclustr.com
> Date: Tue, 20 Sep 2016 04:18:59 +0000
> Subject: Re: Question about replica and replication factor
> To: user@cassandra.apache.org
>
>
> Each individual read (where a read is a single row or single partition)
> will read from one node (ignoring read repairs) as each partition will be
> contained entirely on a single node. To read the full set of data, reads
> would hit at least two nodes (in practice, reads would likely end up being
> distributed across all the nodes in your cluster).
>
> Cheers
> Ben
>
> On Tue, 20 Sep 2016 at 14:09 Jun Wu <wuxiaomin98@hotmail.com> wrote:
>
> Hi there,
>
>     I have a question about the replica and replication factor.
>
>     For example, I have a cluster of 6 nodes in the same data center.
> The replication factor (RF) is set to 3 and the consistency level is the
> default of 1. According to this calculator
> http://www.ecyrd.com/cassandracalculator/, every node will store 50% of
> the data.
>
>     When I want to read all data from the cluster, how many nodes should I
> read from, 2 or 1? Is it 2, because each node has half of the data? But the
> calculator shows 1: "You are really reading from 1 node every time."
>
>    Any suggestions? Thanks!
>
> Jun
>
> --
> ————————
> Ben Slater
> Chief Product Officer
> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
> +61 437 929 798
>
-- 
————————
Ben Slater
Chief Product Officer
Instaclustr: Cassandra + Spark - Managed | Consulting | Support
+61 437 929 798
