cassandra-user mailing list archives

From xutom <>
Subject Re:Re: Fail to export all data in C* cluster
Date Tue, 05 Jan 2016 07:43:07 GMT
Dear Jack,
    My keyspace is defined as follows:
test@cqlsh> DESC KEYSPACE sky ;
CREATE KEYSPACE sky WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}
 AND durable_writes = true;
CREATE TABLE sky.user1 (pati int, uuid text, name text,  name2 text,
    PRIMARY KEY (pati, uuid))

    I now use CL=ALL for inserts and set the retry policy with the following code:
RetryPolicy rp = new CustomRetryPolicy(3, 3, 2);
Cluster cluster = Cluster.builder().addContactPoint(seedIp).withCredentials(
                "test", "test")
                .withRetryPolicy(rp)
                .withLoadBalancingPolicy(
                        new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))  /* My cluster has 6 nodes, all in the same data center */
                .build();

PreparedStatement insertStatement = session
                .prepare("INSERT INTO  " + tableName
                        + "(" + columns + ") "
                        + "VALUES (?, ?, ?, ?);");
insertStatement.setConsistencyLevel(ConsistencyLevel.ALL); /* Here I set the CL to ALL */
    I start 30 threads to insert data. Each thread uses a BatchStatement to insert 100 rows
with the same partition key but a different uuid, and repeats this 100000 times. After
inserting about 99084500 rows into the C* cluster, with many timeout exceptions, I stopped the
insert process and used the following code to export all the data to a local file:
String cqlstr = " select * from " + this.tableName
                                + " where pati = " + this.partitionssss[i];
PreparedStatement Statement = session.prepare(cqlstr);
BoundStatement bStatement = new BoundStatement(Statement);
iter = session.execute(bStatement).iterator();
Then I write the results (in iter) to a local file. I ran the export 3 times, and all three results are different.

    I set the CL to ALL when inserting the data, so why do I get different results each time
I export it?
    By the way, I have set "hinted_handoff_enabled: true". Could that be the problem when the C*
cluster is overloaded, even though I set the CL to ALL?
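
For context on the overload question: one common way to keep a client from overrunning the cluster is to cap how many writes are in flight at once. The following is only a minimal sketch under assumptions. InFlightLimiter and its methods are hypothetical names, not part of the DataStax Java driver, and a suitable limit would have to be found by testing against the actual cluster.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical helper (not a driver class): caps the number of writes in flight. */
final class InFlightLimiter {
    private final Semaphore permits;

    InFlightLimiter(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks until a slot is free, runs the write, then frees the slot. */
    void run(Runnable write) throws InterruptedException {
        permits.acquire();
        try {
            write.run();
        } finally {
            // Always give the slot back, even if the write threw.
            permits.release();
        }
    }

    /** Number of slots currently free. */
    int availableSlots() {
        return permits.availablePermits();
    }
}
```

Each insert thread could share one limiter and wrap its call, for example limiter.run(() -> session.execute(batch)), where session and batch stand for the objects used in the insert code above.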

Best Regards

At 2016-01-04 23:37:20, "Jack Krupansky" <> wrote:

You have three choices:

1. Insert with CL=ALL, with client-level retries if the write fails due to the cluster being overloaded.
2. Insert with CL=QUORUM and then run repair after all data has been inserted.
3. Lower your insert rate in your client so that the cluster can keep up with your inserts.
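
As an illustration of the client-level retries in choice 1, here is a minimal sketch, assuming the write is wrapped in a Callable and that backing off between attempts is acceptable. WriteRetrier and withRetries are hypothetical names, not driver APIs. The sketch retries on any exception for brevity; a real client would more likely retry only on the driver's timeout and unavailable exceptions.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper (not a driver class): retries a write with exponential backoff. */
final class WriteRetrier {
    /**
     * Runs the write up to maxAttempts times, sleeping between attempts and
     * doubling the sleep each time. Rethrows the last failure if all attempts fail.
     */
    static <T> T withRetries(Callable<T> write, int maxAttempts, long initialBackoffMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return write.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Give the cluster time to recover before the next attempt.
                    Thread.sleep(backoffMs);
                    backoffMs *= 2;
                }
            }
        }
        throw last;
    }
}
```

A call might look like WriteRetrier.withRetries(() -> session.execute(batch), 5, 100), where session and batch are placeholders for the client's own objects and the attempt count and backoff are arbitrary example values.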

Yes, Cassandra supports eventual consistency, but if you overload the cluster, the hinted
handoff for nodes beyond the requested CL may timeout and be discarded, hence the need for

What CL are you currently using for inserts?

-- Jack Krupansky

On Mon, Jan 4, 2016 at 9:52 AM, xutom <> wrote:

Hi all,

    I have a C* cluster with 6 nodes, running Cassandra 2.1.1. I start 50 threads to
insert data into the cluster; each thread inserts up to about 100 million rows with the same
partition key. After inserting all the data, I start another app with 50 threads to export
all of it to a local file, using a CQL query such as: select * from table where partition_id=xxx (each
partition has about 100 million rows). Unfortunately I fail to export all the data: I
ran the export 3 times, and each time I got a different number of results. If the export
succeeded, I should get the same number of results every time, is that right?

Best Regards

