While reading we are planning to use a CL of Quorum. So, we are hoping we
will not hit any consistency issues before repair is run.
There is a chance of inconsistencies if fewer than QUORUM nodes were involved in the load for any given row. Assuming RF 3, if you have two adjacent nodes down then you have lost QUORUM for the rows they share. Otherwise you should be OK.
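
To make the QUORUM arithmetic concrete, here is a rough sketch in Python. It assumes the simplest placement, where each range is stored on the node that owns it plus the next RF-1 nodes around the ring; the node numbering and the function names are made up for illustration and are not part of Cassandra.

```python
def quorum(rf):
    """Replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return rf // 2 + 1


def ranges_without_quorum(n_nodes, rf, down):
    """Return the ranges (named by their first replica) that cannot reach QUORUM.

    Assumption: range i is stored on nodes i, i+1, ..., i+rf-1 (mod n_nodes).
    """
    lost = []
    for start in range(n_nodes):
        replicas = [(start + i) % n_nodes for i in range(rf)]
        up = sum(1 for node in replicas if node not in down)
        if up < quorum(rf):
            lost.append(start)
    return lost


if __name__ == "__main__":
    print(quorum(3))                             # QUORUM for RF 3
    print(ranges_without_quorum(10, 3, {4}))     # one node down
    print(ranges_without_quorum(10, 3, {4, 5}))  # two adjacent nodes down
```

With RF 3 a single down node still leaves two of three replicas for every range, while two adjacent down nodes leave some ranges with only one live replica.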

One thing I forgot to mention: you may get some value from increasing phi_convict_threshold to 16, either in the yaml or via the org.apache.cassandra.net:type=FailureDetector MBean. This will make it harder for a node to be ejected from the cluster; you may want to turn it back to somewhere between 8 and 12 after the bulk load.

Do you see any
better way of doing this?
In the case with 3 nodes and RF 3, put a copy of the SSTables on each node and use nodetool refresh. That will almost instantly add them to the nodes. You still have to handle the issues with down nodes in the same way as when using the bulk loader.

Or run 10 million writes into the system. 

Cheers



-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton

On 6/05/2013, at 12:34 PM, praveen.akunuru@wipro.com wrote:

Hi Aaron, Rob,

Thank you for your responses. Sorry about the delay in getting back to
you. To answer your questions:

"Is this a once off data load or something you need to do regularly?"

This will be a regular load. We will have to load 10 million records
once every 2 hours on our Production Cluster.

3 node cluster with CL 3 is our development environment. Our production
will be a 10 node cluster with CL 3. We will not be able to use
nodetool refresh there. Sorry, I should have been more clear.

At the moment, we are doing the below in the job shell script to work
around this:

1. At the start of the SSTable load job, the script checks if any of the
nodes are down.
2. If any node is down, it runs SSTableloader with '-I' option. If all
nodes are up, it runs the load normally.
3. In case any node goes down during the load and the load job fails, the
script restarts the job from the beginning. This time it will use the -I
option.
4. We are planning to schedule nodetool repair once every day to handle
these situations.
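
The retry logic in steps 1 to 3 could be sketched roughly as below, in Python rather than shell. This is only an illustration under assumptions: find_down_nodes and run are hypothetical callbacks supplied by the caller (for example, something that parses nodetool ring output and something that executes the command), max_attempts is an invented parameter, and the long option name --ignore is assumed to be the ignore-nodes option referred to above as '-I'; check it against the help output of your sstableloader version.

```python
from typing import Callable, List, Sequence


def build_loader_command(sstable_dir: str, down_nodes: Sequence[str]) -> List[str]:
    """Build the sstableloader command line, skipping any nodes that are down."""
    cmd = ["sstableloader"]
    if down_nodes:
        # Assumed option name; takes a comma separated list of nodes to skip.
        cmd += ["--ignore", ",".join(down_nodes)]
    cmd.append(sstable_dir)
    return cmd


def run_load(sstable_dir: str,
             find_down_nodes: Callable[[], List[str]],
             run: Callable[[List[str]], int],
             max_attempts: int = 3) -> bool:
    """Re-check node state before each attempt and restart the load on failure."""
    for _ in range(max_attempts):
        down = find_down_nodes()
        exit_code = run(build_loader_command(sstable_dir, down))
        if exit_code == 0:
            return True
    return False
```

Any node skipped this way misses the data until the scheduled repair (step 4) has run.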

While reading we are planning to use a CL of Quorum. So, we are hoping we
will not hit any consistency issues before repair is run. Do you see any
better way of doing this?

Thanks & Best Regards,
Praveen





From: aaron morton <aaron@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, April 30, 2013 1:47 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: How to use Write Consistency 'ANY' with SSTABLELOADER - DSE
Cassandra 1.1.9

One option you have with RF3 and 3 Nodes is to place a copy of all the
SSTables on each node and use nodetool refresh to directly load the
sstables into the node without any streaming.

1. Please can anyone suggest how we can enforce Write Consistency level
when using SSTABLELOADER?
Bulk Loader does not use CL, it's more like a repair / bootstrap.
If you have to skip a node then use repair.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 30/04/2013, at 1:05 AM, praveen.akunuru@wipro.com wrote:

Hi All,

We have a requirement to load approximately 10 million records, each
record with approximately 100 columns. We are planning to use the
Bulk-loader program to convert the data into SSTables and then load them
using SSTABLELOADER.

Everything is working fine when all nodes are up and running and the
performance is very good. However, when a node is down, the streaming
fails and the operation stops. We have to run the SSTABLELOADER with
option 'I' to exclude the node that is down. I was wondering if we can
enforce Consistency level of 'ANY' with SSTABLELOADER as well.

We tried specifying the consistency level 'ANY' at Keyspace level.
However, this is not being used by the SSTABLELOADER. It is still looking
for all the nodes to be available.

1. Please can anyone suggest how we can enforce Write Consistency level
when using SSTABLELOADER?

2. Will Sqoop be a good option in these scenarios? Do we have any
performance stats generated while loading data into Cassandra with Sqoop?

Environment:

Cassandra 1.1.9 provided as part of DSE 3.0
3 Nodes
Replication Factor 3
Consistency Level ANY

Regards,
Praveen

Wipro Limited (Company Regn No in UK - FC 019088)
Address: Level 2, West wing, 3 Sheldon Square, London W2 6PS, United
Kingdom. Tel +44 20 7432 8500 Fax: +44 20 7286 5703

VAT Number: 563 1964 27

(Branch of Wipro Limited (Incorporated in India at Bangalore with limited
liability vide Reg no L99999KA1945PLC02800 with Registrar of Companies at
Bangalore, India. Authorized share capital: Rs 5550 mn))

Please do not print this email unless it is absolutely necessary.

The information contained in this electronic message and any attachments
to this message are intended for the exclusive use of the addressee(s) and
may contain proprietary, confidential or privileged information. If you
are not the intended recipient, you should not disseminate, distribute or
copy this e-mail. Please notify the sender immediately and destroy all
copies of this message and any attachments.

WARNING: Computer viruses can be transmitted via email. The recipient
should check this email and any attachments for the presence of viruses.
The company accepts no liability for any damage caused by any virus
transmitted by this email.

www.wipro.com


