incubator-cassandra-user mailing list archives

From Emalayan Vairavanathan <svemala...@yahoo.com>
Subject Re: Creating namespace and column family from multiple nodes concurrently
Date Fri, 24 May 2013 05:14:27 GMT
I am sorry if I was not clear. I was using "nodes" to refer to machines (and vice versa).

Let me put it another way...

The application is composed of multiple instances of an executable. The application runs on
multiple machines concurrently. All the instances are going to issue the same CQL commands
and try to create exactly the same namespace and column families.

Thank you
Emalayan


________________________________
 From: Arthur Zubarev <Arthur.Zubarev@Aol.com>
To: Emalayan Vairavanathan <svemalayan@yahoo.com>; user@cassandra.apache.org 
Sent: Thursday, 23 May 2013 1:15 PM
Subject: Re: Creating namespace and column family from multiple nodes concurrently
 


So where are the multiple nodes? I am just puzzled.
From: Emalayan Vairavanathan 
Sent: Thursday, May 23, 2013 3:43 PM
To: Arthur Zubarev ; user@cassandra.apache.org 
Subject: Re: Creating namespace and column family from multiple nodes concurrently

"Would each device/machine have its own keyspace?"
 
No. All the machines are going to run exactly the same CQL commands and are going to
create the same namespace and column families.

Thank you
Emalayan
 

________________________________
 From: Arthur Zubarev <Arthur.Zubarev@Aol.com>
To: Emalayan Vairavanathan <svemalayan@yahoo.com>; user@cassandra.apache.org
Sent: Thursday, 23 May 2013 12:20 PM
Subject: Re: Creating namespace and column family from multiple nodes concurrently

 
Would each device/machine have its own keyspace?
 
Basically, your client needs to take care of confirming that the schema was created
successfully, along with any other verifications, and that is going to be time consuming.
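
As an illustration of the client-side verification mentioned above, here is a minimal sketch of polling until the schema is visible or a timeout expires. The `schema_exists` callable is hypothetical and stands in for whatever check the client library offers; the timeout and interval values are assumptions, not recommendations from this thread.

```python
import time


def wait_for_schema(schema_exists, timeout_s=30.0, interval_s=0.5,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until schema_exists() returns True or timeout_s elapses.

    schema_exists is a hypothetical callable supplied by the caller.
    sleep and clock are injectable so the loop can be exercised without waiting.
    Returns True if the schema became visible, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if schema_exists():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

An instance would call this after issuing its CREATE statements and before starting normal reads and writes.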
From: Emalayan Vairavanathan 
Sent: Thursday, May 23, 2013 3:07 PM
To: user@cassandra.apache.org 
Subject: Re: Creating namespace and column family from multiple nodes concurrently

Hi Arthur and Farraz,

Thank you for getting back to me.

I am trying to avoid synchronization among concurrent instances, and this is why I prefer
Option - 2. Further, in my application I have a reasonable window between the application
initialization phase and the application runtime. So as long as Cassandra can safely handle
concurrent creation, I should be fine.

Do you have any idea how Cassandra is going to handle concurrent namespace and column
family creation (here all the instances are going to create the same namespace and column
families concurrently)?

- Does Cassandra take much time to agree on a final schema (in case Cassandra is using some
  sort of exponential back-off algorithm to handle schema conflicts)?
- Or is it going to result in schema conflicts which need manual intervention?
- Or will this result in race conditions?
- Or some other issues, e.g. memory / CPU / network bottlenecks?

Thank you
Emalayan
 

________________________________
 From: Arthur Zubarev <arthur.zubarev@aol.com>
To: user@cassandra.apache.org; svemalayan@yahoo.com
Sent: Wednesday, 22 May 2013 8:07 PM
Subject: Re: Creating namespace and column family from multiple nodes concurrently

 
I am assuming here you want to sync all the 100s of nodes once the application is
airborne. I suspect this would flood the network and even potentially affect the
machine itself memory-wise. How are you going to maintain the nodes
(compaction+repair)?
 
Regards,

Arthur


 
 
-----Original Message-----
From: Emalayan Vairavanathan <svemalayan@yahoo.com>
To: user <user@cassandra.apache.org>
Sent: Wed, May 22, 2013 8:31 pm
Subject: Creating namespace and column family from multiple nodes concurrently


Hi all,
 
I am implementing a distributed application which runs on 100s of machines
concurrently. This application is going to use Cassandra as its underlying
storage.

The application creates the schema (namespace and column families) during the
initialization phase. It seems I have two options to create the schema.

Option - 1: Using a single node for schema creation.

Option - 2: Having all the nodes (> 100) run the same schema creation
logic (first, nodes will check whether the schema is already available and then
try to create the schema if it is not already available).
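
The check-then-create logic of Option - 2 could be sketched roughly as follows. This is only the client-side shape and says nothing about how Cassandra 1.2.2 reconciles concurrent creates on the server side. The `AlreadyExists` exception and the `schema_exists` / `create_schema` callables are hypothetical placeholders for whatever the chosen client library provides.

```python
class AlreadyExists(Exception):
    """Hypothetical error raised when the keyspace or column family already exists."""


def ensure_schema(schema_exists, create_schema):
    """Check first, then create, tolerating a lost race with another instance.

    schema_exists: hypothetical callable, True if the schema is already visible.
    create_schema: hypothetical callable that issues the CREATE statements and
                   may raise AlreadyExists.
    Returns "present", "created", or "lost_race".
    """
    if schema_exists():
        return "present"
    try:
        create_schema()
        return "created"
    except AlreadyExists:
        # Another instance created the schema between our check and our create.
        return "lost_race"
```

Every instance would call `ensure_schema` during initialization and treat all three return values as success. The gap between the check and the create is where two instances can overlap, which is the case the question is about.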
 
To keep the initialization phase simple, I prefer to go for Option - 2. However, I
am not sure how Cassandra is going to behave if multiple nodes try to create the
same schema (namespace and column families) concurrently. It would be nice if
someone could tell me about the implications of Option - 2 with Cassandra version
1.2.2.

Please let me know if you have any questions.

Thank you
VE