incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Schemas diverging while dynamically creating CF.
Date Sat, 16 Apr 2011 23:47:23 GMT
There is a known issue with concurrent schema migrations: https://issues.apache.org/jira/browse/CASSANDRA-1391
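
Until that issue is resolved, one workaround is to issue schema changes one at a time and wait for the nodes to agree on a schema version before sending the next one. The following is only a minimal sketch of that idea, not an official recipe. `SchemaChangeSerializer` and `get_schema_versions` are hypothetical names: the callable is assumed to return the schema version ids currently reported by the nodes (it could, for example, wrap the Thrift `describe_schema_versions` call), and `change` is any function that performs a single migration.

```python
import threading
import time


class SchemaChangeSerializer:
    """Runs schema changes one at a time and waits for schema agreement."""

    def __init__(self, get_schema_versions, poll_interval=0.5, timeout=30.0):
        # get_schema_versions is a hypothetical callable returning an iterable
        # of the schema version ids currently reported by the nodes.
        self._get_schema_versions = get_schema_versions
        self._poll_interval = poll_interval
        self._timeout = timeout
        self._lock = threading.Lock()

    def _wait_for_agreement(self):
        # Poll until every node reports the same version, or give up.
        deadline = time.monotonic() + self._timeout
        while True:
            if len(set(self._get_schema_versions())) == 1:
                return
            if time.monotonic() >= deadline:
                raise TimeoutError("schema versions did not converge")
            time.sleep(self._poll_interval)

    def apply(self, change):
        # The lock allows only one migration in flight from this process.
        with self._lock:
            self._wait_for_agreement()   # start from an agreed schema
            result = change()            # e.g. create one column family
            self._wait_for_agreement()   # let the change propagate
            return result
```

Note that the lock only serializes changes issued from a single process. If several clients can change the schema, they would all need to go through one such coordinator.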

Once they diverge, I think you can delete the schema by removing the necessary system files
and leaving the data files in place, then re-creating the files.

And yes, you should not be creating lots of column families; they are not the same as tables.
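
As an illustration of the alternative Dan describes in the quoted message below (a single column family plus an inverted index on a 'user' column), here is a small in-memory sketch. It does not use Cassandra at all: `DocumentStore`, `add` and `for_user` are hypothetical names, and the two dictionaries only stand in for one column family of documents and one index keyed by user.

```python
from collections import defaultdict


class DocumentStore:
    """In-memory stand-in for one column family shared by all users."""

    def __init__(self):
        self.documents = {}              # row key (doc id) -> columns
        self.by_user = defaultdict(set)  # inverted index: user -> doc ids

    def add(self, doc_id, user, body):
        # If the document already exists, drop its old index entry first.
        old = self.documents.get(doc_id)
        if old is not None:
            self.by_user[old["user"]].discard(doc_id)
        self.documents[doc_id] = {"user": user, "body": body}
        self.by_user[user].add(doc_id)

    def for_user(self, user):
        # Look up doc ids through the index, then fetch the rows.
        return [self.documents[d] for d in sorted(self.by_user.get(user, ()))]
```

Adding a new user here is just another write. No schema change is involved, so the concurrent-migration problem never comes up.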


Aaron

On 16 Apr 2011, at 09:13, Alejandro Perez wrote:

> Thanks for the quick response! I will reconsider the schema.
> 
> However, the problem still troubles me. How are schema changes supposed to be done?
> Should I serialize them? Should I halt other cluster operations while I do the schema change?
> Is this a known problem with Cassandra?
> 
> The other question, and I think the more important one for me now: how do I repair the
> cluster without losing data once the schemas diverge? Right now the only way I have is to erase
> all data and have the cluster start empty. Should this problem ever happen in production,
> it's important that there's a way to recover the data.
> 
> On Fri, Apr 15, 2011 at 1:57 PM, Dan Hendry <dan.hendry.junk@gmail.com> wrote:
> Uh... don’t create a column family per user. Column families are meant to be fairly
> static; conceptually equivalent to a table in a relational database. Why do you need (or even
> want) a CF per user? Reconsider your data model; a single column family with an inverted index
> for a ‘user’ column is probably more what you are looking for. Operationally, the fewer
> CFs the better.
> 
>  
> Dan
> 
>  
> From: Alejandro Perez [mailto:spike@indextank.com] 
> Sent: April-15-11 16:39
> To: user@cassandra.apache.org
> Cc: Support
> Subject: Schemas diverging while dynamically creating CF.
> 
>  
> Hello,
> 
>  
> We're testing Cassandra for integration with IndexTank. In this first try, we're creating
> one column family for each user. In practice, on the first run and for the first few documents
> (a few hundred), a new CF is created, and a document is immediately added to it. A few (up to
> 50) requests of this type are issued in parallel (for different column families).
> 
>  
> The end result, which is quite repeatable, is that the cluster splits into nodes with different
> schema versions, and they never agree.
> 
>  
> Any thoughts?
> 
>  
>  
> Thanks,
> 
>  
> Spike.
> 
> 
> --
> 
> Alejandro Perez
> IndexTank
> 
> follow us @indextank | read our blog | subscribe our user mailing list

