incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Many creation/inserts in parallel
Date Mon, 29 Apr 2013 20:00:34 GMT
> About 80% of these CFs should be truncated every day and if we decrease many CF by creating
> one key field in one CF, a huge amount of tombstones will appear.
> 
> 
Truncation requires that all nodes be available, so if you are doing it each day you may run
into trouble if a node is down.

Have you looked at using the column TTL setting? You could set this to 24 hours. Cassandra
would first stop returning the data once the TTL expires, and then purge it during compaction
after TTL + gc_grace_seconds has passed.
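
To make the timing concrete, here is a minimal sketch of the TTL arithmetic and of a CQL3 insert that carries a TTL. The table name `events`, the column names, and both helper functions are hypothetical and only for illustration. 864000 seconds (10 days) is Cassandra's default gc_grace_seconds, but your column families may use a different value, and the actual purge only happens when a compaction touches the data after that point.

```python
from datetime import datetime, timedelta, timezone

# Cassandra's default gc_grace_seconds (10 days); a CF may override it.
DEFAULT_GC_GRACE_SECONDS = 864000


def ttl_timeline(written_at, ttl_seconds, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS):
    """Return (expires_at, purgeable_at) for a column written with a TTL.

    expires_at:   reads stop returning the column from this time on.
    purgeable_at: earliest time a compaction may drop it from disk.
    """
    expires_at = written_at + timedelta(seconds=ttl_seconds)
    purgeable_at = expires_at + timedelta(seconds=gc_grace_seconds)
    return expires_at, purgeable_at


def insert_with_ttl(table, columns, ttl_seconds):
    """Build a CQL3 INSERT with '?' bind markers and a USING TTL clause.

    Hypothetical helper: it only builds the statement text, it does not
    talk to a cluster.
    """
    names = ", ".join(columns)
    markers = ", ".join("?" for _ in columns)
    return "INSERT INTO {0} ({1}) VALUES ({2}) USING TTL {3}".format(
        table, names, markers, int(ttl_seconds)
    )


if __name__ == "__main__":
    written = datetime(2013, 4, 29, 20, 0, tzinfo=timezone.utc)
    expires, purgeable = ttl_timeline(written, ttl_seconds=24 * 3600)
    print("hidden from reads at:", expires.isoformat())
    print("purgeable by compaction from:", purgeable.isoformat())
    print(insert_with_ttl("events", ["id", "payload"], 24 * 3600))
```

With a 24 hour TTL the data disappears from reads after a day, without any truncate and without needing every node to be up at that moment.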

> 2) Tables appear with delay. Driver switches connections by Round-Robin. I think that
> CF was created in one node and after a moment the data was inserted in another node. And schema
> doesn't have time to synchronize.
> 
> 
I *think* that is possible. Schema migrations are not acknowledged by all nodes before
returning to the client. Can you try adding a delay before using the schema?
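
Rather than a fixed sleep, one option is to poll until every node reports the same schema version. The sketch below is only an illustration of that idea: `wait_for_schema_agreement` and the `get_schema_versions` callable are hypothetical, and it is an assumption that your driver gives you a way to collect per-node schema versions (for example via the Thrift `describe_schema_versions` call, or the `schema_version` column in the `system.local` and `system.peers` tables).

```python
import time


def wait_for_schema_agreement(get_schema_versions, timeout=10.0, interval=0.2,
                              sleep=time.sleep, clock=time.monotonic):
    """Poll until all nodes report one schema version, or until timeout.

    get_schema_versions: hypothetical callable returning an iterable with
    one schema version string per node. Returns True on agreement and
    False if the timeout passes first.
    """
    deadline = clock() + timeout
    while True:
        versions = set(get_schema_versions())
        if len(versions) == 1:
            return True   # every node reports the same version
        if clock() >= deadline:
            return False  # still disagreeing, give up
        sleep(interval)


if __name__ == "__main__":
    # Simulated cluster: two polls disagree, the third one agrees.
    polls = iter([["v1", "v2", "v2"], ["v1", "v2", "v2"], ["v2", "v2", "v2"]])
    agreed = wait_for_schema_agreement(lambda: next(polls), timeout=5.0, interval=0.01)
    print("schema agreed:", agreed)
```

The client would call this after each CREATE and only start inserting once it returns True. With 100 threads creating tables at once, it may also help to run all schema changes through a single connection.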

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 29/04/2013, at 7:33 PM, Sasha Yanushkevich <yanush51@gmail.com> wrote:

> 1) We’ve tested 100 threads in parallel and each thread created 10 tables. I think
> we will change our data model, but another problem may occur. About 80% of these CFs should
> be truncated every day and if we decrease many CF by creating one key field in one CF, a huge
> amount of tombstones will appear. What do you think about it?
> 2) Tables appear with delay. Driver switches connections by Round-Robin. I think that
> CF was created in one node and after a moment the data was inserted in another node. And schema
> doesn't have time to synchronize.
> 
> 
> 
> 2013/4/28 aaron morton <aaron@thelastpickle.com>
>> At first many CF are being created in parallel (about 1000 CF).
>> 
>> 
> Can you explain this in a bit more detail? By in parallel do you mean multiple threads
> creating CFs at the same time?
> 
> I would also recommend taking a second look at your data model, you probably do not want
> to create so many CFs.
> 
> 
>>  During tests we're receiving some exceptions from driver, e.g.:
>> 
>> 
> 
> The CF you are trying to read / write from does not exist. Check if the table exists
> using cqlsh / cassandra-cli.
> 
> Check your code to make sure it was created. 
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 26/04/2013, at 10:49 PM, Sasha Yanushkevich <yanush51@gmail.com> wrote:
> 
>> Hi All
>> 
>> We are testing Cassandra 1.2.3 (3 nodes with RF:2) with FluentCassandra driver. At
>> first many CF are being created in parallel (about 1000 CF). After creation is done, many
>> insertions of small amounts of data into the DB follow. During tests we're receiving some
>> exceptions from driver, e.g.:
>> 
>> FluentCassandra.Operations.CassandraOperationException: unconfigured columnfamily table_78_9
>> and
>> FluentCassandra.Operations.CassandraOperationException: Connection to Cassandra has timed out
>> 
>> Though in Cassandra's logs there are no exceptions.
>> 
>> What should we do to handle these exceptions?
>> 
>> -- 
>> Best regards,
>> Alexander
> 
> 
> 
> 
> -- 
> Best regards,
> Alexander

