cassandra-user mailing list archives

From Karl Hiramoto <k...@hiramoto.org>
Subject Re: rename column family
Date Thu, 10 Feb 2011 22:33:49 GMT
On 02/10/11 22:19, Aaron Morton wrote:
> That should read "Without more information"
>
> A
> On 11 Feb, 2011,at 10:15 AM, Aaron Morton <aaron@thelastpickle.com> wrote:
>
>> With more information I'd say this is not a good idea.
>>
>> I would suggest looking at why you do the table switch in the MySql
>> version and consider if it's still necessary in the Cassandra version.
>>
I do the table switch because it's the fastest way to rebuild an entire
dataset. Say you're importing a flat CSV file; there are several cases:

1.  Exact same data loaded; only the timestamp is updated.
2.  New data that was not in the previous dataset.
3.  Changed data from the previous dataset (update).
4.  Data that is in the old dataset but not in the new one (delete).

Rebuilding the entire table saves millions of search/delete operations.
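
To make the four cases above concrete, here is a minimal sketch in
Python. It assumes the old and new datasets each fit in a plain dict of
key -> value, which is only for illustration; the function name
classify_import and the sample data are made up and are not part of any
MySQL or Cassandra API.

```python
def classify_import(old, new):
    """Sort keys into the four import cases, comparing old and new datasets."""
    return {
        # Case 1: same key, same value; only the timestamp would change.
        "unchanged": [k for k in new if k in old and old[k] == new[k]],
        # Case 2: key appears only in the new dataset.
        "added": [k for k in new if k not in old],
        # Case 3: key in both datasets, but the value differs (update).
        "changed": [k for k in new if k in old and old[k] != new[k]],
        # Case 4: key appears only in the old dataset (delete).
        "deleted": [k for k in old if k not in new],
    }


# Hypothetical sample data.
old = {"a": 1, "b": 2, "c": 3}
new = {"a": 1, "b": 20, "d": 4}
result = classify_import(old, new)
```

Case 4 is the expensive one: finding the "deleted" keys means checking
every old key against the new data, which is the search/delete work a
full rebuild avoids.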


In MySQL, reading and writing the table at the same time (many millions
of rows, many GB of data) slows things down beyond my strict performance
requirements. Doing the table rename makes both the reads and the writes
much faster. Yes, I know this probably doesn't apply to Cassandra.

If Cassandra could do something like the MySQL rename, it would avoid
having to do the deletes on individual rows, or the repair/compaction of
the column family to remove all the stale data. Disk space usage is
also very important. I know that once a new import is complete, all the
old data is stale.

>> Could you use prefixes in your keys that the app knows about and
>> switch those?
Yes, but that makes the app more complex, and the app needs to know when
the data is consistent after the import. I think I would have to do a
range scan to delete all the stale data.
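
A rough sketch of what the prefix approach might look like, in Python.
A plain dict stands in for the column family, and every name here (the
pointer key, the generation labels "g1"/"g2", the helper functions) is
hypothetical, not a Cassandra API. The scan in purge_stale is the
in-memory analogue of the range scan mentioned above.

```python
ACTIVE = "meta:active_generation"  # hypothetical key holding the live prefix


def import_generation(store, generation, rows):
    """Write a full import under its own key prefix, leaving live data alone."""
    for key, value in rows.items():
        store[f"{generation}:{key}"] = value


def activate(store, generation):
    """Switch readers to a generation once its import is known to be complete."""
    store[ACTIVE] = generation


def read(store, key):
    """Look the key up under whichever prefix is currently active."""
    return store.get(f"{store[ACTIVE]}:{key}")


def purge_stale(store):
    """Scan all keys and delete those outside the active prefix."""
    active_prefix = f"{store[ACTIVE]}:"
    stale = [k for k in store if k != ACTIVE and not k.startswith(active_prefix)]
    for k in stale:
        del store[k]
    return len(stale)


store = {}
import_generation(store, "g1", {"a": 1, "b": 2})
activate(store, "g1")
import_generation(store, "g2", {"a": 10, "c": 3})  # readers still see g1
before_switch = read(store, "a")
activate(store, "g2")
after_switch = read(store, "a")
removed = purge_stale(store)
```

The extra complexity shows up in two places: something has to decide
when it is safe to call activate, and the stale prefix still has to be
scanned and deleted afterwards.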

A TTL would be risky: a TTL that is too high would waste disk space, and
stale data would be around longer than wanted. A TTL that is too low
would risk not having data available if a new import fails or is delayed.


--
Karl
