incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: rename column family
Date Mon, 14 Feb 2011 01:32:19 GMT
I forgot to add this to the end of my previous message: you could take that approach, but it's
not what CFs are designed for. Deletes are relatively cheap compared to MySQL etc. because
most of the work is done during compaction.

My first approach would be to use row keys with prefixes, switch at the application level,
run deletion jobs in the background, and tune GCGraceSeconds and repairs to manage disk space.
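
To illustrate the prefix idea, here is a minimal sketch in Python. It uses a plain dict as a stand-in for a column family, so no real Cassandra client calls are shown; the names `store`, `active_prefix`, `row_key`, `switch_to` and `purge` are hypothetical and only for illustration.

```python
# Hypothetical in-memory stand-in for a column family: row key -> columns.
store = {}
active_prefix = "load1"  # the load generation that readers currently use


def row_key(prefix, key):
    """Build a prefixed row key, e.g. 'load1:user42'."""
    return f"{prefix}:{key}"


def write(prefix, key, columns):
    """Write a row under an explicit prefix (used by the import job)."""
    store[row_key(prefix, key)] = dict(columns)


def read(key):
    """Read a row through whichever prefix is currently active."""
    return store.get(row_key(active_prefix, key))


def switch_to(prefix):
    """Application-level switch: point readers at the new load."""
    global active_prefix
    old, active_prefix = active_prefix, prefix
    return old


def purge(prefix):
    """Background deletion job: remove every row under a stale prefix."""
    stale = [k for k in store if k.startswith(prefix + ":")]
    for k in stale:
        del store[k]
    return len(stale)


write("load1", "user42", {"name": "old"})
write("load2", "user42", {"name": "new"})  # new import, not yet visible
before = read("user42")
old_prefix = switch_to("load2")            # flip once the import is complete
after = read("user42")
removed = purge(old_prefix)                # clean up at leisure
```

In a real cluster the purge would be a series of row deletes, and the disk space would only come back after compaction.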

Depending on what your data is, you may find the ingest process is easier, e.g.
- there is no update call in Cassandra, so you may be able to do no-look inserts
- overwriting existing data with the same value will mean it is stored twice, but the
duplicate will be removed during compaction
- you can issue a no-look delete for a row; if the row is not there, there will be some
overhead (a tombstone will be kept). Again, it will be removed during compaction.
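
The points above can be sketched with a toy model in Python. This is a simplification under stated assumptions, not how Cassandra is implemented: the class `Row`, the `TOMBSTONE` marker and the `compact` method are hypothetical, timestamps are plain integers in seconds, ties on equal timestamps simply keep the existing cell, and tombstone expiry is computed from the write timestamp. The `GC_GRACE` value is only an example.

```python
TOMBSTONE = object()  # marker standing in for a deleted column


class Row:
    """Toy row: column name -> (timestamp, value or TOMBSTONE)."""

    def __init__(self):
        self.cells = {}

    def _apply(self, name, value, ts):
        # The write with the newer timestamp wins; there is no read first.
        current = self.cells.get(name)
        if current is None or ts > current[0]:
            self.cells[name] = (ts, value)

    def insert(self, name, value, ts):
        """No-look insert: also serves as an update."""
        self._apply(name, value, ts)

    def delete(self, name, ts):
        """No-look delete: records a tombstone even if nothing was there."""
        self._apply(name, TOMBSTONE, ts)

    def get(self, name):
        cell = self.cells.get(name)
        if cell is None or cell[1] is TOMBSTONE:
            return None
        return cell[1]

    def compact(self, now, gc_grace):
        """Drop tombstones that are older than the grace period."""
        self.cells = {
            n: c for n, c in self.cells.items()
            if not (c[1] is TOMBSTONE and c[0] + gc_grace <= now)
        }


GC_GRACE = 864000  # example grace period in seconds (assumption)

row = Row()
row.insert("email", "a@example.com", ts=1)
row.insert("email", "a@example.com", ts=2)  # same value again, no read needed
row.delete("phone", ts=3)                   # column never existed
cells_before = len(row.cells)               # email cell plus the tombstone
row.compact(now=3 + GC_GRACE, gc_grace=GC_GRACE)
cells_after = len(row.cells)                # tombstone has been dropped
```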

Also, if you know the maximum timestamp of the previous load job you could try this trick
to do a bulk delete http://www.mail-archive.com/user@cassandra.apache.org/msg09576.html
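
The linked message is not reproduced here. One reading of it, assumed for the sketch below, is that a row delete carrying an explicit timestamp hides every column written at or before that timestamp, while columns written later by the new load stay visible. The function `visible_columns` and the sample data are hypothetical, and the model is plain Python, not a client API.

```python
def visible_columns(columns, row_deleted_at):
    """Return the columns that survive a row delete at `row_deleted_at`.

    columns: column name -> (timestamp, value)
    A column is shadowed when its timestamp <= the row delete timestamp.
    """
    return {
        name: value
        for name, (ts, value) in columns.items()
        if row_deleted_at is None or ts > row_deleted_at
    }


previous_load_max_ts = 1000  # highest timestamp used by the previous load job

columns = {
    "price": (990, "10.00"),  # written by the previous load
    "stock": (1500, "7"),     # written by the new load
}

# Delete the row "as of" the previous load's maximum timestamp.
live = visible_columns(columns, row_deleted_at=previous_load_max_ts)
```

Only the `stock` column remains in `live`, because it was written after the delete timestamp.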

Again, data is cleaned up as part of compaction. 

Your mileage may vary.

Aaron



 
On 14 Feb, 2011, at 08:40 AM, Aaron Morton <aaron@thelastpickle.com> wrote:

There are functions on the Cassandra API to rename and drop column families, see
http://wiki.apache.org/cassandra/API. Dropping a CF does not immediately free up the disk
space; see the docs.

AFAIK the rename is not atomic across the cluster (that would require locks), so your best bet
would be to switch to a new CF in your code.

Reads and writes in Cassandra compete for resources (CPU and disk), but they will not block
each other as there is no locking system. You may find the performance acceptable; if not,
just add more machines :)

Switching CFs may be a valid way to handle bulk deletes, like horizontal partitions
in MS SQL and MySQL. Obviously it will depend on how much data you have and how much capacity
you have.

Let us know how you get on.

Cheers
Aaron

On 11/02/2011, at 11:33 AM, Karl Hiramoto <karl@hiramoto.org> wrote:

> On 02/10/11 22:19, Aaron Morton wrote:
>> That should read "Without more information"
>> 
>> A
>> On 11 Feb, 2011, at 10:15 AM, Aaron Morton <aaron@thelastpickle.com> wrote:
>> 
>>> With more information I'd say this is not a good idea.
>>> 
>>> I would suggest looking at why you do the table switch in the MySQL
>>> version and consider if it's still necessary in the Cassandra version.
>>> 
> I do the table switch because it's the fastest way to rebuild an entire
> dataset. Say you're importing a flat CSV file; you have various cases:
> 
> 1. Exact same data loaded, only update timestamp.
> 2. New data that was not in previous dataset.
> 3. Changed data from previous dataset (update).
> 4. Data that is not in new data, but is in old (delete).
> 
> Rebuilding the entire table saves millions of search/delete operations.
> 
> 
> In MySQL, reading and writing the table at the same time (many millions of
> rows, many GB of data) slows things down beyond my strict performance
> requirements; doing the table rename makes both the reads and writes much
> faster. Yes, I know this probably doesn't apply to Cassandra.
> 
> If Cassandra could do something like the MySQL rename it would avoid
> having to do the deletes on individual rows, or the repair/compaction of
> the column family to remove all the stale data. Disk space usage is
> also very important. I know after a new import is complete, all the
> old data is stale.
> 
>>> Could you use prefixes in your keys that the app knows about and
>>> switch those?
> Yes, but that makes the app more complex, and it needs to know when the
> data is consistent after the import. I think I would have to do a range
> scan to delete all the stale data.
> 
> A TTL would be risky as a TTL too high would waste disk space, and stale
> data would be around longer than wanted. A TTL too low would risk not
> having data available if a new import should fail, or be delayed.
> 
> 
> --
> Karl
