cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: SS Table File Names not containing GUIDs
Date Tue, 17 May 2016 14:45:35 GMT
Hi,

> I am wondering if there is any reason as to why the SS Table format doesn’t
> have a GUID


I don't know for sure, but what I can say is that GUIDs are often used to
get unique identifiers on distributed systems, where a simple incrementing
counter does not work. SSTables are stored on one node, so an incrementing
number works. So I would say this worked and was straightforward. That is
probably the reason. Plus, sstable names / paths are long enough already; I
would rather see '241' in there than
'c0629566-4a15-4db2-bb97-ee6e083de32b'.

> Specifically, this causes some inconvenience when restoring snapshots.


This is true. That said, in 5 years of using Cassandra I have restored a
snapshot maybe twice: once to feed staging (empty, so no issue) and once to
test recovery. So it is not that often.

> The problem is it is possible to overwrite new data with old files if the
> file names match. I can’t change the file names of snapshot-ed file to a
> huge number, because as soon as that file is copied over, C* will use that
> number in its get-next-number-gen logic potentially causing the same
> problem for the next snapshot-ed file.


What about using a lower value? Also, if your value is really greater than
the current one, the risk is low, since tables are compacted often enough.
There are many relatively easy, working workarounds here, I believe. I
don't remember how I solved this, though.
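
One possible guard against the overwrite, as a minimal sketch only: before
copying snapshot files into a live table directory, list the file names that
exist in both places. The function name and the example paths are
hypothetical, and this only compares names; it does not parse or renumber
SSTable generations.

```python
from pathlib import Path


def colliding_names(snapshot_dir, live_dir):
    """Return the sorted file names that exist in both directories.

    Any name returned here would be overwritten in live_dir if the
    snapshot files were copied over as-is.
    """
    snapshot_files = {p.name for p in Path(snapshot_dir).iterdir() if p.is_file()}
    live_files = {p.name for p in Path(live_dir).iterdir() if p.is_file()}
    return sorted(snapshot_files & live_files)


# Hypothetical usage; both paths are placeholders, not real defaults:
# clashes = colliding_names("/backups/my_table_snapshot", "/data/my_ks/my_table")
# if clashes:
#     print("Do not copy yet, these names already exist:", clashes)
```

If the list is empty, a plain copy cannot replace an existing file by name.
If it is not, the clashing snapshot files need a different number first.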

I would say I do not agree that we need to use GUIDs, but that is just my
opinion. If you feel this could be an improvement, search for a ticket
about it or file a new one.

C*heers,
-----------------------
Alain Rodriguez - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-05-02 18:55 GMT+02:00 Anubhav Kale <Anubhav.Kale@microsoft.com>:

> Hello,
>
>
>
> I am wondering if there is any reason as to why the SS Table format
> doesn’t have a GUID. As far as I can tell, the incrementing number isn’t
> really used for any special purpose in code, and having a unique name for
> the file seems to be a better thing, in general.
>
>
>
> Specifically, this causes some inconvenience when restoring snapshots.
> Ideally, I would like to restore just the system* keyspaces and boot the
> node. Then, once the node is taking live traffic copy the SS Tables over
> and do a DSE restart at the end to load old data.
>
>
>
> The problem is it is possible to overwrite new data with old files if the
> file names match. I can’t change the file names of snapshot-ed file to a
> huge number, because as soon as that file is copied over, C* will use that
> number in its get-next-number-gen logic potentially causing the same
> problem for the next snapshot-ed file.
>
>
>
> How do people usually tackle this ? Is there some easy solution that I am
> not seeing ?
>
>
>
> Thanks !
>
