cassandra-user mailing list archives

From Adarsh Kumar <adarsh0...@gmail.com>
Subject Re: Optimal backup strategy
Date Tue, 03 Dec 2019 07:11:32 GMT
Thanks Hossein,

Just one more question: is there any special SOP or consideration we have to
take into account for multi-site backup?

Please share any helpful links, blog posts, or documented steps.

Regards,
Adarsh Kumar

On Sun, Dec 1, 2019 at 10:40 PM Hossein Ghiyasi Mehr <ghiyasimehr@gmail.com>
wrote:

> 1. It's recommended to use the commit log after a single node failure.
> Cassandra has many options, such as the replication factor, as a
> substitute solution.
> 2. Yes, right.
>
> *VafaTech.com - A Total Solution for Data Gathering & Analysis*
>
>
> On Fri, Nov 29, 2019 at 9:33 AM Adarsh Kumar <adarsh0007@gmail.com> wrote:
>
>> Thanks Ahu and Hussein,
>>
>> So my understanding is:
>>
>>    1. Commit log backup is not documented for Apache Cassandra, hence
>>    not standard. But it can be used for a restore on the same machine (by
>>    taking the backup from commit_log_dir). If used on other machine(s),
>>    they have to be in the same topology. Can it be used for a
>>    replacement node?
>>    2. For periodic backup, Snapshot + Incremental backup is the best
>>    option.
>>
>>
>> Thanks,
>> Adarsh Kumar
>>
>> On Fri, Nov 29, 2019 at 7:28 AM guo Maxwell <cclive1601@gmail.com> wrote:
>>
>>> Hossein is right. But for us, we restore to the same Cassandra
>>> topology, so replay is usable. It is also usable when restoring to the
>>> same machine.
>>> Using sstableloader costs too much time and more storage (though the
>>> usage drops after the restore).
>>>
>>> On Thu, Nov 28, 2019 at 7:40 PM Hossein Ghiyasi Mehr
>>> <ghiyasimehr@gmail.com> wrote:
>>>
>>>> A commitlog backup isn't usable on another machine.
>>>> The backup solution depends on what you want to do: periodic backup, or
>>>> backup to restore on another machine?
>>>> Periodic backup is a combination of snapshot and incremental backup.
>>>> Remove the incremental backups after each new snapshot.
>>>> Backup to restore on another machine: you can use a snapshot taken
>>>> after flushing the memtables, or use sstableloader.
>>>>
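As a rough illustration of the pruning rule above (remove incremental
backups once a new snapshot exists), here is a minimal Python sketch. It is
not from the original posters. The BackupFile type, its fields, and the
function name are hypothetical, and it assumes you track when each
incremental file was created and when the latest snapshot was taken.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class BackupFile:
    # Hypothetical record for one file in a table's backups/ directory.
    path: str
    created: datetime


def incrementals_to_remove(files: List[BackupFile],
                           snapshot_time: datetime) -> List[BackupFile]:
    """Return incremental files created at or before the snapshot time.

    Assumption: anything flushed before the snapshot was taken is already
    covered by that snapshot, so those incremental files are redundant.
    """
    return [f for f in files if f.created <= snapshot_time]


if __name__ == "__main__":
    snapshot_time = datetime(2019, 12, 1, 2, 0)
    files = [
        BackupFile("backups/old-Data.db", datetime(2019, 11, 30, 23, 0)),
        BackupFile("backups/new-Data.db", datetime(2019, 12, 1, 9, 0)),
    ]
    for f in incrementals_to_remove(files, snapshot_time):
        print("would remove", f.path)
```

The function only selects candidates. Actually deleting them is safe only
once the new snapshot has been copied off the node, and that check is left
to the caller.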
>>>>
>>>> ----
>>>> VafaTech.com - A Total Solution for Data Gathering & Analysis
>>>>
>>>> On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell <cclive1601@gmail.com>
>>>> wrote:
>>>>
>>>>> In the Cassandra and DataStax documentation, commitlog backup is not
>>>>> mentioned.
>>>>> Only snapshots and incremental backups are described as ways to back up.
>>>>>
>>>>> Although commitlog archival per keyspace/table is not supported,
>>>>> commitlog replay (for which you must put the logs in commitlog_dir and
>>>>> restart the process) does support a keyspace/table replay filter: use
>>>>> -Dcassandra.replayList with the keyspace1.table1,keyspace1.table2
>>>>> format to replay only the specified keyspaces/tables.
>>>>>
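To make the replay filter format above concrete, here is a small Python
sketch (not from the original posters) that builds the JVM option from a
list of keyspace.table names. The helper name is hypothetical. It assumes
the usual -Dname=value JVM syntax and unquoted Cassandra identifiers
(letters, digits, underscores).

```python
import re
from typing import Iterable

# Unquoted Cassandra identifiers: letters, digits and underscores.
_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def replay_list_option(tables: Iterable[str]) -> str:
    """Build a -Dcassandra.replayList option from 'keyspace.table' names."""
    entries = []
    for name in tables:
        keyspace, sep, table = name.partition(".")
        # Reject entries that are not exactly keyspace.table.
        if not sep or not _NAME.match(keyspace) or not _NAME.match(table):
            raise ValueError(f"expected keyspace.table, got {name!r}")
        entries.append(f"{keyspace}.{table}")
    if not entries:
        raise ValueError("at least one keyspace.table entry is required")
    return "-Dcassandra.replayList=" + ",".join(entries)


if __name__ == "__main__":
    print(replay_list_option(["keyspace1.table1", "keyspace1.table2"]))
```

The resulting string would be added to the node's JVM options before the
restart that triggers the replay.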
>>>>> Snapshots do affect storage. We take a snapshot once a week during
>>>>> off-peak hours, and snapshot creation is throttled. You may want to
>>>>> see this issue (https://issues.apache.org/jira/browse/CASSANDRA-13019)
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Nov 28, 2019 at 1:00 AM Adarsh Kumar <adarsh0007@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks Guo and Eric for replying,
>>>>>>
>>>>>> I have some confusions about commit log backup:
>>>>>>
>>>>>>    1. The commit log archival technique (
>>>>>>    https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>>>>>>    ) is as good as an incremental backup, as it also captures commit
>>>>>>    logs after memtable flush.
>>>>>>    2. If we go for "Snapshot + Incremental bk + Commit log", we have
>>>>>>    to take the commit logs from the commit log directory (is there
>>>>>>    any SOP for this?). As commit logs are not per table or keyspace,
>>>>>>    we will have a challenge in restoring selected tables.
>>>>>>    3. Snapshot based backups are easy to manage and operate due to
>>>>>>    their simplicity. But they are heavy on storage. Any views on this?
>>>>>>    4. Please share any successful strategy that someone is using in
>>>>>>    production. We are still in the design phase and want to implement
>>>>>>    the best solution.
>>>>>>
>>>>>> Thanks Eric for sharing the link for Medusa.
>>>>>>
>>>>>> Regards,
>>>>>> Adarsh Kumar
>>>>>>
>>>>>> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell <cclive1601@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> For me, I think the last one,
>>>>>>>  Snapshot + Incremental + commitlog,
>>>>>>> is the most meaningful way to do backup and restore when you back
>>>>>>> the data up to somewhere else, like AWS S3.
>>>>>>>
>>>>>>>    - Snapshot based backup // incremental data will not be backed
>>>>>>>    up, and data may be lost when restoring to a time later than the
>>>>>>>    snapshot time;
>>>>>>>    - Incremental backups // better than snapshot backup, but with
>>>>>>>    insufficient data accuracy, because data remaining in the
>>>>>>>    memtable will be lost;
>>>>>>>    - Snapshot + incremental
>>>>>>>    - Snapshot + commitlog archival // better data precision than
>>>>>>>    incremental backup, but data in non-archived commitlogs (logs
>>>>>>>    not yet closed and archived) will not be restored and will be
>>>>>>>    lost. Also, when there are too many logs, log replay will take a
>>>>>>>    very long time.
>>>>>>>
>>>>>>> As for us, we use snapshot + incremental + commitlog archive. We
>>>>>>> read the snapshot data and the incremental data. The logs are also
>>>>>>> backed up, but we do not back up logs whose data has already been
>>>>>>> flushed to SSTables, because that data is backed up by the
>>>>>>> incremental backup.
>>>>>>>
>>>>>>> This way, the data exists in SSTable format through the snapshot
>>>>>>> backup and the incremental backup. The number of logs stays very
>>>>>>> small, and log replay does not take much time.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Nov 27, 2019 at 4:13 PM Eric LELEU <eric@strapdata.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> The Last Pickle & Spotify have released Medusa as a Cassandra
>>>>>>>> backup tool.
>>>>>>>>
>>>>>>>> See :
>>>>>>>> https://thelastpickle.com/blog/2019/11/05/cassandra-medusa-backup-tool-is-open-source.html
>>>>>>>>
>>>>>>>> Hope this link will help you.
>>>>>>>>
>>>>>>>> Eric
>>>>>>>>
>>>>>>>>
>>>>>>>> On 27/11/2019 at 08:10, Adarsh Kumar wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I was looking into backup strategies for Cassandra. After some
>>>>>>>> study I came to know that there are the following options:
>>>>>>>>
>>>>>>>>    - Snapshot based backup
>>>>>>>>    - Incremental backups
>>>>>>>>    - Snapshot + incremental
>>>>>>>>    - Snapshot + commitlog archival
>>>>>>>>    - Snapshot + Incremental + commitlog
>>>>>>>>
>>>>>>>> Which is the most suitable and feasible approach? Also, which of
>>>>>>>> these is used most?
>>>>>>>> Please let me know if there is any other option or tool available.
>>>>>>>>
>>>>>>>> Thanks in advance.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Adarsh Kumar
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> you are the apple of my eye !
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> you are the apple of my eye !
>>>>>
>>>>
>>>
>>> --
>>> you are the apple of my eye !
>>>
>>
