cassandra-user mailing list archives

From Hossein Ghiyasi Mehr <ghiyasim...@gmail.com>
Subject Re: Optimal backup strategy
Date Sun, 01 Dec 2019 16:57:30 GMT
1. It's recommended to use the commit log after a single-node failure.
Cassandra has many other options, such as the replication factor, as
substitute solutions.
2. Yes, right.
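
The periodic scheme from point 2 (snapshot plus incremental backups, with the incrementals removed after each new snapshot) could be sketched roughly as below. This is only an illustration under stated assumptions: `BackupFile`, `incrementals_to_remove`, and the example paths are hypothetical names, not part of Cassandra, and the sketch only selects files; it does not delete anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupFile:
    """Hypothetical record for one incremental backup file."""
    path: str          # assumed location of the backed-up SSTable file
    created_at: float  # creation time, in epoch seconds


def incrementals_to_remove(files: List[BackupFile],
                           snapshot_time: float) -> List[BackupFile]:
    """Select incremental backups that predate a new snapshot.

    Files created strictly before the snapshot are assumed to be covered
    by it; anything created at or after snapshot_time is kept.
    """
    return [f for f in files if f.created_at < snapshot_time]


if __name__ == "__main__":
    files = [
        BackupFile("backups/old-Data.db", 100.0),
        BackupFile("backups/new-Data.db", 300.0),
    ]
    for f in incrementals_to_remove(files, snapshot_time=200.0):
        print("would remove", f.path)
```

Keeping the selection separate from the deletion makes it easy to review the list before anything is removed.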

*VafaTech.com - A Total Solution for Data Gathering & Analysis*


On Fri, Nov 29, 2019 at 9:33 AM Adarsh Kumar <adarsh0007@gmail.com> wrote:

> Thanks Ahu and Hussein,
>
> So my understanding is:
>
>    1. Commit log backup is not documented for Apache Cassandra, hence not
>    standard. But it can be used for restore on the same machine (by taking a
>    backup from commit_log_dir). If used on other machine(s), they must have
>    the same topology. Can it be used for a replacement node?
>    2. For periodic backup, snapshot + incremental backup is the best option.
>
>
> Thanks,
> Adarsh Kumar
>
> On Fri, Nov 29, 2019 at 7:28 AM guo Maxwell <cclive1601@gmail.com> wrote:
>
>> Hossein is right. But for us, we restore to the same Cassandra
>> topology, so replay is usable. It is also usable when restoring to the
>> same machine.
>> Using sstableloader costs too much time and more storage (though the
>> storage use drops after the restore).
>>
>> Hossein Ghiyasi Mehr <ghiyasimehr@gmail.com> wrote on Thu, Nov 28, 2019 at 7:40 PM:
>>
>>> A commitlog backup isn't usable on another machine.
>>> The backup solution depends on what you want to do: periodic backup, or
>>> backup to restore on another machine?
>>> Periodic backup is a combination of snapshot and incremental backup. Remove
>>> the incremental backups after each new snapshot.
>>> To take a backup for restoring on another machine: you can use a snapshot
>>> after flushing the memtable, or use sstableloader.
>>>
>>>
>>> ----
>>> VafaTech.com - A Total Solution for Data Gathering & Analysis
>>>
>>> On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell <cclive1601@gmail.com>
>>> wrote:
>>>
>>>> In the Cassandra and DataStax documentation, commitlog backup is not
>>>> mentioned.
>>>> Only snapshot and incremental backup are described as ways to do backups.
>>>>
>>>> Although commitlog archiving per keyspace/table is not supported,
>>>> commitlog replay (you must put the logs in commitlog_dir and restart the
>>>> process) does support a keyspace/table replay filter (use
>>>> -Dcassandra.replayList with the keyspace1.table1,keyspace1.table2 format
>>>> to replay only the specified keyspaces/tables).
>>>>
>>>> Snapshots do affect storage. We take a snapshot once a week, during
>>>> off-peak business hours, and snapshotting is throttled. You may want to
>>>> see this issue (https://issues.apache.org/jira/browse/CASSANDRA-13019).
>>>>
>>>>
>>>>
>>>> Adarsh Kumar <adarsh0007@gmail.com> wrote on Thu, Nov 28, 2019 at 1:00 AM:
>>>>
>>>>> Thanks Guo and Eric for replying,
>>>>>
>>>>> I have some confusion about commit log backup:
>>>>>
>>>>>    1. The commit log archival technique (
>>>>>    https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>>>>>    ) is as good as an incremental backup, as it also captures commit
>>>>>    logs after memtable flush.
>>>>>    2. If we go for "Snapshot + Incremental backup + Commit log", we
>>>>>    have to take the commit logs from the commit log directory (is there
>>>>>    any SOP for this?). As commit logs are not per table or keyspace, we
>>>>>    will have a challenge in restoring selective tables.
>>>>>    3. Snapshot-based backups are easy to manage and operate due to
>>>>>    their simplicity. But they are heavy on storage. Any views on this?
>>>>>    4. Please share any successful strategy that someone is using in
>>>>>    production. We are still in the design phase and want to implement
>>>>>    the best solution.
>>>>>
>>>>> Thanks Eric for sharing link for medusa.
>>>>>
>>>>> Regards,
>>>>> Adarsh Kumar
>>>>>
>>>>> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell <cclive1601@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> For me, I think the last one:
>>>>>>  Snapshot + Incremental + commitlog
>>>>>> is the most meaningful way to do backup and restore when you back the
>>>>>> data up to somewhere else, like AWS S3.
>>>>>>
>>>>>>    - Snapshot-based backup // incremental data will not be backed up,
>>>>>>    and data may be lost when restoring to a time later than the
>>>>>>    snapshot time;
>>>>>>    - Incremental backups // better than snapshot backup, but with
>>>>>>    insufficient data accuracy, because data remaining in the memtable
>>>>>>    will be lost;
>>>>>>    - Snapshot + incremental
>>>>>>    - Snapshot + commitlog archival // better data precision than
>>>>>>    incremental backup, but data in non-archived commitlogs (logs not
>>>>>>    archived, or not yet closed) will not be restored and will be lost.
>>>>>>    Also, when there are too many logs, log replay will take a very
>>>>>>    long time.
>>>>>>
>>>>>> We use snapshot + incremental + commitlog archive. We read the
>>>>>> snapshot data and the incremental data. The logs are also backed up, but
>>>>>> we do not back up logs whose data has already been flushed to SSTables,
>>>>>> because that data is backed up by the incremental backup.
>>>>>>
>>>>>> This way, the data exists in SSTable format through the
>>>>>> snapshot backup and incremental backup. The number of logs will be very
>>>>>> small, and log replay will not take much time.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Eric LELEU <eric@strapdata.com> wrote on Wed, Nov 27, 2019 at 4:13 PM:
>>>>>>
>>>>>>> Hi,
>>>>>>> TheLastPickle & Spotify have released Medusa as a Cassandra backup
>>>>>>> tool.
>>>>>>>
>>>>>>> See :
>>>>>>> https://thelastpickle.com/blog/2019/11/05/cassandra-medusa-backup-tool-is-open-source.html
>>>>>>>
>>>>>>> Hope this link will help you.
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>>
>>>>>>> On 27/11/2019 at 08:10, Adarsh Kumar wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I was looking for the backup strategies of Cassandra. After some
>>>>>>> study I came to know that there are the following options:
>>>>>>>
>>>>>>>    - Snapshot based backup
>>>>>>>    - Incremental backups
>>>>>>>    - Snapshot + incremental
>>>>>>>    - Snapshot + commitlog archival
>>>>>>>    - Snapshot + Incremental + commitlog
>>>>>>>
>>>>>>> Which is the most suitable and feasible approach? Also, which of
>>>>>>> these is used most?
>>>>>>> Please let me know if there is any other option or tool available.
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Adarsh Kumar
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> you are the apple of my eye !
>>>>>>
>>>>>
>>>>
>>>> --
>>>> you are the apple of my eye !
>>>>
>>>
>>
>> --
>> you are the apple of my eye !
>>
>
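
The replay filter mentioned in the thread takes a comma-separated list of keyspace.table entries. A minimal sketch of building that JVM option from a list of tables might look like this; the helper name `replay_list_option` is hypothetical, and only the `-Dcassandra.replayList` option and its `keyspace1.table1,keyspace1.table2` format come from the thread itself.

```python
from typing import Iterable, Tuple


def replay_list_option(tables: Iterable[Tuple[str, str]]) -> str:
    """Build the -Dcassandra.replayList option for (keyspace, table) pairs.

    Hypothetical helper: it only formats the string quoted in the thread
    and does not start or configure Cassandra in any way.
    """
    entries = []
    for keyspace, table in tables:
        if not keyspace or not table:
            raise ValueError("keyspace and table names must be non-empty")
        entries.append(f"{keyspace}.{table}")
    if not entries:
        raise ValueError("at least one keyspace.table entry is required")
    return "-Dcassandra.replayList=" + ",".join(entries)


if __name__ == "__main__":
    print(replay_list_option([("keyspace1", "table1"),
                              ("keyspace1", "table2")]))
```

As described in the thread, the archived logs still have to be placed in commitlog_dir and the process restarted for the replay to happen.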
