cassandra-user mailing list archives

From Max C <mc_cassan...@core43.com>
Subject Re: Snapshot SSTable modified??
Date Sat, 26 May 2018 03:00:54 GMT
I looked at the source code for GNU tar, and it looks for a change in the ctime (the inode
change time, not a creation time) or (more likely) a change in the size.

This seems very strange to me — I would think that creating a snapshot would cause a flush
and then once the SSTables are written, hardlinks would be created and the SSTables wouldn't
be written to after that.
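
For what it's worth, here is a minimal sketch of how a snapshot file can look "changed"
without anything writing to it, assuming the link-count explanation in Elliott's quoted
message below is the cause. Removing one hard link (as compaction eventually does to the
live SSTable) updates the ctime on the shared inode, while the contents and size stay the
same. It uses throwaway temp files and assumes GNU coreutils stat; the file names are made
up for illustration.

```shell
#!/bin/sh
# Sketch: dropping one hard link changes the inode's ctime, not its contents.
# Assumes GNU coreutils `stat -c`; all paths are throwaway temp files.
set -eu

dir=$(mktemp -d)
printf 'sstable bytes\n' > "$dir/live-Data.db"   # stands in for the live SSTable
ln "$dir/live-Data.db" "$dir/snap-Data.db"       # stands in for the snapshot hard link

links_before=$(stat -c %h "$dir/snap-Data.db")   # hard link count
ctime_before=$(stat -c %z "$dir/snap-Data.db")   # inode change time
sum_before=$(cksum < "$dir/snap-Data.db")        # checksum and size of the contents

sleep 1                                          # make the timestamp change visible
rm "$dir/live-Data.db"                           # roughly what compaction does to the original

links_after=$(stat -c %h "$dir/snap-Data.db")
ctime_after=$(stat -c %z "$dir/snap-Data.db")
sum_after=$(cksum < "$dir/snap-Data.db")

echo "links:    $links_before -> $links_after"
echo "ctime:    $ctime_before -> $ctime_after"
echo "contents: $sum_before -> $sum_after"

rm -rf "$dir"
```

If the ctime line differs while the contents line does not, that matches what tar would
see if the original SSTable were removed while the snapshot copy was being read.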

Our solution is to wait 5 minutes and retry the tar if an error occurs.  This isn't ideal
- but it's the best I could come up with.  :-/
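
A rough sketch of that retry, for anyone who wants to copy it. The retry function and the
MAX_TRIES and RETRY_DELAY variables are hypothetical names, not part of our actual script,
and the 300-second default just mirrors the 5 minute wait. Note that GNU tar exits with
status 1 when a file changed while it was being archived, so any non-zero exit is treated
as a reason to retry here.

```shell
#!/bin/sh
# retry CMD [ARGS...]: run CMD until it succeeds or MAX_TRIES attempts are used up,
# sleeping RETRY_DELAY seconds between attempts. Both names are hypothetical.
retry() {
    attempt=1
    while :; do
        "$@" && return 0
        rc=$?                                   # exit status of the failed command
        if [ "$attempt" -ge "${MAX_TRIES:-3}" ]; then
            echo "giving up after $attempt attempts (exit $rc): $*" >&2
            return "$rc"
        fi
        echo "attempt $attempt failed (exit $rc); retrying in ${RETRY_DELAY:-300}s" >&2
        sleep "${RETRY_DELAY:-300}"
        attempt=$((attempt + 1))
    done
}

# Example use in the backup loop (not run here; variables as in the original pseudocode):
#   retry tar -czf "$ARCHIVE" schema.cql \
#       /var/lib/cassandra/data/"$KEYSPACE"/*/snapshots/"$SNAPSHOT_NAME"
```

Each retry rewrites the archive from scratch, so a partial or inconsistent .tgz from a
failed attempt is overwritten rather than kept.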

Thanks Jeff & others for your responses.

- Max

> On May 25, 2018, at 5:05pm, Elliott Sims <elliott@backblaze.com> wrote:
> 
> I've run across this problem before - it seems like GNU tar interprets changes in the
> link count as changes to the file, so if the file gets compacted mid-backup it freaks out
> even if the file contents are unchanged.  I worked around it by just using bsdtar instead.
> 
> On Thu, May 24, 2018 at 6:08 AM, Nitan Kainth <nitankainth@gmail.com> wrote:
> Jeff,
> 
> Shouldn't a snapshot get a consistent state of the sstables? A -tmp file shouldn't impact
> the backup operation, right?
> 
> 
> Regards,
> Nitan K.
> Cassandra and Oracle Architect/SME
> Datastax Certified Cassandra expert
> Oracle 10g Certified
> 
> On Wed, May 23, 2018 at 6:26 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
> In versions before 3.0, sstables were written with a -tmp filename and copied/moved to
> the final filename when complete. This changed in 3.0 - we write into the file with the
> final name, and have a journal/log to let us know when it's done/final/live.
> 
> Therefore, you can no longer just watch for a -Data.db file to be created and upload it
> - you have to watch the log to make sure it's not being written.
> 
> 
> On Wed, May 23, 2018 at 2:18 PM, Max C. <mc_cassandra@core43.com> wrote:
> Hi Everyone,
> 
> We’ve noticed a few times in the last few weeks that when we’re doing backups, tar
> has complained with messages like this:
> 
> tar: /var/lib/cassandra/data/mars/test_instances_by_test_id-6a9440a04cc111e8878675f1041d7e1c/snapshots/backup_20180523_024502/mb-63-big-Data.db: file changed as we read it
> 
> Any idea what might be causing this?
> 
> We’re running Cassandra 3.0.8 on RHEL 7.  Here’s rough pseudocode of our backup process:
> 
> <cronjob set to fire same script at same time on all nodes>
> SNAPSHOT_NAME=backup_YYYYMMDD_HHMMSS
> nodetool snapshot -t $SNAPSHOT_NAME
> 
> for each keyspace
> - dump schema to "schema.cql"
> - tar -czf /file_server/backup_${HOSTNAME}_${KEYSPACE}_YYYYMMDD_HHMMSS.tgz schema.cql /var/lib/cassandra/data/$KEYSPACE/*/snapshots/$SNAPSHOT_NAME
> 
> nodetool clearsnapshot -t $SNAPSHOT_NAME
> 
> Thanks.
> 
> - Max

