From: Elliott Sims
Date: Fri, 25 May 2018 17:05:22 -0700
Subject: Re: Snapshot SSTable modified??
To: user@cassandra.apache.org

I've run across this problem before - it seems like GNU tar interprets changes in the link count as changes to the file, so if the file gets compacted mid-backup it freaks out even if the file contents are unchanged. I worked around it by just using bsdtar instead.
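
A minimal sketch of the link-count effect described above, using a scratch directory rather than real Cassandra data. It relies on a snapshot entry being a hard link to the live sstable, and it assumes GNU coreutils `stat` (the `-c '%h'` format); the file names are made up for illustration. Removing the second link stands in for compaction deleting the live file.

```shell
#!/bin/sh
# Demonstration only: works in a scratch directory, not on Cassandra data.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical stand-ins for a live sstable and its snapshot entry.
printf 'sstable bytes\n' > "$workdir/live-Data.db"
ln "$workdir/live-Data.db" "$workdir/snapshot-Data.db"   # hard link, like a snapshot

# Record link count (GNU stat assumed) and a checksum of the contents.
links_before=$(stat -c '%h' "$workdir/snapshot-Data.db")
sum_before=$(cksum < "$workdir/snapshot-Data.db")

# Stand-in for compaction removing the live file mid-backup.
rm "$workdir/live-Data.db"

links_after=$(stat -c '%h' "$workdir/snapshot-Data.db")
sum_after=$(cksum < "$workdir/snapshot-Data.db")

echo "link count: $links_before -> $links_after"
if [ "$sum_before" = "$sum_after" ]; then
    echo "contents unchanged"
fi
```

The link count drops from 2 to 1 while the checksum stays the same, which is the kind of metadata-only change being described.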

On Thu, May 24, 2018 at 6:08 AM, Nitan Kainth <nitankainth@gmail.com> wrote:
Jeff,

Shouldn't a snapshot capture a consistent state of the sstables? A -tmp file shouldn't impact the backup operation, right?


Regards,
Nitan K.
Cassandra and Oracle Architect/SME
Datastax Certified Cassandra expert
Oracle 10g Certified

On Wed, May 23, 2018 at 6:26 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
In versions before 3.0, sstables were written with a -tmp filename and copied/moved to the final filename when complete. This changed in 3.0 - we write into the file with the final name, and have a journal/log to let us know when it's done/final/live.

Therefore, you can no longer just watch for a -Data.db file to be created and then upload it - you have to watch the log to make sure it's not being written.
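
One way to act on this from a shell script is sketched below. It assumes that in-progress sstable writes are tracked by transaction log files whose names match `*_txn_*.log` in the table's data directory; that pattern is an assumption to verify against your Cassandra version. `has_pending_txn` is a hypothetical helper name, not part of Cassandra.

```shell
#!/bin/sh
# has_pending_txn DIR
# Succeeds (exit 0) if DIR contains any file matching the ASSUMED
# transaction-log naming pattern "*_txn_*.log"; fails (exit 1) otherwise.
has_pending_txn() {
    for f in "$1"/*_txn_*.log; do
        # When nothing matches, the glob stays literal and -e is false.
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}
```

A watcher could call `has_pending_txn "$table_dir"` and skip the upload for that table until it returns non-zero.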


On Wed, May 23, 2018 at 2:18 PM, Max C. <mc_cassandra@core43.com> wrote:
Hi Everyone,

We’ve noticed a few times in the last few weeks that when we’re doing backups, tar has complained with messages like this:

tar: /var/lib/cassandra/data/mars/test_instances_by_test_id-6a9440a04cc111e8878675f1041d7e1c/snapshots/backup_20180523_024502/mb-63-big-Data.db: file changed as we read it

Any idea what might be causing this?

We’re running Cassandra 3.0.8 on RHEL 7. Here’s rough pseudocode of our backup process:

<cronjob set to fire same script at same time on all nodes>
SNAPSHOT_NAME=backup_YYYYMMDD_HHMMSS
nodetool snapshot -t $SNAPSHOT_NAME

for each keyspace
- dump schema to "schema.cql"
- tar -czf /file_server/backup_${HOSTNAME}_${KEYSPACE}_YYYYMMDD_HHMMSS.tgz schema.cql /var/lib/cassandra/data/$KEYSPACE/*/snapshots/$SNAPSHOT_NAME

nodetool clearsnapshot -t $SNAPSHOT_NAME
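
The tar step above could be wrapped roughly as follows. This is a sketch under stated assumptions, not the poster's actual script: `archive_snapshot` is a hypothetical helper, it prefers bsdtar when installed (the workaround mentioned in this thread), and otherwise it assumes `tar` is GNU tar, whose documented exit status 1 on create means some files changed while being archived. Treating that status as a warning is only reasonable if you trust that snapshot contents do not change.

```shell
#!/bin/sh
# archive_snapshot ARCHIVE ARGS...
# Creates a gzip-compressed archive at ARCHIVE from the remaining tar arguments.
# Hypothetical helper for illustration only.
archive_snapshot() {
    archive=$1
    shift
    # Prefer bsdtar when it is installed.
    if command -v bsdtar >/dev/null 2>&1; then
        bsdtar -czf "$archive" "$@"
        return $?
    fi
    # Fallback: ASSUMES `tar` is GNU tar, where exit status 1 on create
    # means "some files changed while being archived".
    status=0
    tar -czf "$archive" "$@" || status=$?
    if [ "$status" -eq 1 ]; then
        echo "warning: tar reported changed files while writing $archive" >&2
        return 0
    fi
    return "$status"
}
```

Inside the per-keyspace loop it would be called with the archive path first, followed by schema.cql and the snapshot directories.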

Thanks.

- Max



