From: Sean Lair
To: "users@cloudstack.apache.org", "dev@cloudstack.apache.org"
Subject: Snapshots on KVM corrupting disk images
Date: Tue, 22 Jan 2019 16:30:22 +0000

Hi all,

We have had some instances where VM disks became corrupted when using KVM snapshots. We are running CloudStack 4.9.3 with KVM on CentOS 7.

The first time was when someone mass-enabled scheduled snapshots on a large number of VMs and secondary storage filled up. We had to restore all of those VM disks, but we believed it was our own fault for letting secondary storage fill up.

Today we had an instance where a snapshot failed, and now the disk image is corrupted and the VM can't boot.

Here is the output of some commands:

-----------------------
[root@cloudkvm02 c3be0ae5-2248-3ed6-a0c7-acffe25cc8d3]# qemu-img check ./184aa458-9d4b-4c1b-a3c6-23d28ea28e80
qemu-img: Could not open './184aa458-9d4b-4c1b-a3c6-23d28ea28e80': Could not read snapshots: File too large
[root@cloudkvm02 c3be0ae5-2248-3ed6-a0c7-acffe25cc8d3]# qemu-img info ./184aa458-9d4b-4c1b-a3c6-23d28ea28e80
qemu-img: Could not open './184aa458-9d4b-4c1b-a3c6-23d28ea28e80': Could not read snapshots: File too large
[root@cloudkvm02 c3be0ae5-2248-3ed6-a0c7-acffe25cc8d3]# ls -lh ./184aa458-9d4b-4c1b-a3c6-23d28ea28e80
-rw-r--r--. 1 root root 73G Jan 22 11:04 ./184aa458-9d4b-4c1b-a3c6-23d28ea28e80
-----------------------

We tried restoring to before the snapshot failure, but still have strange errors:

----------------------
[root@cloudkvm02 c3be0ae5-2248-3ed6-a0c7-acffe25cc8d3]# ls -lh ./184aa458-9d4b-4c1b-a3c6-23d28ea28e80
-rw-r--r--. 1 root root 73G Jan 22 11:04 ./184aa458-9d4b-4c1b-a3c6-23d28ea28e80
[root@cloudkvm02 c3be0ae5-2248-3ed6-a0c7-acffe25cc8d3]# qemu-img info ./184aa458-9d4b-4c1b-a3c6-23d28ea28e80
image: ./184aa458-9d4b-4c1b-a3c6-23d28ea28e80
file format: qcow2
virtual size: 50G (53687091200 bytes)
disk size: 73G
cluster_size: 65536
Snapshot list:
ID  TAG                                   VM SIZE  DATE                 VM CLOCK
1   a8fdf99f-8219-4032-a9c8-87a6e09e7f95  3.7G     2018-12-23 11:01:43  3099:35:55.242
2   b4d74338-b0e3-4eeb-8bf8-41f6f75d9abd  3.8G     2019-01-06 11:03:16  3431:52:23.942
Format specific information:
    compat: 1.1
    lazy refcounts: false
[root@cloudkvm02 c3be0ae5-2248-3ed6-a0c7-acffe25cc8d3]# qemu-img check ./184aa458-9d4b-4c1b-a3c6-23d28ea28e80
tcmalloc: large alloc 1539750010880 bytes == (nil) @ 0x7fb9cbbf7bf3 0x7fb9cbc19488 0x7fb9cb71dc56 0x55d16ddf1c77 0x55d16ddf1edc 0x55d16ddf2541 0x55d16ddf465e 0x55d16ddf8ad1 0x55d16de336db 0x55d16de373e6 0x7fb9c63a3c05 0x55d16ddd9f7d
No errors were found on the image.
[root@cloudkvm02 c3be0ae5-2248-3ed6-a0c7-acffe25cc8d3]# qemu-img snapshot -l ./184aa458-9d4b-4c1b-a3c6-23d28ea28e80
Snapshot list:
ID  TAG                                   VM SIZE  DATE                 VM CLOCK
1   a8fdf99f-8219-4032-a9c8-87a6e09e7f95  3.7G     2018-12-23 11:01:43  3099:35:55.242
2   b4d74338-b0e3-4eeb-8bf8-41f6f75d9abd  3.8G     2019-01-06 11:03:16  3431:52:23.942
--------------------------

Everyone is now extremely hesitant to use snapshots in KVM. We tried deleting the snapshots in the restored disk image, but it errors out.

Does anyone else have issues with KVM snapshots? We are considering just disabling this functionality now.

Thanks
Sean
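
For anyone who wants to repeat the `qemu-img info` check across many volumes, the following is a minimal Python sketch. It is not part of the original report. It assumes Python 3.7+ and a `qemu-img` build that supports `--output=json`; the helper names `qemu_img_info` and `summarize` are hypothetical. The sample values are taken from the output above, with the 73G file size approximated as 73 GiB. A file larger than its virtual size is not proof of corruption, since internal snapshots and saved VM state add to the file, so the flag is only a hint about which images to look at first. It is probably safest to run this only against images that are not in use by a running VM.

```python
import json
import subprocess


def qemu_img_info(path):
    """Run `qemu-img info --output=json PATH` and return the parsed JSON."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        check=True,           # raise CalledProcessError if qemu-img fails
        capture_output=True,  # requires Python 3.7+
        text=True,
    )
    return json.loads(result.stdout)


def summarize(info):
    """Condense a qemu-img info dict into the fields discussed in this thread."""
    virtual = info["virtual-size"]
    actual = info.get("actual-size")  # may be absent for some storage backends
    snapshots = info.get("snapshots", [])
    return {
        "format": info.get("format"),
        "virtual_gib": virtual / 2**30,
        "actual_gib": None if actual is None else actual / 2**30,
        "snapshot_tags": [s.get("name") for s in snapshots],
        # Only a hint: internal snapshots legitimately grow the file.
        "larger_than_virtual": actual is not None and actual > virtual,
    }


# Sample input mirroring the restored image shown above (sizes approximate).
sample = {
    "format": "qcow2",
    "virtual-size": 53687091200,
    "actual-size": 73 * 2**30,
    "snapshots": [
        {"id": "1", "name": "a8fdf99f-8219-4032-a9c8-87a6e09e7f95"},
        {"id": "2", "name": "b4d74338-b0e3-4eeb-8bf8-41f6f75d9abd"},
    ],
}
summary = summarize(sample)
print(summary)
```

On a KVM host, `summarize(qemu_img_info("/path/to/volume"))` would produce the same summary for a real image; an image that fails to open, like the first one above, would surface as a `subprocess.CalledProcessError`.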