From: Alexander Dejanovski
Date: Thu, 05 Jan 2017 07:53:37 +0000
Subject: Re: Why compacting process uses more data that is expected
To: user@cassandra.apache.org

Indeed, nodetool compactionstats shows uncompressed sizes.
As Oleksandr suggests, use the table compression ratio to compute the actual size on disk.

It would actually be a great improvement for ops if we could add a switch to compactionstats in order to have the compression ratio applied automatically.
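
A rough sketch of that arithmetic, assuming the compression ratio reported by nodetool tablestats is compressed size divided by uncompressed size (so a value below 1 means the data shrinks on disk). The function names are hypothetical, and the ratio used here is back-derived from the 34 GB and 138.66 GB figures in the quoted message rather than read from tablestats, so it is illustrative only.

```python
def estimated_on_disk_gb(uncompressed_gb: float, compression_ratio: float) -> float:
    """Scale an uncompressed size by a compressed/uncompressed ratio."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return uncompressed_gb * compression_ratio


def implied_ratio(on_disk_gb: float, uncompressed_gb: float) -> float:
    """Back out the ratio when both sizes are known (assumption: same data set)."""
    if uncompressed_gb <= 0:
        raise ValueError("uncompressed_gb must be positive")
    return on_disk_gb / uncompressed_gb


if __name__ == "__main__":
    # Figures taken from the quoted compactionstats output and question.
    total_uncompressed_gb = 138.66      # "total" column
    completed_uncompressed_gb = 112.74  # "completed" column
    table_on_disk_gb = 34.0             # SSTable size reported by the poster

    ratio = implied_ratio(table_on_disk_gb, total_uncompressed_gb)
    done_on_disk = estimated_on_disk_gb(completed_uncompressed_gb, ratio)

    print(f"implied compression ratio: {ratio:.3f}")
    print(f"estimated on-disk size compacted so far: {done_on_disk:.2f} GB")
```

In practice the ratio would come from tablestats for the table in question. The 34 GB figure includes tmp files and the compaction may not cover every SSTable, so the implied ratio is only approximate.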

On Thu, Jan 5, 2017 at 7:22 AM Oleksandr Shulgin <oleksandr.shulgin@zalando.de> wrote:
> On Jan 4, 2017 17:58, "Jean Carlo" <jean.jeancarl48@gmail.com> wrote:
>
> Hello guys
>
> I have a table with 34 GB of data in SSTables (including tmp), and I can see Cassandra is doing some compactions on it. What surprised me is that nodetool compactionstats says it is compacting 138.66 GB.
>
> root@node001 /root # nodetool compactionstats -H
> pending tasks: 103
>    compaction type    keyspace       table    completed       total    unit   progress
>         Compaction   keyspace1    table_02    112.74 GB   138.66 GB   bytes     81.31%
> Active compaction remaining time :   0h03m27s
>
> So my question is: where do those 138.66 GB come from if my table has only 34 GB of data?

> Hello,
>
> I believe the output of compactionstats shows you the size of *uncompressed* data. Can you check your compression ratio (with nodetool tablestats)?
>
> --
> Alex

--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com