From: aaron morton
To: user@cassandra.apache.org
Subject: Re: BulkLoading SSTables and compression
Date: Wed, 4 Jul 2012 20:47:56 +1200

> The only thing I can think of is that the upgradesstables option follows a
> slightly different path to the bulk uploader when it comes to generating the
> sstables that have been flushed to disk?

Seems unlikely; they both run through the same classes, which determine their compression strategy via configuration.

> However, prior to running the "upgradesstables" command, the total size of
> all the SSTables was 27GB and afterwards its 12GB.

Do you have some of the log messages written when upgradesstables ran? They will be from compaction and come in pairs that you can correlate by thread: one about which file is being compacted and another about how big the new file is.

Do you have secondary indexes defined on the target CF?

If you can reproduce it (or at least explain it pretty well), it's probably time to hit https://issues.apache.org/jira/browse/CASSANDRA

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 3/07/2012, at 12:32 AM, jmodha wrote:

> Just to clarify, the data that we're loading SSTables from (v1.0.3) doesn't
> have compression enabled on any of the CF's.
>
> So in theory the compression should occur on the receiving end (v1.1.1) as
> we're going from uncompressed data to compressed data.
>
> So I'm not sure if the bug you mention is causing the behaviour we're seeing
> here.
>
> The only thing I can think of is that the upgradesstables option follows a
> slightly different path to the bulk uploader when it comes to generating the
> sstables that have been flushed to disk?
>
> --
> View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/BulkLoading-SSTables-and-compression-tp7580849p7580938.html
> Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.
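
For anyone wanting to do the log correlation described in the reply above, here is a minimal sketch of pairing compaction messages by thread. It assumes each log line carries the thread name in square brackets and that the two messages contain the words "Compacting" and "Compacted"; both are assumptions about the log text, not something stated in this thread, so adjust them to whatever your system.log actually shows. The sample lines and the function name `pair_compaction_lines` are made up for illustration.

```python
import re

# Assumption: the thread name is the first [bracketed] field on each log line.
THREAD_RE = re.compile(r"\[([^\]]+)\]")


def pair_compaction_lines(lines):
    """Pair each 'Compacting' line with the next 'Compacted' line on the same thread."""
    pending = {}  # thread name -> most recent unmatched "Compacting" line
    pairs = []
    for line in lines:
        match = THREAD_RE.search(line)
        if match is None:
            continue  # no thread name, so nothing to correlate on
        thread = match.group(1)
        if "Compacting" in line:
            pending[thread] = line
        elif "Compacted" in line and thread in pending:
            pairs.append((thread, pending.pop(thread), line))
    return pairs


# Hypothetical, simplified log lines; real ones carry timestamps, paths and sizes.
SAMPLE_LINES = [
    "INFO [CompactionExecutor:1] Compacting [file-a]",
    "INFO [CompactionExecutor:2] Compacting [file-b]",
    "INFO [CompactionExecutor:2] Compacted to [file-b-new]",
    "INFO [CompactionExecutor:1] Compacted to [file-a-new]",
]

for thread, started, finished in pair_compaction_lines(SAMPLE_LINES):
    print(thread)
    print("  start: ", started)
    print("  finish:", finished)
```

Keying on the thread name rather than on line order matters because several compactions can be in flight at once, so the two halves of a pair are not necessarily adjacent in the log.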