From: Fabien Rousseau <fabien@yakaz.com>
Date: Wed, 24 Jul 2013 15:42:15 +0200
Subject: Re: disappointed
To: user@cassandra.apache.org

Hi Paul,

Concerning large rows which are not compacting, I've probably managed to
reproduce your problem. I suppose you're using collections, but also TTLs?

Anyway, I opened an issue here:
https://issues.apache.org/jira/browse/CASSANDRA-5799
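In case you want to check whether it's the same thing, the write pattern I
used is roughly the following. This is a sketch only, assuming the DataStax
Python driver against a throwaway local node; the keyspace, table, TTL and
sizes are made up for illustration:

    # Sketch: grow a single partition past the large-row threshold while
    # every column in it carries a TTL. Assumes the DataStax Python driver
    # and a local test node; names and numbers are illustrative only.
    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect()
    session.execute("CREATE KEYSPACE repro WITH replication = "
                    "{'class': 'SimpleStrategy', 'replication_factor': 1}")
    session.execute("CREATE TABLE repro.big_row "
                    "(id text PRIMARY KEY, tags map<text, text>)")

    # ~60k entries of ~2KB is ~120MB in one row, which is above the default
    # in_memory_compaction_limit_in_mb, so compaction has to take the
    # large-row path.
    for i in range(60000):
        session.execute(
            "UPDATE repro.big_row USING TTL 864000 "
            "SET tags[%s] = %s WHERE id = 'one-big-row'",
            (str(i), "x" * 2000))

After the loop, a nodetool flush followed by nodetool compact on the
keyspace should exercise the large-row compaction path.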
Hope this helps

2013/7/24 Christopher Wirt <chris.wirt@struq.com>:

Hi Paul,

Sorry to hear you're having a low point.

We ended up not using the collection features of 1.2. Instead we store a
compressed string containing the map and handle it client side.
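Roughly like this, if it helps. A minimal sketch of the idea using only the
standard library, not our production code; the packed result just goes into
an ordinary text column instead of a CQL3 map:

    # Sketch of the "compressed map in one column" workaround: serialise
    # the map client-side, compress it, and store the result as plain text.
    import base64
    import json
    import zlib

    def pack_map(d):
        """dict -> compact text, safe to store in a text column."""
        raw = json.dumps(d, separators=(",", ":")).encode("utf-8")
        return base64.b64encode(zlib.compress(raw)).decode("ascii")

    def unpack_map(s):
        """Inverse of pack_map."""
        return json.loads(zlib.decompress(base64.b64decode(s)).decode("utf-8"))

    tags = {"colour": "red", "size": "large"}
    packed = pack_map(tags)
    assert unpack_map(packed) == tags

The trade-off is that the whole map becomes one column, so reads and writes
are all-or-nothing, but you avoid the per-entry overhead of CQL3 collections.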
We only have fixed schema short rows, so no experience with large row
compaction.

File descriptors have never got that high for us. But if you only have a
couple of physical nodes with loads of data and small sstables, maybe they
could get that high?

The only time I've had file descriptors get out of hand was when compaction
got slightly confused with a new schema, when I dropped and recreated
instead of truncating: https://issues.apache.org/jira/browse/CASSANDRA-4857.
Restarting the node fixed the issue.
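If you want to see what is actually holding the descriptors, a quick look at
/proc is usually enough. Something like this rough sketch (Linux-only; you
pass the Cassandra JVM's pid yourself):

    # Count what a process is holding open, grouped by file suffix, to see
    # whether sstable components (Data.db, Index.db, ...) dominate.
    # Usage: python fdcount.py <cassandra-pid>
    import os
    import sys
    from collections import Counter

    pid = sys.argv[1]
    fd_dir = "/proc/%s/fd" % pid

    counts = Counter()
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue  # raced with a close(); skip it
        name = os.path.basename(target)
        # sstable files end in e.g. "...-Data.db"; group them by suffix
        suffix = name.rsplit("-", 1)[-1] if name.endswith(".db") else name
        counts[suffix] += 1

    print("total open fds: %d" % sum(counts.values()))
    for suffix, n in counts.most_common(10):
        print("%8d  %s" % (n, suffix))

If Data.db and Index.db dominate with a huge sstable count, it points at
compaction falling behind rather than a descriptor leak.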
From my limited experience I think Cassandra is a dangerous choice for a
young start-up with limited funding and experience that expects to scale
fast. We are a fairly mature start-up with funding. We've just spent 3-5
months moving from Mongo to Cassandra. It's been expensive and painful
getting Cassandra to read like Mongo, but we've made it :)

From: Paul Ingalls [mailto:paulingalls@gmail.com]
Sent: 24 July 2013 06:01
To: user@cassandra.apache.org
Subject: disappointed

I want to check in. I'm sad, mad and afraid. I've been trying to get a 1.2
cluster up and working with my data set for three weeks with no success.
I've been running a 1.1 cluster for 8 months now with no hiccups, but for
me at least 1.2 has been a disaster. I had high hopes for leveraging the
new features of 1.2, specifically vnodes and collections. But at this point
I can't release my system into production, and will probably need to find a
new back end. As a small startup, this could be catastrophic. I'm mostly
mad at myself. I took a risk moving to the new tech. I forgot that
sometimes when you gamble, you lose.

First, the performance of 1.2.6 was horrible when using collections. I
wasn't able to push through 500k rows before the cluster became unusable.
With a lot of digging, and way too much time, I discovered I was hitting a
bug that had just been fixed, but was unreleased. This scared me, because
the release was already at 1.2.6 and I would have expected something like
https://issues.apache.org/jira/browse/CASSANDRA-5677 to have been addressed
long before. But gamely I grabbed the latest code from the 1.2 branch,
built it, and was finally able to get past half a million rows.

But then I hit ~4 million rows, and a multitude of problems. Even with the
fix above, I was still seeing a ton of compactions failing, specifically
the ones for large rows. Not a single large row will compact; they all
assert with the wrong size. Worse, and this is what kills the whole thing,
I keep hitting a wall with open files, even after dumping the whole DB,
dropping vnodes and trying again. Seriously, 650k open file descriptors?
When it hits this limit, the whole DB craps out and is basically unusable.
This isn't that many rows. I have close to half a billion in 1.1…

I'm now at a standstill. I figure I have two options unless someone here
can help me. Neither of them involves 1.2. I can either go back to 1.1 and
remove the features that collections added to my service, or I find another
data backend that has similar performance characteristics to Cassandra but
allows collections-type behavior in a scalable manner. Because as far as I
can tell, 1.2 doesn't scale. Which makes me sad; I was proud of what I
accomplished with 1.1…

Does anyone know why there are so many open file descriptors? Any ideas on
why a large row won't compact?

Paul

--
Fabien Rousseau
www.yakaz.com

Concerning large rows which ar= e not compacting, I've probably managed to reproduce your problem.
I suppose you're using collections, but also TTLs ?
Anyway, I opened an issue here :=A0https://issues.apache.org/jira/browse/C= ASSANDRA-5799=A0

Hope this helps


2013/7/24 Christopher Wirt <chris.wi= rt@struq.com>

Hi Paul,

=A0

Sorry to hear you=92re having a low point.=

=A0<= /p>

We ended up not using = the collection features of 1.2.

Instead storing a compres= sed string containing the map and handling client side.

=A0<= /p>

We only have fixed sch= ema short rows so no experience with large row compaction.

=A0<= /p>

File descriptors have = never got that high for us. But, if you only have a couple physical nodes w= ith loads of data and small ss-tables maybe they could get that high?

=A0<= /p>

Only time I=92ve had f= ile descriptors get out of hand was then compaction got slightly confused w= ith a new schema when I dropped and recreated instead of truncating. https://issues.apache.org/jira/browse/CASSANDRA-4857 restarting the n= ode fixed the issue.

=A0<= /p>

=A0

From my limited experienc= e I think Cassandra is a dangerous choice for an young limited funding/expe= rience start-up expecting to scale fast. We are a fairly mature start-up wi= th funding. We=92ve just spent 3-5 months moving from Mongo to Cassandra. I= t=92s been expensive and painful getting Cassandra to read like Mongo, but = we=92ve made it J

=A0<= /p>

=A0

=A0<= /p>

=A0

From: Paul Ingalls [mailto:paulingalls@gmail.com]
Sent: 24 July 2013 06:01
To: user@cassandra.apache.org
Subje= ct: disappointed

=A0

I want t= o check in. =A0I'm sad, mad and afraid. =A0I've been trying to get = a 1.2 cluster up and working with my data set for three weeks with no succe= ss. =A0I've been running a 1.1 cluster for 8 months now with no hiccups= , but for me at least 1.2 has been a disaster. =A0I had high hopes for leve= raging the new features of 1.2, specifically vnodes and collections. =A0 Bu= t at this point I can't release my system into production, and will pro= bably need to find a new back end. =A0As a small startup, this could be cat= astrophic. =A0I'm mostly mad at myself. =A0I took a risk moving to the = new tech. =A0I forgot sometimes when you gamble, you lose.

=A0

First, the performance of 1.2.6 was horrible when using collections= . =A0I wasn't able to push through 500k rows before the cluster became = unusable. =A0With a lot of digging, and way too much time, I discovered I w= as hitting a bug that had just been fixed, but was unreleased. =A0This scar= ed me, because the release was already at 1.2.6 and I would have expected s= omething as=A0https://issues.apache.org/jira/browse/CASSANDRA-5677<= /a>=A0would have been addressed long before. =A0But gamely I grabbed the la= test code from the 1.2 branch, built it and I was finally able to get past = half a million rows. =A0

<= br>

--
--047d7b3437e40de0ae04e2421599--