From: Paul Ingalls
Date: Wed, 24 Jul 2013 09:43:03 -0700
To: user@cassandra.apache.org
Subject: Re: disappointed
Message-Id: <6C2FA25D-739A-4F41-B986-049D5EF0404C@gmail.com>
In-Reply-To: <00f801ce885e$b3864e70$1a92eb50$@struq.com>

Hi Chris,

Thanks for the response!

What kind of challenges did you run into that kept you from using collections?

I am currently running 4 physical nodes, same as I was with Cassandra 1.1.6. I'm using size tiered compaction. Would changing to level tiered with a large minimum make a big difference, or would it just push the problem off till later?

Yeah, I have run into problems dropping schemas before as well. I was careful this time to start with an empty db folder…

Glad you were successful in your transition… :)

Paul

On Jul 24, 2013, at 4:12 AM, "Christopher Wirt" <chris.wirt@struq.com> wrote:

> Hi Paul,
>
> Sorry to hear you're having a low point.
>
> We ended up not using the collection features of 1.2.
> Instead we store a compressed string containing the map and handle it client side.
>
> We only have fixed schema short rows so no experience with large row compaction.
>
> File descriptors have never got that high for us. But, if you only have a couple of physical nodes with loads of data and small ss-tables maybe they could get that high?
>
> The only time I've had file descriptors get out of hand was when compaction got slightly confused with a new schema when I dropped and recreated instead of truncating. https://issues.apache.org/jira/browse/CASSANDRA-4857 Restarting the node fixed the issue.
>
> From my limited experience I think Cassandra is a dangerous choice for a young limited funding/experience start-up expecting to scale fast. We are a fairly mature start-up with funding. We've just spent 3-5 months moving from Mongo to Cassandra. It's been expensive and painful getting Cassandra to read like Mongo, but we've made it :)
>
> From: Paul Ingalls [mailto:paulingalls@gmail.com]
> Sent: 24 July 2013 06:01
> To: user@cassandra.apache.org
> Subject: disappointed
>
> I want to check in. I'm sad, mad and afraid. I've been trying to get a 1.2 cluster up and working with my data set for three weeks with no success. I've been running a 1.1 cluster for 8 months now with no hiccups, but for me at least 1.2 has been a disaster. I had high hopes for leveraging the new features of 1.2, specifically vnodes and collections. But at this point I can't release my system into production, and will probably need to find a new back end. As a small startup, this could be catastrophic. I'm mostly mad at myself. I took a risk moving to the new tech. I forgot sometimes when you gamble, you lose.
>
> First, the performance of 1.2.6 was horrible when using collections. I wasn't able to push through 500k rows before the cluster became unusable. With a lot of digging, and way too much time, I discovered I was hitting a bug that had just been fixed, but was unreleased. This scared me, because the release was already at 1.2.6 and I would have expected something like https://issues.apache.org/jira/browse/CASSANDRA-5677 to have been addressed long before. But gamely I grabbed the latest code from the 1.2 branch, built it and I was finally able to get past half a million rows.
>
> But then I hit ~4 million rows, and a multitude of problems. Even with the fix above, I was still seeing a ton of compactions failing, specifically the ones for large rows. Not a single large row will compact, they all assert with the wrong size. Worse, and this is what kills the whole thing, I keep hitting a wall with open files, even after dumping the whole DB, dropping vnodes and trying again. Seriously, 650k open file descriptors? When it hits this limit, the whole DB craps out and is basically unusable. This isn't that many rows. I have close to half a billion in 1.1…
>
> I'm now at a standstill. I figure I have two options unless someone here can help me. Neither of them involves 1.2. I can either go back to 1.1 and remove the features that collections added to my service, or I find another data backend that has similar performance characteristics to Cassandra but allows collections type behavior in a scalable manner. Cause as far as I can tell, 1.2 doesn't scale. Which makes me sad, I was proud of what I accomplished with 1.1…
>
> Does anyone know why there are so many open file descriptors? Any ideas on why a large row won't compact?
>
> Paul
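
A minimal sketch of the workaround Christopher describes in the quoted message (keeping the map as a single compressed string and handling it on the client side instead of using a 1.2 collection column). The thread does not say which serialization or compression was actually used, so JSON and zlib are assumptions here, and the function names `pack_map`, `unpack_map` and `update_entry` are hypothetical.

```python
import json
import zlib


def pack_map(mapping):
    """Serialize a dict to compact JSON and compress it to bytes.

    The result is meant to be written to a single blob/text column
    (assumption: the real system may have used another codec).
    """
    raw = json.dumps(mapping, sort_keys=True, separators=(",", ":"))
    return zlib.compress(raw.encode("utf-8"))


def unpack_map(blob):
    """Inverse of pack_map: decompress the bytes and parse the JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def update_entry(blob, key, value):
    """Client-side read-modify-write of one entry.

    The whole map is decoded, changed, and re-encoded, because the
    database only ever sees an opaque value.
    """
    mapping = unpack_map(blob) if blob else {}
    mapping[key] = value
    return pack_map(mapping)
```

The trade-off of this design is that every change rewrites the whole value and the database cannot update a single entry on its own, so two clients updating the same row at the same time can overwrite each other's changes.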