MIME-Version: 1.0
In-Reply-To: <03D0C31A-E944-41B2-9217-B492B18BF238@thelastpickle.com>
References: <5031C615.6070109@yahoo.com> <03D0C31A-E944-41B2-9217-B492B18BF238@thelastpickle.com>
From: Chuan-Heng Hsiao <hsiao.chuanheng@gmail.com>
Date: Tue, 21 Aug 2012 17:12:34 +0800
Message-ID:
Subject: Re: Cassandra with large number of columns per row
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=e89a8f8396e31104d004c7c30847

--e89a8f8396e31104d004c7c30847
Content-Type: text/plain; charset=ISO-8859-1

Thank you very much!
That also cleared up my earlier misunderstanding of the size limitation.

Hsiao

On Tue, Aug 21, 2012 at 5:03 PM, aaron morton <aaron@thelastpickle.com> wrote:

> > I think the limit of the size per row in cassandra is 2G?
>
> That was a pre-0.7 restriction:
> http://wiki.apache.org/cassandra/CassandraLimitations
>
> In general you will want to avoid rows with more than, say, 32 or 64 MB of
> data. It's not a hard restriction, but big rows cause issues and it's often
> easier to avoid them.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 20/08/2012, at 8:15 PM, Chuan-Heng Hsiao <hsiao.chuanheng@gmail.com> wrote:
>
> > I think the limit of the size per row in cassandra is 2G?
> >
> > 10000 x 1M = 10G.
> >
> > Hsiao
> >
> > On Mon, Aug 20, 2012 at 1:07 PM, oupfevph <oupfevph@yahoo.com> wrote:
> >
> > > I set up Cassandra with the default configuration on a clean AWS
> > > instance, and I insert 10000 columns into a row, each column holding
> > > 1 MB of data.
> > > I use this Ruby (version 1.9.3) script:
> > >
> > >     10000.times do
> > >         key = rand(36**8).to_s(36)
> > >         value = rand(36**1024).to_s(36) * 1024
> > >         Cas_client.insert(TestColumnFamily,TestRow,{key=>value})
> > >     end
> > >
> > > Every time I run this script, it crashes:
> > >
> > > /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/socket.rb:109:in `read': CassandraThrift::Cassandra::Client::TransportException
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/base_transport.rb:87:in `read_all'
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/framed_transport.rb:104:in `read_frame'
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/framed_transport.rb:69:in `read_into_buffer'
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/client.rb:45:in `read_message_begin'
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/client.rb:45:in `receive_message'
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/vendor/0.8/gen-rb/cassandra.rb:251:in `recv_batch_mutate'
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/vendor/0.8/gen-rb/cassandra.rb:243:in `batch_mutate'
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/thrift_client-0.8.1/lib/thrift_client/abstract_thrift_client.rb:150:in `handled_proxy'
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/thrift_client-0.8.1/lib/thrift_client/abstract_thrift_client.rb:60:in `batch_mutate'
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/lib/cassandra/protocol.rb:7:in `_mutate'
> > >         from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/lib/cassandra/cassandra.rb:463:in `insert'
> > >         from a.rb:6:in `block in <main>'
> > >         from a.rb:3:in `times'
> > >         from a.rb:3:in `<main>'
> > >
> > > Yet Cassandra keeps running normally. Then I ran another Ruby script to
> > > count how many columns I had inserted:
> > >
> > >     p cas_client.count_columns(TestColumnFamily,TestRow)
> > >
> > > This script crashed too, with the same error message, and the Cassandra
> > > process stays at 100% CPU usage.
> > >
> > >     AWS m1.xlarge type instance (15GB mem, 800GB harddisk, 4 cores cpu)
> > >     cassandra-1.1.2
> > >     ruby-1.9.3-p194
> > >     jdk-7u6-linux-x64
> > >     ruby-gems:
> > >         cassandra (0.15.0)
> > >         thrift (0.8.0)
> > >         thrift_client (0.8.1)
> > >
> > > What is the problem?

--e89a8f8396e31104d004c7c30847--
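
The advice in this thread to keep rows under roughly 32 to 64 MB can be illustrated with a small Ruby sketch. It takes the numbers from the original script (10000 columns of about 1 MB each) and works out how many row keys would be needed if the columns were spread over several rows instead of one. The constant names, the helpers `columns_per_row` and `bucketed_row_key`, and the `TestRow:<n>` key scheme are illustrative assumptions; none of them come from the thread or from the cassandra gem, and the sketch makes no Cassandra calls.

```ruby
# Sizes taken from the thread; all names below are hypothetical.
COLUMN_BYTES  = 1024 * 1024        # ~1 MB per value (a ~1024-char string repeated 1024 times)
COLUMN_COUNT  = 10_000             # columns the original script writes into one row
MAX_ROW_BYTES = 64 * 1024 * 1024   # upper end of the suggested 32-64 MB guideline

# How many columns fit into one row without exceeding the target row size.
def columns_per_row(column_bytes, max_row_bytes)
  [max_row_bytes / column_bytes, 1].max
end

# Row key for the column with the given 0-based index, e.g. "TestRow:0", "TestRow:1", ...
def bucketed_row_key(base_key, column_index, per_row)
  "#{base_key}:#{column_index / per_row}"
end

per_row   = columns_per_row(COLUMN_BYTES, MAX_ROW_BYTES)
row_count = (COLUMN_COUNT + per_row - 1) / per_row   # ceiling division

single_row_gib = COLUMN_COUNT * COLUMN_BYTES / (1024.0**3)
puts "one row holding everything: about #{single_row_gib.round(2)} GiB"
puts "columns per bucketed row:   #{per_row}"
puts "rows needed:                #{row_count}"
puts "key for column 0:           #{bucketed_row_key('TestRow', 0, per_row)}"
puts "key for last column:        #{bucketed_row_key('TestRow', COLUMN_COUNT - 1, per_row)}"
```

In the original loop, each insert would then target the bucketed key for the current iteration rather than the single `TestRow` key. The trade-off is that counting or reading all columns means visiting every bucket.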