To: user@cassandra.apache.org
From: Aaron Morton
Subject: Re: Bulk insertion in Cassandra 0.7 beta3
Date: Tue, 09 Nov 2010 19:53:04 GMT

The node(s) you are connecting to have disappeared; look at the server-side logs for errors. It's probably running out of memory. If so, turn off things like the key cache and key_cache_save_period until you have a stable system, then gradually turn them back on.

You may also want to have a read of this: http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts

As for the throughput, it depends on the HW, how you are doing the inserts, how many clients you have, how big the columns are, and a bunch of other things. But 2GB RAM and 1GB heap is at the low end of the scale.

Hope that helps.
Aaron

On 10 Nov, 2010, at 08:26 AM, Tomas Zulberti <tzulberti@gmail.com> wrote:

We are running some tests using 3 nodes: A, B, C. We are bulk inserting
87500 keys, and for each of them 1 super column with 768 columns.
We are using hector 0.7.0-18 to insert the data, and at some point an
exception is raised, and sometimes the cassandra daemon stops running on
one of the nodes.
The nodes have 2gb of RAM, so the JVM heap is 1gb, and the CPU load
goes up to 80%.

Is it possible to insert that amount of data every 10 minutes? That
would be our use case scenario. We are newbies in cassandra, so maybe
we must take a different approach. What do you suggest?
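
To put the question above in numbers, here is a rough sketch of the write rate that workload implies. The figures 87500 keys, 768 columns and 10 minutes come from the message; the class and method names are made up for illustration, and the calculation ignores replication, retries and column sizes, so treat it as a lower bound on the work the cluster has to do.

```java
class InsertRateEstimate {
    // Columns written per batch: each key carries one super column
    // holding `columnsPerKey` columns.
    static long columnsPerBatch(long keys, long columnsPerKey) {
        return keys * columnsPerKey;
    }

    // Sustained rate needed to finish one batch inside the time window.
    static double columnsPerSecond(long columns, long windowSeconds) {
        return (double) columns / windowSeconds;
    }

    public static void main(String[] args) {
        long keys = 87_500L;           // from the message
        long columnsPerKey = 768L;     // from the message
        long windowSeconds = 10 * 60L; // "every 10 minutes"

        long columns = columnsPerBatch(keys, columnsPerKey);
        double rate = columnsPerSecond(columns, windowSeconds);

        System.out.println(columns + " columns per batch");
        System.out.println(rate + " columns per second");
    }
}
```

That works out to 67,200,000 columns per batch, or 112,000 columns per second sustained, before any replication overhead is counted.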

The keyspace configuration is:
    - name: TestKeyspace
      replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
      replication_factor: 3
      column_families:
        - name: TestFamily
          column_type: Super
          compare_with: LongType
          compare_subcolumns_with: UTF8Type
          keys_cached: 200000
          rows_cached: 0
          key_cache_save_period_in_seconds: 3600
          row_cache_save_period_in_seconds: 0
          memtable_flush_after_mins: 3600
          memtable_throughput_in_mb: 80
          memtable_operations_in_millions: 0.10

The seeds configuration for each node on each machine:
seeds:
    - A

and the exception that is raised is:
Exception in thread "main" me.prettyprint.hector.api.exceptions.HUnavailableException: UnavailableException()
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:36)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:88)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:81)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(FailoverOperator.java:388)
    at me.prettyprint.cassandra.service.FailoverOperator.operateSingleIteration(FailoverOperator.java:194)
    at me.prettyprint.cassandra.service.FailoverOperator.operate(FailoverOperator.java:99)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:123)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:93)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:99)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:142)
    at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:139)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:58)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:139)
    at com.popego.benchmarks.InsertSegmentsHits.batchInsert(InsertSegmentsHits.java:154)
    at com.popego.benchmarks.InsertSegmentsHits.insertData(InsertSegmentsHits.java:131)
    at com.popego.benchmarks.InsertSegmentsHits.main(InsertSegmentsHits.java:177)
Caused by: UnavailableException()
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16633)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:935)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:909)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:86)
    ... 15 more
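
For context on the UnavailableException in the trace above: Cassandra raises it when the coordinator sees fewer live replicas than the requested consistency level needs, which fits nodes dropping out during the load. Below is a minimal sketch of that rule, assuming replication_factor 3 from the quoted configuration. The class and method names are hypothetical stand-ins and not Cassandra or Hector APIs, and the consistency level the client actually used is not stated in this thread.

```java
class ReplicaAvailability {
    // Replicas needed for QUORUM: a strict majority of the replication factor.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Hypothetical stand-in for the coordinator's check: a request is
    // rejected as unavailable when fewer replicas are alive than required.
    static boolean isAvailable(int liveReplicas, int requiredReplicas) {
        return liveReplicas >= requiredReplicas;
    }

    public static void main(String[] args) {
        int replicationFactor = 3; // replication_factor from the quoted configuration
        String[] levels = {"ONE", "QUORUM", "ALL"};
        int[] required = {1, quorum(replicationFactor), replicationFactor};

        // Walk through losing one replica at a time and report each level.
        for (int live = replicationFactor; live >= 0; live--) {
            for (int i = 0; i < levels.length; i++) {
                String outcome = isAvailable(live, required[i]) ? "ok" : "unavailable";
                System.out.println("live=" + live + " " + levels[i] + " -> " + outcome);
            }
        }
    }
}
```

With 3 nodes and a replication factor of 3, every node holds a replica of every key, so a single node going down is enough to fail writes at ALL, and two nodes down fails writes at QUORUM as well.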