From: Leonid Ilyevsky
To: user@cassandra.apache.org
Date: Thu, 12 Jul 2012 16:42:30 -0400
Subject: How to speed up data loading

I am loading a large data set into a CF with a composite key. The load is going pretty slowly: hundreds or even thousands of times slower than it would in an RDBMS.

I have a choice of how granular my physical key (the first component of the primary key) is, so I can balance between smaller rows with many keys and wide rows with fewer keys. What are the guidelines for this? How does the width of the physical row affect the speed of the load?
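
To make the trade-off concrete, here is a rough sketch of what I mean by key granularity. It is not my real schema: it assumes a time-series style data set where the physical key is a series id plus a time bucket, and the names partition_key, row_width_stats and "sensor-1" are made up for illustration. It only counts how many physical rows, and how wide, each bucket size would produce; it does not talk to Cassandra at all.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical bucket formats; a coarser bucket gives fewer, wider physical rows.
BUCKET_FORMATS = {"hour": "%Y%m%d%H", "day": "%Y%m%d", "month": "%Y%m"}


def partition_key(series_id: str, ts: datetime, granularity: str) -> str:
    """Build an illustrative physical row key of the form 'series_id:bucket'."""
    return f"{series_id}:{ts.strftime(BUCKET_FORMATS[granularity])}"


def row_width_stats(records, granularity):
    """Return (number of physical rows, columns in the widest row)."""
    widths = Counter(partition_key(sid, ts, granularity) for sid, ts in records)
    return len(widths), max(widths.values())


# Synthetic sample: a single series with one record per minute for two days.
start = datetime(2012, 7, 11, tzinfo=timezone.utc)
records = [("sensor-1", start + timedelta(minutes=m)) for m in range(2 * 24 * 60)]

if __name__ == "__main__":
    for granularity in BUCKET_FORMATS:
        rows, widest = row_width_stats(records, granularity)
        print(f"{granularity:>5}: {rows} physical rows, widest row has {widest} columns")
```

With this sample, hourly buckets give many narrow rows and monthly buckets give a single wide one. What I do not know is which end of that range loads faster.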

 

I see that Cassandra is doing a lot of processing behind the scenes: even after I kill the client, the server keeps consuming a lot of CPU for a long time.

 

What else should I look at? Anything in the configuration?


