To: user@cassandra.apache.org
From: Alain Rastoul
Subject: Re: How can I scale my read rate?
Date: Mon, 20 Mar 2017 08:10:33 +0100

On 20/03/2017 02:35, S G wrote:
> 2) https://docs.datastax.com/en/developer/java-driver/3.1/manual/statements/prepared/
> tells me to avoid preparing select queries if I expect a change of
> columns in my table down the road.

The problem is also related to "select *", which is considered bad practice with most databases.

> I did some more testing to see if my client machines were the bottleneck.
> For a 6-node Cassandra cluster (each VM having 8 cores), I got 26,000
> reads/sec for all of the following:
> 1) Client nodes: 1, Threads: 60
> 2) Client nodes: 3, Threads: 180
> 3) Client nodes: 5, Threads: 300
> 4) Client nodes: 10, Threads: 600
> 5) Client nodes: 20, Threads: 1200
>
> So adding more client nodes or threads to those client nodes is not
> having any effect.
> I am suspecting Cassandra is simply not allowing me to go any further.
> Primary keys for my schema are:
> PRIMARY KEY((name, phone), age)
> name: text
> phone: int
> age: int

Yes, with such a primary key the data must be spread across the whole cluster (also taking the partitioner into account), so it is strange that the throughput doesn't scale. I guess you have also verified that you select data randomly?

Maybe you could have a look at the system traces to see the query plan for some requests. If you are on a test cluster, you can truncate the trace tables first (truncate system_traces.sessions; and truncate system_traces.events;), run a test, then select * from system_traces.events where session_id = xxxx, xxx being one of the sessions you pick in system_traces.sessions. Try to see if you are not always hitting the same nodes.

--
best,
Alain
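The tracing steps above could look roughly like the following cqlsh session. This is only a sketch: the session_id UUID is a placeholder you would replace with a real id picked from system_traces.sessions, and note that traces only appear if tracing is enabled (e.g. TRACING ON in cqlsh, or probabilistic tracing via nodetool settraceprobability on the server side):

```
-- On a TEST cluster only: clear out old traces first
TRUNCATE system_traces.sessions;
TRUNCATE system_traces.events;

-- ... run the read workload with tracing enabled ...

-- List recent trace sessions and pick a session_id
SELECT session_id, coordinator, duration
FROM system_traces.sessions
LIMIT 20;

-- Inspect the events for one session (placeholder UUID, substitute a real one)
SELECT activity, source, source_elapsed
FROM system_traces.events
WHERE session_id = 00000000-0000-0000-0000-000000000000;
```

Looking at the "source" column across several sessions should show whether the reads are being served by all nodes or keep landing on the same few coordinators/replicas.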