From: Oleksandr Petrov <oleksandr.petrov@gmail.com>
To: user@cassandra.apache.org
Date: Thu, 18 Apr 2013 23:48:20 +0200
Subject: Re: Using map type with composite primary key causes significant performance decrease
In-Reply-To: <63A95BEC-7F88-4E13-B9BE-6DEAB132B195@thelastpickle.com>

Write performance decreases.

Reads are basically blocked, too. Sometimes I have to wait 3-4 seconds to get a count, even though there are only a couple of thousand small entries in the table.


On Thu, Apr 18, 2013 at 8:37 PM, aaron morton <aaron@thelastpickle.com> wrote:
> After about 1-2K inserts I get significant performance decrease.
A decrease in performance doing what?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand


On 19/04/2013, at 4:43 AM, Oleksandr Petrov <oleksandr.petrov@gmail.com> wrote:

Hi,
I'm trying to persist some event data. I've tried to identify the bottleneck, and it seems to work like this:

If I create a table with primary key based on (application, environment, type and emitted_at):

CREATE TABLE events (application varchar, environment varchar, type varchar, additional_info map<varchar, varchar>, hostname varchar, emitted_at timestamp,
PRIMARY KEY (application, environment, type, emitted_at));

And insert events via CQL, prepared statements:

INSERT INTO events (environment, application, hostname, emitted_at, type, additional_info) VALUES (?, ?, ?, ?, ?, ?);

Values are: "local" "analytics" "noname" #inst "2013-04-18T16:37:02.723-00:00" "event_type" {"some" "value"}

After about 1-2K inserts I get significant performance decrease.

I've tried using only emitted_at (timestamp) as a primary key, OR writing additional_info data as a serialized JSON (varchar) instead of Map. Both scenarios seem to solve the performance degradation.
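
The second workaround (storing additional_info as serialized JSON in a varchar column) could look roughly like the sketch below. This is an illustration only: the helper names are hypothetical, Python is used purely for the example since the original client code is not shown in this thread, and the column is assumed to be declared as varchar rather than map<varchar, varchar>.

```python
import json
from datetime import datetime, timezone


def serialize_additional_info(info):
    """Encode a string-to-string map as compact JSON text for a varchar column."""
    # sort_keys keeps the output deterministic, so equal maps yield equal strings.
    return json.dumps(info, sort_keys=True, separators=(",", ":"))


def deserialize_additional_info(text):
    """Decode the stored JSON text back into a dict when reading the row."""
    return json.loads(text)


# Parameters in the column order of the INSERT statement:
# environment, application, hostname, emitted_at, type, additional_info
params = (
    "local",
    "analytics",
    "noname",
    datetime(2013, 4, 18, 16, 37, 2, 723000, tzinfo=timezone.utc),
    "event_type",
    serialize_additional_info({"some": "value"}),  # plain string, not a map
)
```

With this shape, the last bound parameter is an ordinary string, so the row no longer contains a collection column; the trade-off is that individual map entries can no longer be read or updated separately on the Cassandra side.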

I'm using Cassandra 1.2.3 from the DataStax repository, running on a 2-core machine with 2 GB of RAM.

What could I be doing wrong here? What might be causing the performance issues?
Thank you


--
alex p




--
alex p