From: aaron morton
Subject: Re: Using map type with composite primary key causes significant performance decrease
Date: Fri, 19 Apr 2013 06:37:14 +1200
To: user@cassandra.apache.org

> After about 1-2K inserts I get significant performance decrease.
A decrease in performance doing what?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 19/04/2013, at 4:43 AM, Oleksandr Petrov <oleksandr.petrov@gmail.com> wrote:

Hi,

I'm trying to persist some event data. I've tried to identify the bottleneck, and it seems to work like this:

If I create a table with a primary key based on (application, environment, type and emitted_at):

CREATE TABLE events (
    application varchar,
    environment varchar,
    type varchar,
    additional_info map<varchar, varchar>,
    hostname varchar,
    emitted_at timestamp,
    PRIMARY KEY (application, environment, type, emitted_at)
);

And insert events via CQL, using prepared statements:

INSERT INTO events (environment, application, hostname, emitted_at, type, additional_info) VALUES (?, ?, ?, ?, ?, ?);

Values are: "local" "analytics" "noname" #inst "2013-04-18T16:37:02.723-00:00" "event_type" {"some" "value"}

After about 1-2K inserts I get significant performance decrease.
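
One way to pin down where the slowdown starts is to time the inserts in fixed-size batches and compare the batches. The following is only a minimal Python sketch under stated assumptions: `time_batches` and `insert_one` are hypothetical names, and `insert_one` stands in for whatever driver call executes the prepared INSERT.

```python
import time
from typing import Callable, List


def time_batches(insert_one: Callable[[int], None], total: int, batch_size: int) -> List[float]:
    """Call insert_one(i) for i in range(total) and return the seconds spent per batch.

    insert_one is a hypothetical callable; in a real test it would execute
    the prepared INSERT for the i-th event.
    """
    durations = []
    for start in range(0, total, batch_size):
        end = min(start + batch_size, total)
        t0 = time.perf_counter()
        for i in range(start, end):
            insert_one(i)
        durations.append(time.perf_counter() - t0)
    return durations


if __name__ == "__main__":
    # Stand-in insert so the sketch runs without a cluster (assumption).
    def fake_insert(i: int) -> None:
        pass

    for n, secs in enumerate(time_batches(fake_insert, total=3000, batch_size=500)):
        print(f"batch {n}: {secs:.6f}s")
```

If later batches take clearly longer than the first few, that would help confirm that it is the write path slowing down rather than, say, client-side setup.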

I've tried using only emitted_at (timestamp) as the primary key, or writing the additional_info data as serialized JSON (varchar) instead of a map. Either change seems to resolve the performance degradation.
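
For illustration, the JSON workaround could look roughly like the Python sketch below. It assumes additional_info is declared as varchar rather than map<varchar, varchar>; the helper names `encode_additional_info` and `decode_additional_info` are hypothetical and not part of any driver.

```python
import json
from typing import Dict


def encode_additional_info(info: Dict[str, str]) -> str:
    """Serialize the map to a compact JSON string for a varchar column."""
    # sort_keys keeps the output stable for identical maps.
    return json.dumps(info, sort_keys=True, separators=(",", ":"))


def decode_additional_info(text: str) -> Dict[str, str]:
    """Parse the stored JSON string back into a dict when reading a row."""
    return json.loads(text)


if __name__ == "__main__":
    stored = encode_additional_info({"some": "value"})
    print(stored)
    print(decode_additional_info(stored))
```

The trade-off is that individual map entries can no longer be read or updated on their own; the whole string is written and parsed each time.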

I'm using Cassandra 1.2.3 from the DataStax repository, running it on a 2-core machine with 2 GB of RAM.

What could I be doing wrong here? What might be causing the performance issues?
Thank you


--
alex p
