From: Jason Horman <jhorman@gmail.com>
To: user@cassandra.apache.org
Date: Fri, 8 Oct 2010 13:36:41 -0400
Subject: Cold boot performance problems

We are experiencing very slow performance on Amazon EC2 after a cold boot: 10-20 tps. After the cache is primed things are much better, but it would be nice if users who aren't in the cache didn't see such slow performance. Before dumping a bunch of config, I have some general questions:

- We are using UUID keys, 40M of them, with the random partitioner. The typical access pattern is reading 200-300 keys in a single web request. Are UUID keys going to be painful because they are so random? Should we use less random keys, maybe with a shard prefix (01-80), and make sure our tokens group user data together on the cluster (via the order-preserving partitioner)?
- Would the order-preserving partitioner be a better option in the sense that it would group a single user's data onto a single set of machines (if we added a prefix to the UUID)?
- Is there any benefit to doing sharding of our own via keyspaces, i.e. 80 keyspaces (01-80) to split up the data files? (We already have 80 MySQL shards we are migrating from, so implementation-wise this wouldn't be terrible.)
- Should a goal be to get the data/index files as small as possible? Is there a size at which they become problematic?
  (Amazon EC2/EBS, fyi)
    - via more servers
    - via more Cassandra instances on the same server
    - via manual sharding by keyspace
    - via manual sharding by column family

Thanks,
--
-jason horman
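For illustration, here is roughly what I mean by a shard-prefixed key scheme (a sketch only; the function and key layout are mine, not an existing API). The prefix would be derived from the user id rather than the row UUID, so that all of one user's rows share a bucket:

```python
import hashlib

NUM_SHARDS = 80  # assumption: mirror our 80 existing MySQL shards


def shard_prefix(user_id):
    # Hash the *user* id, not the row UUID, so every row belonging
    # to one user falls in the same 01-80 bucket.
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return "%02d" % ((digest[0] % NUM_SHARDS) + 1)


def row_key(user_id, row_uuid):
    # Illustrative layout "<shard>:<user>:<uuid>": under the
    # order-preserving partitioner, a user's rows would then sort
    # contiguously and land on the same replicas.
    return "%s:%s:%s" % (shard_prefix(user_id), user_id, row_uuid)


print(row_key("user42", "123e4567-e89b-12d3-a456-426614174000"))
```

The point being that the 200-300 keys read per web request would mostly hit one node's cache instead of 200-300 random positions across the cluster.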
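On the token question: if keys carried an 01-80 prefix under the order-preserving partitioner, I imagine initial tokens could just be evenly spaced prefixes, something like this back-of-envelope sketch (the actual token would go in each node's config, e.g. the InitialToken setting, though the element name may differ by version):

```python
NUM_SHARDS = 80  # assumption: same 01-80 prefix space as above


def initial_tokens(num_nodes):
    # Spread the 01-80 prefix space evenly across nodes. With the
    # order-preserving partitioner a token is just a string; each
    # node owns the keys between the previous token and its own.
    return ["%02d" % (i * NUM_SHARDS // num_nodes + 1)
            for i in range(num_nodes)]


print(initial_tokens(4))  # → ['01', '21', '41', '61']
```

That way each node would own a contiguous block of shards, much like our current MySQL shard-to-host mapping.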