Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of simon.reavely@gmail.com
 designates 209.85.212.44 as permitted sender)
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=references:message-id:from:to:in-reply-to:content-type
         :content-transfer-encoding:mime-version:subject:date:cc:x-mailer;
        b=XpxBD8DJyD7oiR7p8llfFFl6B8yZK3DjiiWG7tEnnrZP6hq3oY1m4v9p9zPaPtSMCb
         KpyOItDjagaJbVVS6D+lReHQqRx+2LHumHobrh4h4hf9kIBBtCVj0r/HbfXiLCRoUwgR
         Vsxqs3Fo5yyfP0fY3aTXHs+f4232nzFuMc6sI=
References: <AANLkTinxXbAvLOmS1EgKQcDwAGqZiIUkHKcCzlPplsH1@mail.gmail.com>
 <AANLkTikipJ3KTip-KXXvL_WhKHADPkCkNp_T13x3tYj0@mail.gmail.com>
 <AANLkTinRp6j7v7gXCh24KMNFpS9Bzrw5i_a6KeG-nhhw@mail.gmail.com>
 <AANLkTimI4H6lfGQbxjoM_vy_VS0Ol2_n5alkhX2rbaC7@mail.gmail.com>
 <AANLkTilfCHNVsJD5KmriecHgPvOtFDHXg966JTEYh4bY@mail.gmail.com>
 <008c01cb09c6$60713e80$2153bb80$@com>
 <AANLkTinJcCRHxIb4wDO5uVQLMqsFFKp4PhEhxMvkfeuJ@mail.gmail.com>
 <009701cb09ca$25c93c20$715bb460$@com> <009f01cb09cb$49707930$dc516b90$@com>
 <007801cb09d1$8ae04700$a0a0d500$@com>
Message-Id: <9D948451-B079-4D00-B3E1-70D6C2E035D4@gmail.com>
From: Simon Reavely <simon.reavely@gmail.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
In-Reply-To: <007801cb09d1$8ae04700$a0a0d500$@com>
Content-Type: multipart/alternative; boundary=Apple-Mail-14-72060199
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (iPod Mail 7E18)
Subject: Re: read operation is slow
Date: Fri, 18 Jun 2010 12:44:46 -0400
Cc: "<user@cassandra.apache.org>" <user@cassandra.apache.org>


--Apple-Mail-14-72060199
Content-Type: text/plain;
	charset=utf-8;
	format=flowed;
	delsp=yes
Content-Transfer-Encoding: quoted-printable

Would it perhaps be worth denormalizing your data so that you can =20
retrieve all rows as a single row, using a key that encodes the query =20
predicate?

Until we get a stored-procedure feature (I don't know if one is =20
planned), it's hard to avoid round trips without denormalizing or =20
replicating data to fit your query paths.
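
As a rough sketch of that idea: plain Java with an in-memory map
standing in for a column family. The class name, the "field:value" key
scheme and the sample data are made up for illustration; none of this
is Cassandra or Jassandra API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DenormalizedIndex {
    // Stand-in for a column family: row key -> columns of that row.
    private final Map<String, List<String>> rows = new HashMap<>();

    // Hypothetical key scheme: encode the query predicate as the row key.
    public static String predicateKey(String field, String value) {
        return field + ":" + value;
    }

    // At write time, also append the record to the row for its predicate.
    public void write(String field, String value, String record) {
        rows.computeIfAbsent(predicateKey(field, value),
                k -> new ArrayList<>()).add(record);
    }

    // At read time, one row lookup returns every matching record.
    public List<String> read(String field, String value) {
        List<String> row = rows.get(predicateKey(field, value));
        return row != null ? row : Collections.<String>emptyList();
    }

    public static void main(String[] args) {
        DenormalizedIndex index = new DenormalizedIndex();
        index.write("city", "Boston", "alice");
        index.write("city", "Boston", "bob");
        index.write("city", "Austin", "carol");
        System.out.println(index.read("city", "Boston")); // [alice, bob]
    }
}
```

The cost is an extra write per query path; the gain is that the read
becomes a single round trip instead of one per record.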


Simon Reavely


On Jun 11, 2010, at 9:49 PM, "caribbean410" <caribbean410@gmail.com> =20
wrote:

> Thanks for the suggestion. For the test case, it is 1 key and 1 =20
> column. I once changed 10 to 1, and as I remember there was not much =20
> difference.
>
>
>
> I have 200k keys and each key is randomly generated. I will try the =20
> optimized query next week. But you may still have to handle the case =20
> where each time a client just wants to query one key from the db.
>
>
>
> From: Dop Sun [mailto:sunht@dopsun.com]
> Sent: Friday, June 11, 2010 6:05 PM
> To: user@cassandra.apache.org
> Subject: RE: read operation is slow
>
>
>
> Also, are you only selecting 1 key and 10 columns?
>
>
>
> criteria.keyList(Lists.newArrayList(userName)).columnRange=20
> (nameFirst, nameFirst, 10);
>
>
>
> Then, if you have 200k keys, you have 200k Thrift calls. If this is =20
> the case, you may need to optimize the way you do the query =20
> (combining multiple keys into a single query) to reduce the number =20
> of calls.
>
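
To illustrate the batching suggestion above, here is a minimal sketch
in plain Java. The multiGet function is a hypothetical stand-in for
whatever multi-key read your client offers (it is not a Jassandra or
Thrift signature), and the batch size of 100 is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class BatchedReads {
    // Split keys into consecutive batches of at most batchSize.
    public static <T> List<List<T>> partition(List<T> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += batchSize) {
            int end = Math.min(i + batchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(i, end)));
        }
        return batches;
    }

    // Issue one multiGet call (one round trip) per batch and merge results.
    public static Map<String, String> readAll(List<String> keys,
            int batchSize,
            Function<List<String>, Map<String, String>> multiGet) {
        Map<String, String> result = new HashMap<>();
        for (List<String> batch : partition(keys, batchSize)) {
            result.putAll(multiGet.apply(batch));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 200_000; i++) {
            keys.add("user" + i);
        }
        int[] roundTrips = {0};
        Map<String, String> rows = readAll(keys, 100, batch -> {
            roundTrips[0]++; // one simulated call per batch
            Map<String, String> page = new HashMap<>();
            for (String key : batch) {
                page.put(key, "value-of-" + key);
            }
            return page;
        });
        System.out.println(rows.size() + " rows in " + roundTrips[0]
                + " round trips");
    }
}
```

With 200k keys and batches of 100, the simulated client makes 2,000
calls instead of 200,000.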
>
>
> From: Dop Sun [mailto:sunht@dopsun.com]
> Sent: Saturday, June 12, 2010 8:57 AM
> To: user@cassandra.apache.org
> Subject: RE: read operation is slow
>
>
>
> You mean that after you removed some unnecessary column families =20
> and changed the size of the row cache and key cache, bringing the =20
> latency from 0.25ms to 0.09ms (in essence 0.09ms*200k=3D18s), it =20
> still takes 400 seconds to return?
>
>
>
> From: Caribbean410 [mailto:caribbean410@gmail.com]
> Sent: Saturday, June 12, 2010 8:48 AM
> To: user@cassandra.apache.org
> Subject: Re: read operation is slow
>
>
>
> Hi, do you mean this one should not introduce much extra delay? To =20
> read a record, I need the select here, so I am not sure where the =20
> extra delay comes from.
>
> On Fri, Jun 11, 2010 at 5:29 PM, Dop Sun <sunht@dopsun.com> wrote:
>
> Jassandra is used here:
>
>
>
> Map<String, List<IColumn>> map =3D criteria.select();
>
>
>
> The select here basically is a call to Thrift API: get_range_slices
>
>
>
>
>
> From: Caribbean410 [mailto:caribbean410@gmail.com]
> Sent: Saturday, June 12, 2010 8:00 AM
>
>
> To: user@cassandra.apache.org
> Subject: Re: read operation is slow
>
>
>
> I removed some unnecessary column families and changed the size of =20
> the row cache and key cache; the latency went from 0.25ms to =20
> 0.09ms. In essence 0.09ms*200k=3D18s. I don't know why it takes more =20
> than 400s in total. Here is the client code and cfstats. There are =20
> not many operations here, so why is the extra time so large?
>
>
>
>               long start =3D System.currentTimeMillis();
>               for (int j =3D 0; j < 1; j++) {
>                   for (int i =3D 0; i < numOfRecords; i++) {
>                       int n =3D random.nextInt(numOfRecords);
>                       ICriteria criteria =3D cf.createCriteria();
>                       userName =3D keySet[n];
>                       criteria.keyList(Lists.newArrayList(userName))
>                               .columnRange(nameFirst, nameFirst, 10);
>                       Map<String, List<IColumn>> map =3D
>                               criteria.select();
>                       List<IColumn> list =3D map.get(userName);
> //                      ByteArray bloc =3D list.get(0).getValue();
> //                      byte[] byteArrayloc =3D bloc.toByteArray();
> //                      loc =3D new String(byteArrayloc);
> //                      readBytes =3D readBytes + loc.length();
>                       readBytes =3D readBytes + blobSize;
>                   }
>               }
>
>             long finish=3DSystem.currentTimeMillis();
>
>             float totalTime=3D(finish-start)/1000f;
>
>
> Keyspace: Keyspace1
>     Read Count: 600000
>     Read Latency: 0.09053006666666667 ms.
>     Write Count: 200000
>     Write Latency: 0.01504989 ms.
>     Pending Tasks: 0
>         Column Family: Standard2
>         SSTable count: 3
>         Space used (live): 265990358
>         Space used (total): 265990358
>         Memtable Columns Count: 2615
>         Memtable Data Size: 2667300
>         Memtable Switch Count: 3
>         Read Count: 600000
>         Read Latency: 0.091 ms.
>         Write Count: 200000
>         Write Latency: 0.015 ms.
>         Pending Tasks: 0
>         Key cache capacity: 10000000
>         Key cache size: 187465
>         Key cache hit rate: 0.0
>         Row cache capacity: 10000000
>         Row cache size: 189990
>         Row cache hit rate: 0.68335
>         Compacted row minimum size: 0
>         Compacted row maximum size: 0
>         Compacted row mean size: 0
>
> ----------------
> Keyspace: system
>     Read Count: 1
>     Read Latency: 10.954 ms.
>     Write Count: 4
>     Write Latency: 0.28075 ms.
>     Pending Tasks: 0
>         Column Family: HintsColumnFamily
>         SSTable count: 0
>         Space used (live): 0
>         Space used (total): 0
>         Memtable Columns Count: 0
>         Memtable Data Size: 0
>         Memtable Switch Count: 0
>         Read Count: 0
>         Read Latency: NaN ms.
>         Write Count: 0
>         Write Latency: NaN ms.
>         Pending Tasks: 0
>         Key cache capacity: 1
>         Key cache size: 0
>         Key cache hit rate: NaN
>         Row cache: disabled
>         Compacted row minimum size: 0
>         Compacted row maximum size: 0
>         Compacted row mean size: 0
>
>         Column Family: LocationInfo
>         SSTable count: 2
>         Space used (live): 3232
>         Space used (total): 3232
>         Memtable Columns Count: 2
>         Memtable Data Size: 46
>         Memtable Switch Count: 1
>         Read Count: 1
>         Read Latency: 10.954 ms.
>         Write Count: 4
>         Write Latency: 0.281 ms.
>         Pending Tasks: 0
>         Key cache capacity: 1
>         Key cache size: 1
>         Key cache hit rate: 0.0
>         Row cache: disabled
>         Compacted row minimum size: 0
>         Compacted row maximum size: 0
>         Compacted row mean size: 0
>
> ----------------
>
> On Fri, Jun 11, 2010 at 1:50 PM, Jonathan Ellis <jbellis@gmail.com>
> wrote:
>
> you need to look at cfstats to see what the latency is internal to
> cassandra, vs what your client is introducing
>
> then you should probably read the comments in the configuration file
> about caching
>
>
> On Fri, Jun 11, 2010 at 9:38 AM, Caribbean410 =20
> <caribbean410@gmail.com> wrote:
> >
> > Thanks Riyad.
> >
> > Right now I am just testing Cassandra on a single node. The server
> > and client are running on the same machine. I tried the read test
> > again on two machines; on one machine the CPU usage is around 30%
> > most of the time, and on the other it is 90%.
> >
> > Pelops is one way to access Cassandra; there are also other Java
> > clients like Hector and Jassandra. Will these Java clients have
> > significantly different performance?
> >
> > I also tried changing the storage configuration file: moving
> > CommitLogDirectory and DataFileDirectory to different disks, changing
> > DiskAccessMode to mmap for a 64-bit machine, and changing
> > ConcurrentReads from 8 to 2. None of these changed performance much.
> >
> > For other users who use a different client, such as PHP, C++, or
> > Python: if you have any experience in boosting read performance,
> > you are more than welcome to share it with me. Thanks,
> >
> > On Fri, Jun 11, 2010 at 8:19 AM, Riyad Kalla <rkalla@gmail.com> wrote:
> >>
> >> Caribbean410,
> >>
> >> This comes up on the Redis list a lot as well -- what you are
> >> actually measuring is the client opening a network connection to
> >> the Cassandra server and the server replying -- so the performance
> >> numbers you are getting can easily be 70% network wait time and not
> >> necessarily hardcore read/write server performance.
> >> One way to see if this is the case: run your read test, then watch
> >> the CPU on the server for the Cassandra process and see if it's
> >> pegging the CPU -- if it's just sitting there bouncing between
> >> 0-10%, then you are spending most of your time waiting on network
> >> I/O (opening/closing sockets, etc.).
> >> If you can parallelize your test to spawn, say, 5 threads that all
> >> do the same thing, see if the performance for each thread increases
> >> linearly -- which would indicate Cassandra is plenty fast in your
> >> setup and you just need to use more client threads over the network.
> >> That new Java library, Pelops by Dominic
> >> (http://ria101.wordpress.com/2010/06/11/pelops-the-beautiful-=
cassandra-database-client-for-java/),
> >> has a nice intrinsic node-balancing design that could be handy IF
> >> you are using multiple nodes. If you are just testing against 1
> >> node, then spawn multiple threads of your code above and see how
> >> each thread's performance scales.
> >> -R
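
A minimal sketch of the multi-threaded test suggested above, in plain
Java. readOne is a hypothetical stand-in that just sleeps for 1 ms to
mimic a blocking round trip; in a real test each worker would issue
its own reads, and I am assuming each worker would also need its own
connection rather than sharing one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelReadTest {
    // Hypothetical stand-in for one blocking read (about 1 ms of waiting).
    static int readOne(int key) throws InterruptedException {
        Thread.sleep(1);
        return 1;
    }

    // Run readsPerThread reads on each of `threads` workers and
    // return the total number of completed reads.
    public static long runReads(int threads, int readsPerThread)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int offset = t * readsPerThread;
                Callable<Long> task = () -> {
                    long done = 0;
                    for (int i = 0; i < readsPerThread; i++) {
                        done += readOne(offset + i);
                    }
                    return done;
                };
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get(); // wait for each worker to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        for (int threads : new int[] {1, 5}) {
            long start = System.nanoTime();
            long reads = runReads(threads, 200);
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("%d thread(s): %d reads, %.0f reads/s%n",
                    threads, reads, reads / seconds);
        }
    }
}
```

If the reads are dominated by waiting, the reads/s figure should grow
with the thread count; if it does not, the bottleneck is elsewhere.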
> >> On Thu, Jun 10, 2010 at 2:39 PM, Caribbean410
> >> <caribbean410@gmail.com> wrote:
> >>>
> >>> Hello,
> >>>
> >>> I am testing the performance of Cassandra. We write 200k records to
> >>> the database and each record is 1k in size. Then we read these 200k
> >>> records. It takes more than 400s to finish the read, which is much
> >>> slower than MySQL (around 20s). I read some discussion online and
> >>> someone suggested making multiple connections to make it faster. But
> >>> I am not sure how to do it: do I need to change my storage settings
> >>> file or just change the Java client code?
> >>>
> >>> Here is my read code,
> >>>
> >>> Properties info =3D new Properties();
> >>> info.put(DriverManager.CONSISTENCY_LEVEL,
> >>>         ConsistencyLevel.ONE.toString());
> >>>
> >>> IConnection connection =3D DriverManager.getConnection(
> >>>         "thrift://localhost:9160", info);
> >>>
> >>> // 2. Get a KeySpace by name
> >>> IKeySpace keySpace =3D connection.getKeySpace("Keyspace1");
> >>>
> >>> // 3. Get a ColumnFamily by name
> >>> IColumnFamily cf =3D keySpace.getColumnFamily("Standard2");
> >>>
> >>> ByteArray nameFirst =3D ByteArray.ofASCII("first");
> >>> ICriteria criteria =3D cf.createCriteria();
> >>> long readBytes =3D 0;
> >>> long start =3D System.currentTimeMillis();
> >>> for (int i =3D 0; i < numOfRecords; i++) {
> >>>     int n =3D random.nextInt(numOfRecords);
> >>>     userName =3D keySet[n];
> >>>     criteria.keyList(Lists.newArrayList(userName))
> >>>             .columnRange(nameFirst, nameFirst, 10);
> >>>     Map<String, List<IColumn>> map =3D criteria.select();
> >>>     List<IColumn> list =3D map.get(userName);
> >>>     ByteArray bloc =3D list.get(0).getValue();
> >>>     byte[] byteArrayloc =3D bloc.toByteArray();
> >>>     loc =3D new String(byteArrayloc);
> >>>     // System.out.println(userName + " " + loc);
> >>>     readBytes =3D readBytes + loc.length();
> >>> }
> >>>
> >>> long finish =3D System.currentTimeMillis();
> >>>
> >>> I once commented out these lines
> >>>
> >>> ByteArray bloc =3D list.get(0).getValue();
> >>> byte[] byteArrayloc =3D bloc.toByteArray();
> >>> loc =3D new String(byteArrayloc);
> >>> // System.out.println(userName + " " + loc);
> >>> readBytes =3D readBytes + loc.length();
> >>>
> >>> and the performance didn't improve much.
> >>>
> >>> Any suggestion is welcome. Thanks,
> >
> >
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
>
>
>

--Apple-Mail-14-72060199
Content-Type: text/html;
	charset=utf-8
Content-Transfer-Encoding: quoted-printable

<html><body bgcolor=3D"#FFFFFF"><div>Would it perhaps be worth =
denormalizing your data so that you can retrieve all rows as a single =
row, using a key that encodes the query =
predicate?</div><div><br></div><div>Until we get a stored-procedure =
feature (I don't know if one is planned), it's hard to avoid round =
trips without denormalizing or replicating data to fit your query =
paths.</div><div>&nbsp;<br><br>Simon =
Reavely<div><br></div></div><div><br>On Jun 11, 2010, at 9:49 PM, =
"caribbean410" &lt;<a =
href=3D"mailto:caribbean410@gmail.com">caribbean410@gmail.com</a>&gt; =
wrote:<br><br></div><div></div><blockquote type=3D"cite"><div>

<div class=3D"WordSection1">

<p class=3D"MsoNormal"><span =
style=3D"font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&q=
uot;">Thanks
for the suggestion. For the test case, it is 1 key and 1 column. I once =
changed
10 to 1, and as I remember there was not much =
difference.<o:p></o:p></span></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&q=
uot;"><o:p>&nbsp;</o:p></span></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&q=
uot;">I
have 200k keys and each key is randomly generated. I will try the =
optimized
query next week. But maybe you still have to face the case that each =
time a
client just wants to query one key from db.<o:p></o:p></span></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&q=
uot;"><o:p>&nbsp;</o:p></span></p>

<div>

<div style=3D"border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt =
0in 0in 0in">

<p class=3D"MsoNormal"><b><span =
style=3D"font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&=
quot;">From:</span></b><span =
style=3D"font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&=
quot;"> Dop Sun
[mailto:sunht@dopsun.com] <br>
<b>Sent:</b> Friday, June 11, 2010 6:05 PM<br>
<b>To:</b> <a href=3D"mailto:user@cassandra.apache.org"><a =
href=3D"mailto:user@cassandra.apache.org">user@cassandra.apache.org</a></a=
><br>
<b>Subject:</b> RE: read operation is slow<o:p></o:p></span></p>

</div>

</div>

<p class=3D"MsoNormal"><o:p>&nbsp;</o:p></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif=
&quot;;
color:#1F497D">Also, are you only selecting <b><u>1</u></b> key and =
<b><u>10</u></b>
columns?<o:p></o:p></span></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif=
&quot;;
color:#1F497D"><o:p>&nbsp;</o:p></span></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif=
&quot;;
=
color:#1F497D">criteria.keyList(Lists.newArrayList(userName)).columnRange(=
nameFirst,
nameFirst, 10);<o:p></o:p></span></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif=
&quot;;
color:#1F497D"><o:p>&nbsp;</o:p></span></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif=
&quot;;
color:#1F497D">Then, if you have 200k keys, you have 200k Thrift calls.
&nbsp;If this is the case, you may need to optimize the way you do the =
query
(to combine multiple keys into a single query), and to reduce the number =
of
calls.<o:p></o:p></span></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif=
&quot;;
color:#1F497D"><o:p>&nbsp;</o:p></span></p>

<div>

<div style=3D"border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt =
0in 0in 0in">

<p class=3D"MsoNormal"><b><span =
style=3D"font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&=
quot;">From:</span></b><span =
style=3D"font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&=
quot;"> Dop Sun
[mailto:sunht@dopsun.com] <br>
<b>Sent:</b> Saturday, June 12, 2010 8:57 AM<br>
<b>To:</b> <a href=3D"mailto:user@cassandra.apache.org"><a =
href=3D"mailto:user@cassandra.apache.org">user@cassandra.apache.org</a></a=
><br>
<b>Subject:</b> RE: read operation is slow<o:p></o:p></span></p>

</div>

</div>

<p class=3D"MsoNormal"><o:p>&nbsp;</o:p></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif=
&quot;;
color:#1F497D">You mean that after you removed some unnecessary =
column families and
changed the size of the row cache and key cache, bringing the latency =
from 0.25ms
to 0.09ms (in essence 0.09ms*200k=3D18s), it still takes 400 =
seconds to return?<o:p></o:p></span></p>

<p class=3D"MsoNormal"><span =
style=3D"font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif=
&quot;;
color:#1F497D"><o:p>&nbsp;</o:p></span></p>

<div style=3D"border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt =
0in 0in 0in">

<p class=3D"MsoNormal"><b><span =
style=3D"font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&=
quot;">From:</span></b><span =
style=3D"font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&=
quot;"> Caribbean410
[mailto:caribbean410@gmail.com] <br>
<b>Sent:</b> Saturday, June 12, 2010 8:48 AM<br>
<b>To:</b> <a href=3D"mailto:user@cassandra.apache.org"><a =
href=3D"mailto:user@cassandra.apache.org">user@cassandra.apache.org</a></a=
><br>
<b>Subject:</b> Re: read operation is slow<o:p></o:p></span></p>

</div>

<p class=3D"MsoNormal"><o:p>&nbsp;</o:p></p>

<p class=3D"MsoNormal" style=3D"margin-bottom:12.0pt">Hi, do you mean =
this one should
not introduce much extra delay? To read a record, I need the select here, =
so I am not sure
where the extra delay comes from.<o:p></o:p></p>

<div>

<p class=3D"MsoNormal">On Fri, Jun 11, 2010 at 5:29 PM, Dop Sun &lt;<a =
href=3D"mailto:sunht@dopsun.com"><a =
href=3D"mailto:sunht@dopsun.com">sunht@dopsun.com</a></a>&gt; =
wrote:<o:p></o:p></p>

<div>

<div>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span =
style=3D"font-size:11.0pt;color:#1F497D">Jassandra is used =
here:</span><o:p></o:p></p>

<div>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span =
style=3D"font-size:11.0pt;color:#1F497D">&nbsp;</span><o:p></o:p></p>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span =
style=3D"font-size:11.0pt;font-family:&quot;Courier =
New&quot;;color:#1F497D">Map&lt;String,
List&lt;IColumn&gt;&gt; map =3D criteria.select();</span><o:p></o:p></p>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span =
style=3D"font-size:11.0pt;color:#1F497D">&nbsp;</span><o:p></o:p></p>

</div>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span =
style=3D"font-size:11.0pt;color:#1F497D">The select here basically is a =
call to
Thrift API: </span><span =
style=3D"font-size:11.0pt;font-family:&quot;Courier New&quot;;
color:#1F497D">get_range_slices</span><o:p></o:p></p>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span =
style=3D"font-size:11.0pt;color:#1F497D">&nbsp;</span><o:p></o:p></p>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span =
style=3D"font-size:11.0pt;color:#1F497D">&nbsp;</span><o:p></o:p></p>

<div style=3D"border:none;border-top:solid windowtext =
1.0pt;padding:3.0pt 0in 0in 0in;
border-color:-moz-use-text-color -moz-use-text-color">

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span =
style=3D"font-size:10.0pt">From:</span></b><span =
style=3D"font-size:10.0pt">
Caribbean410 [mailto:<a href=3D"mailto:caribbean410@gmail.com" =
target=3D"_blank"><a =
href=3D"mailto:caribbean410@gmail.com">caribbean410@gmail.com</a></a>]
<br>
<b>Sent:</b> Saturday, June 12, 2010 8:00 AM<o:p></o:p></span></p>

<div>

<p class=3D"MsoNormal"><span style=3D"font-size:10.0pt"><br>
<b>To:</b> <a href=3D"mailto:user@cassandra.apache.org" =
target=3D"_blank"><a =
href=3D"mailto:user@cassandra.apache.org">user@cassandra.apache.org</a></a=
><br>
<b>Subject:</b> Re: read operation is slow<o:p></o:p></span></p>

</div>

</div>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">&nbsp;<o:p></=
o:p></p>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;margin-bottom:12.0pt">I
removed some unnecessary column families and changed the size of the =
row cache and
key cache; the latency went from 0.25ms to 0.09ms. In essence =
0.09ms*200k=3D18s.
I don't know why it takes more than 400s in total. Here is the client =
code and
cfstats. There are not many operations here, so why is the extra time so =
large?<o:p></o:p></p>

<div>

<div>

<p class=3D"MsoNormal" style=3D"margin-bottom:12.0pt"><br>
<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
long start
=3D System.currentTimeMillis();<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for =
(int j
=3D 0; j &lt; 1; j++) {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp; for (int i =3D 0; i &lt; numOfRecords; i++) {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp; &nbsp; int n =3D random.nextInt(numOfRecords);<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ICriteria criteria =3D =
cf.createCriteria();<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; userName =3D keySet[n];<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst,
nameFirst, 10); &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Map&lt;String, List&lt;IColumn&gt;&gt; =
map =3D
criteria.select(); <br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; List&lt;IColumn&gt; list =3D =
map.get(userName); <br>
//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ByteArray bloc =3D =
list.get(0).getValue();<br>
//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; byte[] byteArrayloc =3D =
bloc.toByteArray();<br>
//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; loc =3D new
String(byteArrayloc);&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br>
//&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; readBytes =3D readBytes + =
loc.length();<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; readBytes =3D readBytes + blobSize;<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; =
&nbsp;&nbsp;&nbsp;
&nbsp; }<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
}<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; long
finish=3DSystem.currentTimeMillis();<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; float =
totalTime=3D(finish-start)/1000f;<br>
<br>
<br>
Keyspace: Keyspace1<br>
&nbsp;&nbsp;&nbsp; Read Count: 600000<br>
&nbsp;&nbsp;&nbsp; Read Latency: 0.09053006666666667 ms.<br>
&nbsp;&nbsp;&nbsp; Write Count: 200000<br>
&nbsp;&nbsp;&nbsp; Write Latency: 0.01504989 ms.<br>
&nbsp;&nbsp;&nbsp; Pending Tasks: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Column Family: Standard2<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; SSTable count: 3<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Space used (live): 265990358<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Space used (total): 265990358<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Memtable Columns Count: 2615<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Memtable Data Size: 2667300<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Memtable Switch Count: 3<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Read Count: 600000<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Read Latency: 0.091 ms.<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Write Count: 200000<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Write Latency: 0.015 ms.<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Pending Tasks: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Key cache capacity: 10000000<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Key cache size: 187465<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Key cache hit rate: 0.0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Row cache capacity: 10000000<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Row cache size: 189990<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Row cache hit rate: 0.68335<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Compacted row minimum size: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Compacted row maximum size: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Compacted row mean size: 0<br>
<br>
----------------<br>
Keyspace: system<br>
&nbsp;&nbsp;&nbsp; Read Count: 1<br>
&nbsp;&nbsp;&nbsp; Read Latency: 10.954 ms.<br>
&nbsp;&nbsp;&nbsp; Write Count: 4<br>
&nbsp;&nbsp;&nbsp; Write Latency: 0.28075 ms.<br>
&nbsp;&nbsp;&nbsp; Pending Tasks: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Column Family: =
HintsColumnFamily<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; SSTable count: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Space used (live): 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Space used (total): 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Memtable Columns Count: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Memtable Data Size: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Memtable Switch Count: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Read Count: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Read Latency: NaN ms.<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Write Count: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Write Latency: NaN ms.<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Pending Tasks: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Key cache capacity: 1<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Key cache size: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Key cache hit rate: NaN<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Row cache: disabled<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Compacted row minimum size: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Compacted row maximum size: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Compacted row mean size: 0<br>
<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Column Family: LocationInfo<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; SSTable count: 2<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Space used (live): 3232<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Space used (total): 3232<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Memtable Columns Count: 2<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Memtable Data Size: 46<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Memtable Switch Count: 1<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Read Count: 1<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Read Latency: 10.954 ms.<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Write Count: 4<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Write Latency: 0.281 ms.<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Pending Tasks: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Key cache capacity: 1<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Key cache size: 1<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Key cache hit rate: 0.0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Row cache: disabled<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Compacted row minimum size: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Compacted row maximum size: 0<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Compacted row mean size: 0<br>
<br>
----------------<o:p></o:p></p>

</div>

</div>

<div>

<div>

<div>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">On
Fri, Jun 11, 2010 at 1:50 PM, Jonathan Ellis &lt;<a =
href=3D"mailto:jbellis@gmail.com" target=3D"_blank"><a =
href=3D"mailto:jbellis@gmail.com">jbellis@gmail.com</a></a>&gt;
wrote:<o:p></o:p></p>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">you
need to look at cfstats to see what the latency is internal to<br>
cassandra, vs what your client is introducing<br>
<br>
then you should probably read the comments in the configuration file<br>
about caching<o:p></o:p></p>

<div>

<div>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;margin-bottom:12.0pt"><br>
On Fri, Jun 11, 2010 at 9:38 AM, Caribbean410 &lt;<a =
href=3D"mailto:caribbean410@gmail.com" target=3D"_blank"><a =
href=3D"mailto:caribbean410@gmail.com">caribbean410@gmail.com</a></a>&gt;
wrote:<br>
&gt;<br>
&gt; Thanks Riyad.<br>
&gt;<br>
&gt; Right now I am just testing Cassandra on a single node. The server =
and
client<br>
&gt; are running on the same machine. I tried the read test again on =
two<br>
&gt; machines, on one machine the cpu usage is around 30% most of the =
time and<br>
&gt; another is 90%.<br>
&gt;<br>
&gt; Pelops is one way to access Cassandra, there are also other java =
client
like<br>
&gt; hector and jassandra, will these java clients have significantly =
different<br>
&gt; performance?<br>
&gt;<br>
&gt; Also I once tried to change the storage configuration file, like =
change<br>
&gt; CommitLogDirectory and DataFileDirectory to different disks, =
change<br>
&gt; DiskAccessMode to mmap for a 64bit machine, and change =
ConcurrentReads
from<br>
&gt; 8 to 2. None of these changed performance much.<br>
&gt;<br>
&gt; For other users who use a different access client, such as php, =
c++,<br>
&gt; python, etc, if you have any experience in boosting the read =
performance,<br>
&gt; you are more than welcome to share with me. Thanks,<br>
&gt;<br>
&gt; On Fri, Jun 11, 2010 at 8:19 AM, Riyad Kalla &lt;<a =
href=3D"mailto:rkalla@gmail.com" target=3D"_blank">rkalla@gmail.com</a>=
&gt; wrote:<br>
&gt;&gt;<br>
&gt;&gt; Caribbean410,<br>
&gt;&gt;<br>
&gt;&gt; This comes up on the Redis list a lot as well -- what you are =
actually<br>
&gt;&gt; measuring is the client opening a network connection to the =
Cassandra server
and<br>
&gt;&gt; it replying -- so the performance numbers you are getting can =
easily
be 70%<br>
&gt;&gt; network wait time and not necessarily hardcore read/write =
server<br>
&gt;&gt; performance.<br>
&gt;&gt; One way to see if this is the case, run your read test, then =
watch the
CPU<br>
&gt;&gt; on the server for the Cassandra process and see if it's pegging =
the
CPU --<br>
&gt;&gt; if it's just sitting there banging between 0-10%, then you are =
spending
most<br>
&gt;&gt; of your time waiting on network i/o (open/close sockets, =
etc.)<br>
&gt;&gt; If you can parallelize your test to spawn say 5 threads that =
all do
the<br>
&gt;&gt; same thing, see if the performance for each thread
increases&nbsp;linearly --<br>
&gt;&gt; which would indicate Cassandra is plenty fast in your setup, =
you just
need<br>
&gt;&gt; to utilize more client threads over the network.<br>
&gt;&gt; That new Java library, Pelops by Dominic<br>
&gt;&gt; (<a =
href=3D"http://ria101.wordpress.com/2010/06/11/pelops-the-beautiful-cassan=
dra-database-client-for-java/" target=3D"_blank">=
http://ria101.wordpress.com/2010/06/11/pelo=
ps-the-beautiful-cassandra-database-client-for-java/</a>)<br>
&gt;&gt; has a nice intrinsic node-balancing design that could be handy =
IF you
are<br>
&gt;&gt; using multiple nodes. If you are just testing against 1 node, =
then
spawn<br>
&gt;&gt; multiple threads of your code above and see how each thread's
performance<br>
&gt;&gt; scales.<br>
&gt;&gt; -R<br>
&gt;&gt; On Thu, Jun 10, 2010 at 2:39 PM, Caribbean410 &lt;<a =
href=3D"mailto:caribbean410@gmail.com" target=3D"_blank">=
caribbean410@gmail.com</a>&gt;<br>
&gt;&gt; wrote:<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; Hello,<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; I am testing the performance of cassandra. We write 200k =
records
to the<br>
&gt;&gt;&gt; database and each record is 1k in size. Then we read these =
200k
records.<br>
&gt;&gt;&gt; It takes more than 400s to finish the read which is much =
slower
than<br>
&gt;&gt;&gt; mysql (around 20s). I read some discussion online and =
someone
suggested<br>
&gt;&gt;&gt; to make multiple connections to make it faster. But I am =
not sure
how<br>
&gt;&gt;&gt; to do it, do I need to change my storage setting file or =
just
change<br>
&gt;&gt;&gt; the java client code?<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; Here is my read code,<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; Properties info =3D new Properties();<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; info.put(DriverManager.CONSISTENCY_LEVEL,<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
ConsistencyLevel.ONE.toString());<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; IConnection connection =3D DriverManager.getConnection(<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
"thrift://localhost:9160",
info);<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; // 2. Get a KeySpace by name<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; IKeySpace keySpace =3D<br>
&gt;&gt;&gt; connection.getKeySpace("Keyspace1");<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; // 3. Get a ColumnFamily by name<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; IColumnFamily cf =3D<br>
&gt;&gt;&gt; keySpace.getColumnFamily("Standard2");<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; ByteArray nameFirst =3D ByteArray.ofASCII("first");<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; ICriteria criteria =3D cf.createCriteria();<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; long readBytes =3D 0;<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; long start =3D System.currentTimeMillis();<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; for (int i =3D 0; i &lt; numOfRecords; i++) =
{<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; int n =3D
random.nextInt(numOfRecords);<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
userName
=3D keySet[n];<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt;
=
criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst,<br>
&gt;&gt;&gt; nameFirst, 10);<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
Map&lt;String, List&lt;IColumn&gt;&gt; map =3D<br>
&gt;&gt;&gt; criteria.select();<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
List&lt;IColumn&gt; list =3D<br>
&gt;&gt;&gt; map.get(userName);<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
ByteArray
bloc =3D<br>
&gt;&gt;&gt; list.get(0).getValue();<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
byte[]
byteArrayloc =3D<br>
&gt;&gt;&gt; bloc.toByteArray();<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
loc =3D new
String(byteArrayloc);<br>
&gt;&gt;&gt; //&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
System.out.println(userName+"<br>
&gt;&gt;&gt; "+loc);<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
readBytes
=3D readBytes +<br>
&gt;&gt;&gt; loc.length();<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; }<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; long finish=3DSystem.currentTimeMillis();<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; I once commented out these lines<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
ByteArray
bloc =3D<br>
&gt;&gt;&gt; list.get(0).getValue();<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
byte[]
byteArrayloc =3D<br>
&gt;&gt;&gt; bloc.toByteArray();<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
loc =3D new
String(byteArrayloc);<br>
&gt;&gt;&gt; //&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
System.out.println(userName+"<br>
&gt;&gt;&gt; "+loc);<br>
&gt;&gt;&gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
readBytes
=3D readBytes +<br>
&gt;&gt;&gt; loc.length();<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; And the performance doesn't improve much.<br>
&gt;&gt;&gt;<br>
&gt;&gt;&gt; Any suggestion is welcome. Thanks,<br>
&gt;<br>
&gt;<o:p></o:p></p>

</div>

</div>

<p class=3D"MsoNormal" =
style=3D"mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span =
style=3D"color:#888888">--<br>
Jonathan Ellis<br>
Project Chair, Apache Cassandra<br>
co-founder of Riptano, the source for professional Cassandra support<br>
<a href=3D"http://riptano.com" target=3D"_blank">http://riptano.com</a>=
</span><o:p></o:p></p>

</div>


</div>

</div>

</div>

</div>

</div>

<p class=3D"MsoNormal"><o:p>&nbsp;</o:p></p>

</div>


</div></blockquote></body></html>=

--Apple-Mail-14-72060199--