From: Simon Reavely
To: user@cassandra.apache.org
Subject: Re: read operation is slow
Date: Fri, 18 Jun 2010 12:44:46 -0400

Would it perhaps be worth denormalising your data so that you can retrieve all rows as a single row using a key encoded with the query predicate?

Until we get a stored procedure feature (I don't know if one is planned), it's hard to avoid round trips without denormalizing or replicating data to fit your query paths.

Simon Reavely
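
As a rough illustration of the "key encoded with the query predicate" idea above, here is a minimal in-memory sketch. It does not use any Cassandra client API: the class `DenormalizedStore`, the key scheme `<query name>:<predicate value>`, and the example query name `users_by_city` are all hypothetical, and a plain map stands in for a column family. The point is only that the write path stores each record under the key the read path will later ask for, so one lookup replaces many.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class DenormalizedStore {
    // In-memory stand-in for a column family: row key -> (column name -> value).
    private final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    /** Hypothetical key scheme: "<query name>:<predicate value>". */
    static String rowKey(String queryName, String predicateValue) {
        return queryName + ":" + predicateValue;
    }

    /** Write path: store the record under the key the read path will ask for. */
    void put(String queryName, String predicateValue, String column, String value) {
        rows.computeIfAbsent(rowKey(queryName, predicateValue), k -> new TreeMap<>())
            .put(column, value);
    }

    /** Read path: a single lookup returns every record matching the predicate. */
    SortedMap<String, String> get(String queryName, String predicateValue) {
        SortedMap<String, String> row = rows.get(rowKey(queryName, predicateValue));
        return row == null ? Collections.<String, String>emptySortedMap() : row;
    }
}
```

For example, after `put("users_by_city", "Boston", "alice", ...)` and `put("users_by_city", "Boston", "bob", ...)`, a single `get("users_by_city", "Boston")` returns both columns. The cost is that every record is written once per query path it has to serve.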


On Jun 11, 2010, at 9:49 PM, "caribbean410" <caribbean410@gmail.com> wrote:

Thanks for the suggestion. For the test case, it is 1 key and 1 column. I once changed 10 to 1 and, as I remember, there was not much difference.

I have 200k keys and each key is randomly generated. I will try the optimized query next week. But maybe you still have to face the case where each time a client just wants to query one key from the db.

 

From: Dop Sun [mailto:sunht@dopsun.com]
Sent: Friday, June 11, 2010 6:05 PM
To: user@cassandra.apache.org
Subject: RE: read operation is slow

 

Also, are you selecting only 1 key and 10 columns?

criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst, nameFirst, 10);

If so, with 200k keys you make 200k Thrift calls. In that case, you may need to optimize the way you do the query (combining multiple keys into a single query) to reduce the number of calls.
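
The "combine multiple keys into a single query" suggestion above could be sketched as follows. This is only the batching half: `KeyBatcher` and `partition` are hypothetical names, and the sketch assumes (without confirmation from this thread) that `criteria.keyList(...)` accepts a list with more than one key, since the code in the thread only ever passes a one-element list.

```java
import java.util.ArrayList;
import java.util.List;

final class KeyBatcher {
    /** Splits keys into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < keys.size(); from += batchSize) {
            int to = Math.min(from + batchSize, keys.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(keys.subList(from, to)));
        }
        return batches;
    }
}
```

Each batch would then be handed to one `keyList(...)`/`select()` round trip instead of one round trip per key. With 200k keys and a batch size of 100, that is 2,000 calls rather than 200,000. Guava's `Lists.partition` does the same splitting if that library is already on the classpath.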

 

From: Dop Sun [mailto:sunht@dopsun.com]
Sent: Saturday, June 12, 2010 8:57 AM
To: user@cassandra.apache.org
Subject: RE: read operation is slow

 

You mean that after you “I remove some unnecessary column family and change the size of rowcache and keycache, now the latency changes from 0.25ms to 0.09ms. In essence 0.09ms*200k=18s.”, it still takes 400 seconds to return?

 

From: Caribbean410 [mailto:caribbean410@gmail.com]
Sent: Saturday, June 12, 2010 8:48 AM
To: user@cassandra.apache.org
Subject: Re: read operation is slow

 

Hi, do you mean this one should not introduce much extra delay? To read a record I need the select here, so I am not sure where the extra delay comes from.

On Fri, Jun 11, 2010 at 5:29 PM, Dop Sun <sunht@dopsun.com> wrote:

Jassandra is used here:

Map<String, List<IColumn>> map = criteria.select();

The select here is basically a call to the Thrift API: get_range_slices

 

 

From: Caribbean410 [mailto:caribbean410@gmail.com]
Sent: Saturday, June 12, 2010 8:00 AM
To: user@cassandra.apache.org
Subject: Re: read operation is slow

 

I removed some unnecessary column families and changed the sizes of the row cache and key cache; the latency dropped from 0.25ms to 0.09ms. In essence 0.09ms*200k=18s. I don't know why it takes more than 400s in total. Here is the client code and cfstats. There are not many operations here, so why is the extra time so large?



        long start = System.currentTimeMillis();
        for (int j = 0; j < 1; j++) {
            for (int i = 0; i < numOfRecords; i++) {
                int n = random.nextInt(numOfRecords);
                ICriteria criteria = cf.createCriteria();
                userName = keySet[n];
                criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst, nameFirst, 10);
                Map<String, List<IColumn>> map = criteria.select();
                List<IColumn> list = map.get(userName);
//              ByteArray bloc = list.get(0).getValue();
//              byte[] byteArrayloc = bloc.toByteArray();
//              loc = new String(byteArrayloc);
//              readBytes = readBytes + loc.length();
                readBytes = readBytes + blobSize;
            }
        }

        long finish = System.currentTimeMillis();

        // Divide by 1000f so the result is not truncated by integer division.
        float totalTime = (finish - start) / 1000f;


Keyspace: Keyspace1
    Read Count: 600000
    Read Latency: 0.09053006666666667 ms.
    Write Count: 200000
    Write Latency: 0.01504989 ms.
    Pending Tasks: 0
        Column Family: Standard2
        SSTable count: 3
        Space used (live): 265990358
        Space used (total): 265990358
        Memtable Columns Count: 2615
        Memtable Data Size: 2667300
        Memtable Switch Count: 3
        Read Count: 600000
        Read Latency: 0.091 ms.
        Write Count: 200000
        Write Latency: 0.015 ms.
        Pending Tasks: 0
        Key cache capacity: 10000000
        Key cache size: 187465
        Key cache hit rate: 0.0
        Row cache capacity: 10000000
        Row cache size: 189990
        Row cache hit rate: 0.68335
        Compacted row minimum size: 0
        Compacted row maximum size: 0
        Compacted row mean size: 0

----------------
Keyspace: system
    Read Count: 1
    Read Latency: 10.954 ms.
    Write Count: 4
    Write Latency: 0.28075 ms.
    Pending Tasks: 0
        Column Family: HintsColumnFamily
        SSTable count: 0
        Space used (live): 0
        Space used (total): 0
        Memtable Columns Count: 0
        Memtable Data Size: 0
        Memtable Switch Count: 0
        Read Count: 0
        Read Latency: NaN ms.
        Write Count: 0
        Write Latency: NaN ms.
        Pending Tasks: 0
        Key cache capacity: 1
        Key cache size: 0
        Key cache hit rate: NaN
        Row cache: disabled
        Compacted row minimum size: 0
        Compacted row maximum size: 0
        Compacted row mean size: 0

        Column Family: LocationInfo
        SSTable count: 2
        Space used (live): 3232
        Space used (total): 3232
        Memtable Columns Count: 2
        Memtable Data Size: 46
        Memtable Switch Count: 1
        Read Count: 1
        Read Latency: 10.954 ms.
        Write Count: 4
        Write Latency: 0.281 ms.
        Pending Tasks: 0
        Key cache capacity: 1
        Key cache size: 1
        Key cache hit rate: 0.0
        Row cache: disabled
        Compacted row minimum size: 0
        Compacted row maximum size: 0
        Compacted row mean size: 0

----------------
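
The arithmetic in the message above (0.09 ms per read inside Cassandra versus more than 400 s observed by the client) can be made explicit with a small sketch. The class and method names are hypothetical, and it assumes the 400 s figure covers the 200k reads of a single test run, taking the rounded 0.09 ms read latency from cfstats.

```java
final class LatencyBudget {
    /** Total seconds spent inside the server for `reads` reads at `latencyMs` each. */
    static double serverSeconds(long reads, double latencyMs) {
        return reads * latencyMs / 1000.0;
    }

    /** Average wall-clock milliseconds per read as seen by the client. */
    static double perReadMs(double totalSeconds, long reads) {
        return totalSeconds * 1000.0 / reads;
    }

    public static void main(String[] args) {
        long reads = 200_000;
        double server = serverSeconds(reads, 0.09);   // about 18 s inside Cassandra
        double perRead = perReadMs(400.0, reads);     // 2 ms per read end to end
        double overhead = perRead - 0.09;             // time per read spent outside the server
        System.out.printf("server total: %.1f s, per read: %.2f ms, outside server: %.2f ms%n",
                server, perRead, overhead);
    }
}
```

Under these assumptions, roughly 1.9 ms of every 2 ms read is spent outside Cassandra's own read path (client library, Thrift serialization, socket round trip), which is why the later replies in this thread focus on reducing the number of calls.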

On Fri, Jun 11, 2010 at 1:50 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

you need to look at cfstats to see what the latency is internal to
cassandra, vs what your client is introducing

then you should probably read the comments in the configuration file
about caching


On Fri, Jun 11, 2010 at 9:38 AM, Caribbean410 <caribbean410@gmail.com> wrote:
>
> Thanks Riyad.
>
> Right now I am just testing Cassandra on a single node. The server and client
> are running on the same machine. I tried the read test again on two
> machines; on one machine the CPU usage is around 30% most of the time and
> on the other it is 90%.
>
> Pelops is one way to access Cassandra; there are also other Java clients like
> Hector and Jassandra. Will these Java clients have significantly different
> performance?
>
> I also tried changing the storage configuration file: moving
> CommitLogDirectory and DataFileDirectory to different disks, changing
> DiskAccessMode to mmap for a 64-bit machine, and changing ConcurrentReads from
> 8 to 2. None of these changed performance much.
>
> For other users who use a different access client, such as PHP, C++,
> Python, etc., if you have any experience in boosting read performance,
> you are more than welcome to share it with me. Thanks,
>
> On Fri, Jun 11, 2010 at 8:19 AM, Riyad Kalla <rkalla@gmail.com> wrote:
>>
>> Caribbean410,
>>
>> This comes up on the Redis list a lot as well -- what you are actually
>> measuring is the client sending a request over the network to the Cassandra server and
>> it replying -- so the performance numbers you are getting can easily be 70%
>> network wait time and not necessarily hardcore read/write server
>> performance.
>> One way to see if this is the case: run your read test, then watch the CPU
>> on the server for the Cassandra process and see if it's pegging the CPU --
>> if it's just sitting there bouncing between 0-10%, then you are spending most
>> of your time waiting on network I/O (opening/closing sockets, etc.).
>> If you can parallelize your test to spawn, say, 5 threads that all do the
>> same thing, see if the overall throughput scales linearly with the thread count --
>> which would indicate Cassandra is plenty fast in your setup and you just need
>> to utilize more client threads over the network.
>> That new Java library, Pelops by Dominic
>> (http://ria101.wordpress.com/2010/06/11/pelops-the-beautiful-cassandra-database-client-for-java/),
>> has a nice intrinsic node-balancing design that could be handy IF you are
>> using multiple nodes. If you are just testing against 1 node, then spawn
>> multiple threads of your code above and see how each thread's performance
>> scales.
>> -R
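
The multi-threaded test suggested above might look something like the sketch below. It is written against a stand-in `reader` function instead of a real client so that it is self-contained. `ParallelReadTest`, `run`, and `reader` are hypothetical names. In a real test each worker would most likely need its own connection, since Thrift clients are generally not safe to share between threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

final class ParallelReadTest {
    /**
     * Starts `threads` workers that each read every key once via `reader`
     * (which returns the number of bytes read) and returns the total bytes read.
     */
    static long run(int threads, List<String> keys, ToLongFunction<String> reader) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Long> worker = () -> {
                    long bytes = 0;
                    for (String key : keys) {
                        bytes += reader.applyAsLong(key);
                    }
                    return bytes;
                };
                results.add(pool.submit(worker));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();   // blocks until that worker has finished
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("read worker failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Timing `run(1, ...)`, `run(5, ...)` and so on around the same key set shows whether total throughput grows with the thread count. If it does, the single-threaded figure was dominated by round-trip wait and not by server-side read cost.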
>> On Thu, Jun 10, 2010 at 2:39 PM, Caribbean410 <caribbean410@gmail.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> I am testing the performance of Cassandra. We write 200k records to
>>> the database and each record is 1k in size. Then we read these 200k records.
>>> It takes more than 400s to finish the read, which is much slower than
>>> MySQL (around 20s). I read some discussion online and someone suggested
>>> making multiple connections to make it faster. But I am not sure how
>>> to do it: do I need to change my storage settings file or just change
>>> the Java client code?
>>>
>>> Here is my read code,
>>>
>>>     Properties info = new Properties();
>>>     info.put(DriverManager.CONSISTENCY_LEVEL,
>>>             ConsistencyLevel.ONE.toString());
>>>
>>>     IConnection connection = DriverManager.getConnection(
>>>             "thrift://localhost:9160", info);
>>>
>>>     // 2. Get a KeySpace by name
>>>     IKeySpace keySpace = connection.getKeySpace("Keyspace1");
>>>
>>>     // 3. Get a ColumnFamily by name
>>>     IColumnFamily cf = keySpace.getColumnFamily("Standard2");
>>>
>>>     ByteArray nameFirst = ByteArray.ofASCII("first");
>>>     ICriteria criteria = cf.createCriteria();
>>>     long readBytes = 0;
>>>     long start = System.currentTimeMillis();
>>>     for (int i = 0; i < numOfRecords; i++) {
>>>         int n = random.nextInt(numOfRecords);
>>>         userName = keySet[n];
>>>
>>>         criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst, nameFirst, 10);
>>>         Map<String, List<IColumn>> map = criteria.select();
>>>         List<IColumn> list = map.get(userName);
>>>         ByteArray bloc = list.get(0).getValue();
>>>         byte[] byteArrayloc = bloc.toByteArray();
>>>         loc = new String(byteArrayloc);
>>>         // System.out.println(userName + " " + loc);
>>>         readBytes = readBytes + loc.length();
>>>     }
>>>
>>>     long finish = System.currentTimeMillis();
>>>
>>> I once commented out these lines:
>>>
>>>         ByteArray bloc = list.get(0).getValue();
>>>         byte[] byteArrayloc = bloc.toByteArray();
>>>         loc = new String(byteArrayloc);
>>>         // System.out.println(userName + " " + loc);
>>>         readBytes = readBytes + loc.length();
>>>
>>> And the performance doesn't improve much.
>>>
>>> Any suggestion is welcome. Thanks,
>
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

 

 
