incubator-cassandra-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Suhail Doshi <digitalwarf...@gmail.com>
Subject Re: multiget_slice
Date Thu, 14 Jan 2010 17:27:23 GMT
Yeah, I think you're right:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz
avgqu-sz   await  svctm  %util
sda1             34.00    70.00  409.60   11.20 22596.80   649.60    55.24
 32.61   77.67   2.38 100.00
sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00
  0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.25    0.00    0.55   39.72    0.90   56.58

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz
avgqu-sz   await  svctm  %util
sda1             39.60    79.20  357.40   10.00 20022.40   713.60    56.44
 30.04   81.59   2.70  99.12
sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00
  0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.70    0.00    0.40   38.80    0.90   58.20

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz
avgqu-sz   await  svctm  %util
sda1             43.40    72.00  421.00    9.00 23248.00   648.00    55.57
 32.64   75.74   2.33 100.00
sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00
  0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.00    0.00    0.40   37.41    1.00   59.19

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz
avgqu-sz   await  svctm  %util
sda1             19.60    58.40  348.20    6.60 19691.20   520.00    56.97
 30.88   86.55   2.78  98.80
sda2              1.40     0.00    0.20    0.00    12.80     0.00    64.00
  0.00   20.00  20.00   0.40

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.65    0.00    0.65   38.32    0.90   57.48

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz
avgqu-sz   await  svctm  %util
sda1             34.00    61.80  424.40   10.60 24348.80   579.20    57.31
 33.25   76.89   2.30 100.00
sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00
  0.00    0.00   0.00   0.00

On Thu, Jan 14, 2010 at 10:22 AM, Jonathan Ellis <jbellis@gmail.com> wrote:

> it sounds like you just don't have enough ram for the OS to cache your
> "hot" data set so you are getting killed on disk seeks.  iostat -x 5
> (for example) during load should verify this.
>
> On Thu, Jan 14, 2010 at 11:19 AM, Suhail Doshi <digitalwarfare@gmail.com>
> wrote:
> > Looking at my data directory: 14 G. Just Index.db based files: 4.5 G.
> >
> > Yes only one node so far.
> >
> > vmstat -n 1 -S m
> > procs -----------memory---------- ---swap-- -----io---- -system--
> > ----cpu----
> >  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy
> id
> > wa
> >  0  0     22    585     32   2557    0    0    71    31    5    5  1  0
> 94
> >  3
> >
> > On Thu, Jan 14, 2010 at 10:11 AM, Jonathan Ellis <jbellis@gmail.com>
> wrote:
> >
> >> how much data do you have on disk?  (only on enode?)  how large are
> >> the columns you are reading?  how much ram does vmstat say is being
> >> used for cache?
> >>
> >> On Thu, Jan 14, 2010 at 11:06 AM, Suhail Doshi <
> digitalwarfare@gmail.com>
> >> wrote:
> >> > Right now it's ~5-10 keys, with 5 columns per key.
> >> >
> >> > Later it will be 64 keys (max) with 200 columns per key worst case.
> >> >
> >> > Suhail
> >> >
> >> > On Thu, Jan 14, 2010 at 9:45 AM, Jonathan Ellis <jbellis@gmail.com>
> >> wrote:
> >> >
> >> >> how many keys are you fetching?  how many columns for each key?
> >> >>
> >> >> On Thu, Jan 14, 2010 at 1:49 AM, Suhail Doshi <suhail@mixpanel.com>
> >> wrote:
> >> >> > I've been seeing multiget_slice take an extremely long time:
> >> >> >
> >> >> > 2010-01-14 07:44:00,513 INFO ------------------ Cassandra, delay:
> >> >> > 3.64020800591 -----------------------
> >> >> > 2010-01-14 07:44:00,513 INFO method: multiget_slice
> >> >> > 2010-01-14 07:44:00,513 INFO {'keys':
> >> >> >
> >> >>
> >>
> [u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a93ec971e867b23664d990336ce481e0:7516fd43adaa5e0b8a65a672c39845d2',
> >> >> >
> >> >>
> >>
> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:fe33779b0db3213f7e354c8e22ad9939:4df200d45716195e86c09a94a54a0c7a',
> >> >> >
> >> >>
> >>
> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:71860c77c6745379b0d44304d66b6a13:e37f0136aa3ffaf149b351f6a4c948e9',
> >> >> >
> >> >>
> >>
> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:1240f61999709d41292f759e500ad5be:69691c7bdcc3ce6d5d8a1361f22d04ac',
> >> >> >
> >> >>
> >>
> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a6d5b5c3d715b79b59caf7aed18301ac:b53b3a3d6ab90ce0268229151c9bde11'],
> >> >> > 'column_parent': ColumnParent(column_family='DistinctIndex',
> >> >> > super_column=None), 'predicate': SlicePredicate(column_names=None,
> >> >> > slice_range=SliceRange(count=14000, start='date_2009-07-01',
> >> >> reversed=False,
> >> >> > finish='date_2010-01-14'))}
> >> >> >
> >> >> > 2010-01-14 07:44:00,513 INFO result:
> >> >> >
> >> >> > 2010-01-14 07:44:00,513 INFO
> >> >> >
> >> >>
> >>
> {'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:fe33779b0db3213f7e354c8e22ad9939:4df200d45716195e86c09a94a54a0c7a':
> >> >> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323,
> >> >> > name='date_2010-01-11', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263333256,
> >> >> > name='date_2010-01-12', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263418556,
> >> >> > name='date_2010-01-13', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263451804,
> >> >> > name='date_2010-01-14', value='1'), super_column=None)],
> >> >> >
> >> >>
> >>
> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:71860c77c6745379b0d44304d66b6a13:e37f0136aa3ffaf149b351f6a4c948e9':
> >> >> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323,
> >> >> > name='date_2010-01-11', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263333256,
> >> >> > name='date_2010-01-12', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263418556,
> >> >> > name='date_2010-01-13', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263451804,
> >> >> > name='date_2010-01-14', value='1'), super_column=None)],
> >> >> >
> >> >>
> >>
> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a6d5b5c3d715b79b59caf7aed18301ac:b53b3a3d6ab90ce0268229151c9bde11':
> >> >> > [ColumnOrSuperColumn(column=Column(timestamp=1263333256,
> >> >> > name='date_2010-01-12', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263418556,
> >> >> > name='date_2010-01-13', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263451804,
> >> >> > name='date_2010-01-14', value='1'), super_column=None)],
> >> >> >
> >> >>
> >>
> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a93ec971e867b23664d990336ce481e0:7516fd43adaa5e0b8a65a672c39845d2':
> >> >> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323,
> >> >> > name='date_2010-01-11', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263333256,
> >> >> > name='date_2010-01-12', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263418556,
> >> >> > name='date_2010-01-13', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263451804,
> >> >> > name='date_2010-01-14', value='1'), super_column=None)],
> >> >> >
> >> >>
> >>
> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:1240f61999709d41292f759e500ad5be:69691c7bdcc3ce6d5d8a1361f22d04ac':
> >> >> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323,
> >> >> > name='date_2010-01-11', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263333256,
> >> >> > name='date_2010-01-12', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263418556,
> >> >> > name='date_2010-01-13', value='1'), super_column=None),
> >> >> > ColumnOrSuperColumn(column=Column(timestamp=1263451804,
> >> >> > name='date_2010-01-14', value='1'), super_column=None)]}
> >> >> >
> >> >> >
> >> >> > The delay is the time at which it took to run the query and then
> >> return a
> >> >> > result. The box has 4GB of RAM and the *JVM_MAX_MEM (-Xmx) is
set
> at
> >> 3G*.
> >> >> If
> >> >> > you're curious how I am running it:
> >> >> >
> >> >> > /usr/bin/jsvc -home /usr/lib/jvm/java-6-openjdk/jre -pidfile
> >> >> > /var/run/cassandra.pid -errfile &1 -outfile
> >> /var/log/cassandra/output.log
> >> >> > -cp
> >> >> >
> >> >>
> >>
> /usr/share/cassandra/antlr-3.1.3.jar:/usr/share/cassandra/apache-cassandra-incubating-0.5.0-rc1.jar:/usr/share/cassandra/apache-cassandra-incubating.jar:/usr/share/cassandra/clhm-production.jar:/usr/share/cassandra/commons-cli-1.1.jar:/usr/share/cassandra/commons-collections-3.2.1.jar:/usr/share/cassandra/commons-lang-2.4.jar:/usr/share/cassandra/google-collect-1.0-rc1.jar:/usr/share/cassandra/high-scale-lib.jar:/usr/share/cassandra/jline-0.9.94.jar:/usr/share/cassandra/json_simple-1.1.jar:/usr/share/cassandra/junit-4.6.jar:/usr/share/cassandra/libthrift-r894924.jar:/usr/share/cassandra/log4j-1.2.15.jar:/usr/share/cassandra/slf4j-api-1.5.8.jar:/usr/share/cassandra/slf4j-log4j12-1.5.8.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar
> >> >> > -Xmx3G -Xms128M -Dcassandra -Dstorage-config=/etc/cassandra
> >> >> > -Dcom.sun.management.jmxremote.port=8080
> >> >> > -Dcom.sun.management.jmxremote.ssl=false
> >> >> > -Dcom.sun.management.jmxremote.authenticate=false
> >> >> > org.apache.cassandra.service.CassandraDaemon
> >> >> >
> >> >> > I am running version* 0.5.0rc2*.
> >> >> >
> >> >> > Does anyone know what the bottleneck might be and how reads using
> >> >> > multiget_slice can be sped up? When I look at the memory used,
it's
> >> only
> >> >> > about 1700 MB used. The box is not excessively swapping, running
> >> >> > iostat--everything seemed pretty okay.
> >> >> >
> >> >> > Suhail
> >> >> >
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > http://mixpanel.com
> >> > Blog: http://blog.mixpanel.com
> >> >
> >>
> >
> >
> >
> > --
> > http://mixpanel.com
> > Blog: http://blog.mixpanel.com
> >
>



-- 
http://mixpanel.com
Blog: http://blog.mixpanel.com

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message