incubator-cassandra-dev mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: multiget_slice
Date Thu, 14 Jan 2010 17:11:24 GMT
how much data do you have on disk?  (only one node?)  how large are
the columns you are reading?  how much ram does vmstat say is being
used for cache?
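
For reference, the cache figure asked about here can be pulled from vmstat output programmatically. The following is a minimal Python 3 sketch, not something from the thread: the function name `cache_kb_from_vmstat` and the `SAMPLE` text are hypothetical, and it assumes the usual Linux procps layout in which one header row names the columns (including `cache`) and the last row is the most recent sample.

```python
def cache_kb_from_vmstat(text):
    """Return the 'cache' column of the last sample in vmstat output.

    vmstat reports memory columns in KB by default on Linux.
    """
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    # The column-name row is the one containing the bare word "cache".
    header = next(row for row in rows if "cache" in row)
    cache_index = header.index("cache")
    # The last row is the most recent sample.
    return int(rows[-1][cache_index])


# Hypothetical sample output, for illustration only (not data from this thread).
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 812340  40212 1250000    0    0    12    30  200  400  3  1 95  1
"""

print(cache_kb_from_vmstat(SAMPLE), "KB used for page cache")
```

On a live box the same function could be fed `subprocess.run(["vmstat"], capture_output=True, text=True).stdout`. A small cache figure next to a large on-disk data size would suggest reads are going to disk rather than being served from memory.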

On Thu, Jan 14, 2010 at 11:06 AM, Suhail Doshi <digitalwarfare@gmail.com> wrote:
> Right now it's ~5-10 keys, with 5 columns per key.
>
> Later it will be 64 keys (max) with 200 columns per key worst case.
>
> Suhail
>
> On Thu, Jan 14, 2010 at 9:45 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>
>> how many keys are you fetching?  how many columns for each key?
>>
>> On Thu, Jan 14, 2010 at 1:49 AM, Suhail Doshi <suhail@mixpanel.com> wrote:
>> > I've been seeing multiget_slice take an extremely long time:
>> >
>> > 2010-01-14 07:44:00,513 INFO ------------------ Cassandra, delay:
>> > 3.64020800591 -----------------------
>> > 2010-01-14 07:44:00,513 INFO method: multiget_slice
>> > 2010-01-14 07:44:00,513 INFO {'keys':
>> >
>> [u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a93ec971e867b23664d990336ce481e0:7516fd43adaa5e0b8a65a672c39845d2',
>> >
>> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:fe33779b0db3213f7e354c8e22ad9939:4df200d45716195e86c09a94a54a0c7a',
>> >
>> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:71860c77c6745379b0d44304d66b6a13:e37f0136aa3ffaf149b351f6a4c948e9',
>> >
>> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:1240f61999709d41292f759e500ad5be:69691c7bdcc3ce6d5d8a1361f22d04ac',
>> >
>> u'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a6d5b5c3d715b79b59caf7aed18301ac:b53b3a3d6ab90ce0268229151c9bde11'],
>> > 'column_parent': ColumnParent(column_family='DistinctIndex',
>> > super_column=None), 'predicate': SlicePredicate(column_names=None,
>> > slice_range=SliceRange(count=14000, start='date_2009-07-01',
>> > reversed=False, finish='date_2010-01-14'))}
>> >
>> > 2010-01-14 07:44:00,513 INFO result:
>> >
>> > 2010-01-14 07:44:00,513 INFO
>> >
>> {'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:fe33779b0db3213f7e354c8e22ad9939:4df200d45716195e86c09a94a54a0c7a':
>> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323,
>> > name='date_2010-01-11', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263333256,
>> > name='date_2010-01-12', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263418556,
>> > name='date_2010-01-13', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263451804,
>> > name='date_2010-01-14', value='1'), super_column=None)],
>> >
>> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:71860c77c6745379b0d44304d66b6a13:e37f0136aa3ffaf149b351f6a4c948e9':
>> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323,
>> > name='date_2010-01-11', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263333256,
>> > name='date_2010-01-12', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263418556,
>> > name='date_2010-01-13', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263451804,
>> > name='date_2010-01-14', value='1'), super_column=None)],
>> >
>> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a6d5b5c3d715b79b59caf7aed18301ac:b53b3a3d6ab90ce0268229151c9bde11':
>> > [ColumnOrSuperColumn(column=Column(timestamp=1263333256,
>> > name='date_2010-01-12', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263418556,
>> > name='date_2010-01-13', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263451804,
>> > name='date_2010-01-14', value='1'), super_column=None)],
>> >
>> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:a93ec971e867b23664d990336ce481e0:7516fd43adaa5e0b8a65a672c39845d2':
>> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323,
>> > name='date_2010-01-11', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263333256,
>> > name='date_2010-01-12', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263418556,
>> > name='date_2010-01-13', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263451804,
>> > name='date_2010-01-14', value='1'), super_column=None)],
>> >
>> 'property:1558:1f0351b7f85b4aa070548e5fd5e08ddf:fce1eab4411d5df240d93ff334f15385:1240f61999709d41292f759e500ad5be:69691c7bdcc3ce6d5d8a1361f22d04ac':
>> > [ColumnOrSuperColumn(column=Column(timestamp=1263231323,
>> > name='date_2010-01-11', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263333256,
>> > name='date_2010-01-12', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263418556,
>> > name='date_2010-01-13', value='1'), super_column=None),
>> > ColumnOrSuperColumn(column=Column(timestamp=1263451804,
>> > name='date_2010-01-14', value='1'), super_column=None)]}
>> >
>> >
>> > The delay is the time it took to run the query and return a result.
>> > The box has 4GB of RAM and the *JVM_MAX_MEM (-Xmx) is set at 3G*. If
>> > you're curious how I am running it:
>> >
>> > /usr/bin/jsvc -home /usr/lib/jvm/java-6-openjdk/jre -pidfile
>> > /var/run/cassandra.pid -errfile &1 -outfile /var/log/cassandra/output.log
>> > -cp
>> >
>> /usr/share/cassandra/antlr-3.1.3.jar:/usr/share/cassandra/apache-cassandra-incubating-0.5.0-rc1.jar:/usr/share/cassandra/apache-cassandra-incubating.jar:/usr/share/cassandra/clhm-production.jar:/usr/share/cassandra/commons-cli-1.1.jar:/usr/share/cassandra/commons-collections-3.2.1.jar:/usr/share/cassandra/commons-lang-2.4.jar:/usr/share/cassandra/google-collect-1.0-rc1.jar:/usr/share/cassandra/high-scale-lib.jar:/usr/share/cassandra/jline-0.9.94.jar:/usr/share/cassandra/json_simple-1.1.jar:/usr/share/cassandra/junit-4.6.jar:/usr/share/cassandra/libthrift-r894924.jar:/usr/share/cassandra/log4j-1.2.15.jar:/usr/share/cassandra/slf4j-api-1.5.8.jar:/usr/share/cassandra/slf4j-log4j12-1.5.8.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar
>> > -Xmx3G -Xms128M -Dcassandra -Dstorage-config=/etc/cassandra
>> > -Dcom.sun.management.jmxremote.port=8080
>> > -Dcom.sun.management.jmxremote.ssl=false
>> > -Dcom.sun.management.jmxremote.authenticate=false
>> > org.apache.cassandra.service.CassandraDaemon
>> >
>> > I am running version *0.5.0rc2*.
>> >
>> > Does anyone know what the bottleneck might be and how reads using
>> > multiget_slice can be sped up? When I look at the memory used, it's only
>> > about 1700 MB used. The box is not swapping excessively, and when I
>> > ran iostat everything seemed pretty okay.
>> >
>> > Suhail
>> >
>>
>
>
>
> --
> http://mixpanel.com
> Blog: http://blog.mixpanel.com
>
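
The request sizes mentioned in the thread can be checked with a little arithmetic. This is a rough Python 3 sketch, assuming (it is not stated in the thread) that each key holds at most one column per calendar day, named `date_YYYY-MM-DD` as in the logged result. The helper names `days_in_range` and `worst_case_columns` are hypothetical.

```python
from datetime import date


def days_in_range(start, finish):
    """Number of calendar days from start to finish, both inclusive."""
    return (finish - start).days + 1


def worst_case_columns(num_keys, columns_per_key):
    """Upper bound on columns returned by one multi-key slice request."""
    return num_keys * columns_per_key


# Slice bounds from the logged request: date_2009-07-01 .. date_2010-01-14.
span = days_in_range(date(2009, 7, 1), date(2010, 1, 14))
print("days covered by the slice range:", span)

# Current and projected request sizes quoted in the thread.
print("current upper bound:", worst_case_columns(10, 5), "columns")
print("projected worst case:", worst_case_columns(64, 200), "columns")
```

Under that one-column-per-day assumption the slice range covers 198 days, which lines up with the "200 columns per key worst case" estimate, so the `count=14000` limit would not be what bounds the result. The projected worst case is 12,800 small columns per request, while the logged call that took 3.6 seconds returned only 19 columns across 5 keys.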
