incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sylvain Lebresne <sylv...@yakaz.com>
Subject Re: Range scan performance in 0.6.0 beta2
Date Thu, 25 Mar 2010 14:17:14 GMT
I don't know If that could play any role, but if ever you have
disabled the assertions
when running cassandra (that is, you removed the -ea line in
cassandra.in.sh), there
was a bug in 0.6beta2 that will make read in row with lots of columns
quite slow.

Another problem you may have is if you have the commitLog directory on the same
hard drive than the data directory. If that's the case and you read
and write at the
same time, that may be a reason for poor read performances (and write too).

As for the row with 30 millions columns, you have to be aware that right now,
cassandra will deserialize whole rows during compaction
(http://wiki.apache.org/cassandra/CassandraLimitations).
So depending on the size of what you store in you column, you could
very well hit
that limitation (that could be why you OOM). In which case, I see two choices:
1) add more RAM to the machine or 2) change your data structure to
avoid that (maybe
can you split rows with too many columns somehow ?).

--
Sylvain

On Thu, Mar 25, 2010 at 2:33 PM, Henrik Schröder <skrolle@gmail.com> wrote:
> Hi everyone,
>
> We're trying to implement a virtual datastore for our users where they can
> set up "tables" and "indexes" to store objects and have them indexed on
> arbitrary properties. And we did a test implementation for Cassandra in the
> following way:
>
> Objects are stored in one columnfamily, each key is made up of tableid +
> "object key", and each row has one column where the value is the serialized
> object. This part is super-simple, we're just using Cassandra as a
> key-value-store, and this part performs really well.
>
> The indexes are a bit tricker, but basically for each index and each object
> that is stored, we compute a fixed-length bytearray based on the object that
> make up the indexvalue. We then store these bytearray indexvalues in another
> columnfamily, with the indexid as row key, the indexvalue as the column
> name, and the object key as the column value.
>
> The idea is then that to perform a range query on an "index" in this virtual
> datastore, we do a get_slice to get the range of indexvalues and their
> corresponding object keys, and we can then multi_get the actual objects from
> the other column family.
>
> Since these virtual tables and indexes will be created by our users the
> whole system has to be very dynamic, and we can't make any assumptions about
> the actual objects they will store and the distribution of these. We do know
> that it must be able to scale well, and this is what attracted us to
> Cassandra in the first place. We do however have some performance targets we
> want to hit, we have one use-case where there will be about 30 million
> records in a "table", and knowing that it can go up to 100 million records
> would be nice. As for speed, would like to get thousands of writes and range
> reads per second.
>
> Given these requirements and our design, we will then have rows in Cassandra
> with millions of columns, from which we want to fetch large column slices.
> We set it all up on a single developer machine (MacPro, QuadCore 2.66ghz)
> running Windows, and we used the thrift compiler to generate a C# client
> library. We tested just the "index" part of our design, and these are the
> numbers we got:
> inserts (15 threads, batches of 10): 4000/second
> get_slices (10 threads, random range sizes, count 1000): 50/second at start,
> dies at about 6 million columns inserted. (OutOfMemoryException)
> get_slices (10 threads, random range sizes, count 10): 200/s at start, slows
> down the more columns there are.
>
>
> When we saw that the above results were bad, we tried a different approach
> storing the indexvalues in the key instead, using the
> OrderPreservingPartitioner and using get_range_slice to get ranges of rows,
> but we got even worse results:
> inserts (15 threads, in batches of 10): 4000/second
> get_range_slice (10 threads, random key ranges, count 1000): 20/second at
> start, 5/second with 30 million rows
>
>
> Finally, we did a similar test using MySQL instead and then we got these
> numbers:
> inserts (15 threads, in batches of 10): 4000/second
> select (10 threads, limit 1000): 500/second
>
> So for us, the MySQL version delivers the speed that we want, but none of
> the scaling that Cassandra gives us. We set up our columnfamilies like this:
>
> <ColumnFamily CompareWith="BytesType" Name="Objects" RowsCached="0"
> KeysCached="0"/>
> <ColumnFamily CompareWith="BytesType" Name="Indexes" RowsCached="0"
> KeysCached="0"/>
>
> And we now have these questions:
> a) Is there a better way of structuring our data and building the virtual
> indexes?
> b) Are our Cassandra numbers too low? Or is this the expected performance?
> c) Did we miss to change some important setting (in the conf xml or java
> config) since our rows are this large?
> d) Can we avoid hitting the Out of memory exception?
>
>
> /Henrik
>

Mime
View raw message