incubator-cassandra-user mailing list archives

From Carlos Sanchez <carlos.sanc...@riskmetrics.com>
Subject Insertion time question
Date Tue, 30 Mar 2010 23:16:28 GMT
I was wondering if I could get a bit more insight into why we are seeing different insertion
times between regular column families and super columns.

We have group objects, each with a name, that may have a series of attributes (name/value pairs).
There can be up to a million group objects, and different groups can share several attributes.
In our first design we used a super column, with the column path

        ColumnPath ("Index", [attribute value], [group name])

and the row key is the attribute name. The value we are inserting is an empty byte array.

In the second design we simplified our model to a regular column family, with the column path

        ColumnPath ("Index", null, [group name])

and the row key is simply the attribute name concatenated with the attribute value. The value
inserted is again an empty byte array.
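
To make the difference between the two layouts concrete, here is a minimal sketch that models each design as nested Python dicts. It is only an illustration of the data shapes, not Cassandra client code: the function names, the sample attribute and group names, and the ":" separator used when concatenating the row key are all assumptions made for the example.

```python
# Illustration only: plain dicts stand in for the "Index" column family.
# Nothing here calls Cassandra; all names and the separator are hypothetical.

EMPTY = b""  # the value inserted in both designs is an empty byte array


def insert_super(store, attr_name, attr_value, group_name):
    """Design 1: row key = attribute name,
    super column = attribute value, column = group name."""
    row = store.setdefault(attr_name, {})
    super_column = row.setdefault(attr_value, {})
    super_column[group_name] = EMPTY


def insert_standard(store, attr_name, attr_value, group_name, sep=":"):
    """Design 2: row key = attribute name + attribute value,
    column = group name (no super column)."""
    row_key = attr_name + sep + attr_value  # separator is an assumption
    row = store.setdefault(row_key, {})
    row[group_name] = EMPTY


super_store = {}
standard_store = {}
samples = [
    ("color", "red", "group-1"),
    ("color", "red", "group-2"),
    ("color", "blue", "group-3"),
]
for attr_name, attr_value, group_name in samples:
    insert_super(super_store, attr_name, attr_value, group_name)
    insert_standard(standard_store, attr_name, attr_value, group_name)

print(len(super_store))     # rows in design 1
print(len(standard_store))  # rows in design 2
```

In this sketch the first design puts every value of an attribute into a single row, while the second spreads the same data over one row per attribute name/value pair.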

In the first case, inserting 250K groups took about 1.5 hours; in the second case it took
45 minutes. In both tests we started Cassandra with no data, using OPP on two nodes
(each 16 cores, 64 GB).
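
For a rough sense of scale, the timings above can be converted into group-level throughput. This is a back-of-the-envelope sketch that uses only the figures quoted in this message; the variable names are made up for the example, and the number of individual column inserts per group is not known, so the rates are in groups per second rather than inserts per second.

```python
# Back-of-the-envelope throughput from the figures quoted in the message.
groups = 250_000

super_seconds = 1.5 * 3600   # design 1 (super columns): about 1.5 hours
standard_seconds = 45 * 60   # design 2 (regular column family): 45 minutes

super_rate = groups / super_seconds        # roughly 46 groups per second
standard_rate = groups / standard_seconds  # roughly 93 groups per second
slowdown = super_seconds / standard_seconds  # design 1 took twice as long

print(f"super columns:  {super_rate:.1f} groups/s")
print(f"regular family: {standard_rate:.1f} groups/s")
print(f"slowdown:       {slowdown:.1f}x")
```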

We are wondering why we get lower insert performance when using super columns.

Thanks,

Carlos



