incubator-cassandra-user mailing list archives

From Allan C <alla...@gmail.com>
Subject Re: Multiget performance
Date Fri, 11 Apr 2014 18:02:09 GMT
It’s a fairly standard relational-like CF. The description column is the only field that’s
potentially big (it can be up to 1k).

CREATE COLUMN FAMILY 'Event' WITH
  key_validation_class = 'UTF8Type' AND
  comparator = 'UTF8Type' AND
  default_validation_class = 'UTF8Type' AND
  bloom_filter_fp_chance = 0.1 AND
  compaction_strategy = 'LeveledCompactionStrategy' AND
  compaction_strategy_options = {sstable_size_in_mb:160} AND
  compression_options = {sstable_compression:SnappyCompressor,chunk_length_kb:64} AND
--  key_alias = 'eventId' AND
  column_metadata = [
      {column_name: 'createdAt', validation_class: 'DateType'},
      {column_name: 'creatorId', validation_class: 'UTF8Type'},
      {column_name: 'creatorName', validation_class: 'UTF8Type'},
      {column_name: 'description', validation_class: 'UTF8Type'},
      {column_name: 'privacy', validation_class: 'UTF8Type'},
      {column_name: 'location', validation_class: 'UTF8Type'},
      {column_name: 'locationId', validation_class: 'UTF8Type'},
      {column_name: 'endTime', validation_class: 'DateType'},
      {column_name: 'name', validation_class: 'UTF8Type'},
      {column_name: 'picture', validation_class: 'UTF8Type'},
      {column_name: 'startTime', validation_class: 'DateType'},
      {column_name: 'updatedAt', validation_class: 'DateType'},

      {column_name: 'lat', validation_class: 'UTF8Type'},
      {column_name: 'lng', validation_class: 'UTF8Type'},
      {column_name: 'street', validation_class: 'UTF8Type'},
      {column_name: 'city', validation_class: 'UTF8Type'},
      {column_name: 'state', validation_class: 'UTF8Type'},
      {column_name: 'zip', validation_class: 'UTF8Type'},
      {column_name: 'country', validation_class: 'UTF8Type'},

      {column_name: '~lastSync', validation_class: 'DateType'},
      {column_name: '~nextSync', validation_class: 'DateType'},

      {column_name: '~syncBlock', validation_class: 'IntegerType'},

      {column_name: 'noCount', validation_class: 'IntegerType'},
      {column_name: 'invitedCount', validation_class: 'IntegerType'},
      {column_name: 'maybeCount', validation_class: 'IntegerType'},
      {column_name: 'yesCount', validation_class: 'IntegerType'},

      {column_name: '~version', validation_class: 'IntegerType'}
];


-Allan

On April 10, 2014 at 4:49:34 PM, Tyler Hobbs (tyler@datastax.com) wrote:


On Thu, Apr 10, 2014 at 6:26 PM, Allan C <allanca@gmail.com> wrote:

Looks like the amount of data returned has a big effect. When I return only one column, Python
reports only 20ms, compared to 150ms when returning the whole row. Rows are each less than
1k in size, but there must be client overhead.

That's a surprising amount of overhead in pycassa.  What's your schema like for this CF?
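
One way to reproduce the single-column versus whole-row comparison described above is a small timing helper. This is only a sketch, not the code used in the thread: the helper name time_multiget, the repeats parameter, and the keyspace and host in the usage comment are all assumptions made for illustration. It accepts any object that exposes a pycassa-style multiget(keys, columns=...) method.

```python
from timeit import default_timer


def time_multiget(cf, keys, columns=None, repeats=5):
    """Return the best wall-clock time, in milliseconds, of cf.multiget().

    cf is assumed to be a pycassa ColumnFamily (or anything with the same
    multiget(keys, columns=...) method). columns=None requests whole rows.
    """
    best_ms = float("inf")
    for _ in range(repeats):
        start = default_timer()
        cf.multiget(keys, columns=columns)
        elapsed_ms = (default_timer() - start) * 1000.0
        best_ms = min(best_ms, elapsed_ms)
    return best_ms


# Hypothetical usage against a live cluster (keyspace and host are placeholders):
#   import pycassa
#   pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
#   cf = pycassa.ColumnFamily(pool, 'Event')
#   one_column_ms = time_multiget(cf, keys, columns=['name'])
#   whole_row_ms = time_multiget(cf, keys)
```

Taking the best of several runs reduces noise from connection setup and cache warm-up, so the remaining gap between the two numbers is more likely to reflect the extra columns being transferred and unpacked on the client.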


--
Tyler Hobbs
DataStax
