cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kenneth Brotman" <kenbrot...@yahoo.com.INVALID>
Subject RE: data modelling
Date Wed, 06 Mar 2019 01:09:56 GMT
You definitely don’t need a secondary index.  A MV might be the answer.  

 

How many tagids does a sensor have ?

Do you have to use a collection for tagids?

How many sensors would you expect to have a particular tagid?

Would you know the customerid and sensorid and be able to specify that in the query?

 

If you could have tagid not be a collection, and make it part of the primary key, that would
help a lot.

  

 

From: Kenneth Brotman [mailto:kenbrotman@yahoo.com.INVALID] 
Sent: Tuesday, March 05, 2019 4:33 PM
To: user@cassandra.apache.org
Subject: RE: data modelling

 

Hi Bobbie,

 

You’re not giving enough information to model the data.  With Cassandra it’s based on
the queries you are going to need.  This link to Jeffrey Carpenter’s book, Cassandra the
Definitive Guide, Chapter 5, which is on how to do data modeling for Cassandra, should be
of help to you: https://books.google.com/books?id=uW-PDAAAQBAJ <https://books.google.com/books?id=uW-PDAAAQBAJ&pg=PA79&lpg=PA79&dq=jeff+carpenter+chapter+5&source=bl&ots=58cM-BII2M&sig=ACfU3U0-188Fw-jcj1tbMItdPlNH8Lk9yQ&hl=en&sa=X&ved=2ahUKEwinrY3OoezgAhWoHDQIHRfmA7IQ6AEwA3oECAcQAQ#v=onepage&q=jeff%20carpenter%20chapter%205&f=false>
&pg=PA79&lpg=PA79&dq=jeff+carpenter+chapter+5&source=bl&ots=58cM-BII2M&sig=ACfU3U0-188Fw-jcj1tbMItdPlNH8Lk9yQ&hl=en&sa=X&ved=2ahUKEwinrY3OoezgAhWoHDQIHRfmA7IQ6AEwA3oECAcQAQ#v=onepage&q=jeff%20carpenter%20chapter%205&f=false

 

 

 

From: Bobbie Haynes [mailto:haynes30349@gmail.com] 
Sent: Tuesday, March 05, 2019 4:19 PM
To: user@cassandra.apache.org
Subject: data modelling

 

Hi 

   Could you help  modelling this usecase 

 

   I have below table ..I will update tagid's columns set(bigit) based on PK. I have created
the secondary index column on tagid to query like below..

 

Select * from keyspace.customer_sensor_tagids where tagids CONTAINS 11358097;

 

this query is doing the range scan because of the secondary index.. and causing performance
issues 

 

If i create a MV on Tagid's can i be able to query like above.. please suggest a Datamodel
for this scenario.Apprecite your help on this.

-----------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------

example of Tagids for each row:-

   4608831, 608886, 608890, 609164, 615024, 679579, 814791, 830404, 71756, 8538307, 9936868,
10883336, 10954034, 10958062, 10976553, 10976554, 10980255, 11009971, 11043805, 11075379,
11078819, 11167844, 11358097, 11479340, 11481769, 11481770, 11481771, 11481772, 11693597,
11709012, 12193230, 12421500, 12421516, 12421781, 12422011, 12422368, 12422501, 12422512,
12422553, 12422555, 12423381, 12423382

 

   -----------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------


 

   CREATE TABLE keyspace.customer_sensor_tagids (

    customerid bigint,

    sensorid bigint,

    XXX frozen<activity>,

    XXX frozen<asset>,

    XXX text,

    XXX text,

    XXX frozen<hardware>,

    XXX bigint,

    XXX bigint,

    XXX list<frozen<interface>>,

    XXX frozen<inventory>,

    XXX boolean,

    XXX bigint,

    XXX list<frozen<openport>>,

    XXX frozen<operatingsystem>,

    XXX bigint,

    XXX bigint,

    XXX list<frozen<processor>>,

    XXX list<frozen<service>>,

    XXX set<frozen<software>>,

    XXX set<bigint>,

    XXX set<bigint>,

    tagids set<bigint>,

    XXX bigint,

    XXX list<frozen<volume>>,

    PRIMARY KEY ((customerid, sensorid))

) WITH bloom_filter_fp_chance = 0.01

    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}

    AND comment = ''

    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}

    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}

    AND crc_check_chance = 1.0

    AND dclocal_read_repair_chance = 0.1

    AND default_time_to_live = 0

    AND gc_grace_seconds = 864000

    AND max_index_interval = 2048

    AND memtable_flush_period_in_ms = 0

    AND min_index_interval = 128

    AND read_repair_chance = 0.0

    AND speculative_retry = '99PERCENTILE';

CREATE INDEX XXX ON keyspace.customer_sensor_tagids (values(tagids));

CREATE INDEX XXX ON keyspace.customer_sensor_tagids (values(XXX));

CREATE INDEX XXX ON keyspace.customer_sensor_tagids (XXX);

CREATE INDEX XXX ON keyspace.customer_sensor_tagids (XXX);

CREATE INDEX XXX ON keyspace.customer_sensor_tagids (XXX);

CREATE INDEX XXX ON keyspace.customer_sensor_tagids (XXX);

CREATE INDEX XXX ON keyspace.customer_sensor_tagids (values(XXX));

CREATE INDEX XXX ON keyspace.customer_sensor_tagids (XXX);


Mime
View raw message