cassandra-user mailing list archives

From "dinesh.joshi@yahoo.com.INVALID" <dinesh.jo...@yahoo.com.INVALID>
Subject Re: Modeling Time Series data
Date Sat, 12 Jan 2019 01:09:25 GMT
Hi Akash,
There are a lot of interesting articles written around this topic.
   - http://thelastpickle.com/blog/2017/08/02/time-series-data-modeling-massive-scale.html
   - https://medium.com/netflix-techblog/scaling-time-series-data-storage-part-i-ec2b6d44ba39

You shouldn't need to worry about hotspots if you select the partition key carefully and your
cluster is configured properly. Please go through the links, and if anything is still unclear,
feel free to ask more questions here.
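
As a rough illustration of what selecting the partition key carefully can mean: Cassandra
places a row by hashing its partition key only, so the table name plays no part in placement,
and a key of just (year, month, day, hour) sends a whole hour of traffic to one replica set.
The Python sketch below is a toy model, not Cassandra code. It assumes a hypothetical 6-node
cluster, uses MD5 modulo the node count as a stand-in for the real partitioner and token ring,
and adds a hypothetical bucket component derived from a made-up source id.

```python
import hashlib
from collections import Counter
from datetime import datetime, timezone

NUM_NODES = 6     # hypothetical cluster size (assumption)
NUM_BUCKETS = 16  # hypothetical number of buckets per hour (assumption)


def node_for(partition_key: tuple) -> int:
    """Toy placement: hash the partition key and map it onto a node.

    MD5 modulo NUM_NODES is only a stand-in for the real partitioner
    and token ring; note that no table name is involved.
    """
    digest = hashlib.md5(repr(partition_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_NODES


def time_only_key(ts: datetime) -> tuple:
    """Partition key from the question: (year, month, day, hour)."""
    return (ts.year, ts.month, ts.day, ts.hour)


def bucketed_key(ts: datetime, source_id: str) -> tuple:
    """Same time components plus a bucket derived from a source id."""
    digest = hashlib.md5(source_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % NUM_BUCKETS
    return (ts.year, ts.month, ts.day, ts.hour, bucket)


ts = datetime(2019, 1, 11, 22, tzinfo=timezone.utc)
sources = [f"sensor-{i}" for i in range(1000)]  # made-up writers

# Count how many writes each node receives during this one hour.
time_only_nodes = Counter(node_for(time_only_key(ts)) for _ in sources)
bucketed_nodes = Counter(node_for(bucketed_key(ts, s)) for s in sources)

print("time-only key, writes per node:", dict(time_only_nodes))
print("bucketed key, writes per node: ", dict(bucketed_nodes))
```

With the time-only key every write in the hour lands on a single node in this model, while the
bucketed key spreads them out. The trade-off is that reading a full hour then means querying
every bucket for that hour.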
Thanks,
Dinesh 

    On Friday, January 11, 2019, 2:45:42 PM PST, Akash Gangil <akashg1611@gmail.com> wrote:
 
 Hi, 

I have a data model where the partition key for a lot of tables is based on time
(year, month, day, hour).
Would this create a hotspot in my cluster, given that all the writes/reads would go to the same
node for a given hour? Or does the Cassandra storage engine also take table info, such as the
table name, into account when distributing the data?
If the above model would be a problem, what's the suggested way to solve it? Add the table name
to the partition key?

-- 
Akash
  