cassandra-user mailing list archives

From Anuj Wadehra <anujw_2...@yahoo.co.in>
Subject Re: RE: Manual Indexing With Buckets
Date Wed, 29 Jul 2015 00:54:43 GMT
Any more thoughts? Anyone?


Thanks

Anuj

Sent from Yahoo Mail on Android

From:"Anuj Wadehra" <anujw_2003@yahoo.co.in>
Date:Sat, 25 Jul, 2015 at 5:14 pm
Subject:Re: RE: Manual Indexing With Buckets

We are in product development, and batch size depends on the customer base of the customer
buying our product. Large customers may have huge batches, while small customers may have
much smaller ones. So we don't know upfront how many buckets per batch will be required,
and we don't want to ask customers for additional configuration such as an average batch
size. So we are planning to use dynamic bucketing. Every row in the primary table is
associated with only one batch.


Comments required on the following:

1. Are there any suggestions on the proposed design?

2. What's the best approach for updating/deleting from the index table? When a row is manually
purged from the primary table, we don't know which of the x buckets created for its batch id
contains that row key.

 

Thanks

Anuj


From:"SEAN_R_DURITY@homedepot.com" <SEAN_R_DURITY@homedepot.com>
Date:Fri, 24 Jul, 2015 at 5:39 pm
Subject:RE: Manual Indexing With Buckets

It is a bit hard to follow. Perhaps you could include your proposed schema (annotated with
your size predictions) to spur more discussion. To me, it sounds a bit convoluted. Why is
a “batch” so big (up to 100 million rows)? Is a row in the primary only associated with
one batch?

 

 

Sean Durity – Cassandra Admin, Big Data Team


 

From: Anuj Wadehra [mailto:anujw_2003@yahoo.co.in] 
Sent: Friday, July 24, 2015 3:57 AM
To: user@cassandra.apache.org
Subject: Re: Manual Indexing With Buckets

 

Can anyone take this one?

 

Thanks

Anuj


From:"Anuj Wadehra" <anujw_2003@yahoo.co.in>
Date:Thu, 23 Jul, 2015 at 10:57 pm
Subject:Manual Indexing With Buckets

We have a primary table and need search capability by the batchid column, so we are creating
a manual index for search by batch id. We are using buckets to restrict the size of a row in
the batch id index table to 50 MB. As batch size may vary drastically (i.e. one batch id may
be associated with 100k row keys in the primary table while another may be associated with
100 million row keys), we are creating a metadata table to track the approximate amount of
data inserted for a batch in the primary table, so that the batch id index table has a dynamic
number of buckets/rows. As more data is inserted for a batch in the primary table, a new set
of 10 buckets is added. At any point in time, clients write to the latest 10 buckets created
for a batch id index in round-robin order to avoid hotspots.
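
A minimal sketch of the bucket selection described above, in Python for illustration only.
It assumes the metadata table exposes an approximate row count per batch and that a new set
of 10 buckets is opened every ROWS_PER_BUCKET_SET rows. That threshold, and all function and
class names, are hypothetical and not part of the original design.

```python
import itertools

BUCKETS_PER_SET = 10             # from the post: buckets are added 10 at a time
ROWS_PER_BUCKET_SET = 1_000_000  # hypothetical threshold for opening a new set


def latest_buckets(approx_rows: int) -> range:
    """Return the bucket numbers of the newest bucket set for a batch.

    approx_rows is the approximate row count tracked in the metadata table.
    """
    set_index = approx_rows // ROWS_PER_BUCKET_SET
    start = set_index * BUCKETS_PER_SET
    return range(start, start + BUCKETS_PER_SET)


class BucketChooser:
    """Round-robin writer over the latest bucket set of one batch."""

    def __init__(self) -> None:
        self._counter = itertools.count()

    def next_bucket(self, approx_rows: int) -> int:
        buckets = latest_buckets(approx_rows)
        return buckets[next(self._counter) % BUCKETS_PER_SET]


def index_partition_key(batch_id: str, bucket: int) -> tuple:
    """Key of one row in the batch id index table (assumed layout)."""
    return (batch_id, bucket)
```

Each client would call next_bucket() with the current approximate count before writing an
index entry, so writes spread over the newest 10 buckets only.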

 

Comments required on the following:

1. Are there any suggestions on the above design?

 

2. What's the best approach for updating/deleting from the index table? When a row is manually
purged from the primary table, we don't know which of the x buckets created for its batch id
contains that row key.
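
To make the purge problem in point 2 concrete, here is a small Python sketch. An in-memory
dict stands in for the index table, and the helper names are hypothetical. Because the bucket
was chosen round-robin at write time, nothing in the row key says where its index entry
lives, so in the worst case every bucket of the batch has to be touched.

```python
from collections import defaultdict

# Stand-in for the batch id index table: (batch_id, bucket) -> set of row keys.
index = defaultdict(set)


def add_entry(batch_id: str, bucket: int, row_key: str) -> None:
    """Record row_key in the given bucket of the batch's index."""
    index[(batch_id, bucket)].add(row_key)


def purge_entry(batch_id: str, row_key: str, bucket_count: int) -> int:
    """Remove row_key from the index; return how many buckets were probed."""
    probed = 0
    for bucket in range(bucket_count):
        probed += 1
        keys = index.get((batch_id, bucket))
        if keys and row_key in keys:
            keys.discard(row_key)
            break
    return probed
```

The number of probes grows with the number of buckets created for the batch, which is the
cost the question is asking how to avoid.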

 

Thanks

Anuj


 




