Date: Wed, 30 Jul 2014 12:01:44 +0000 (UTC)
From: "Aleksey Yeschenko (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Comment Edited] (CASSANDRA-7643) Cassandra Schema Template

[ https://issues.apache.org/jira/browse/CASSANDRA-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14079221#comment-14079221 ]

Aleksey Yeschenko edited comment on CASSANDRA-7643 at 7/30/14 12:01 PM:
------------------------------------------------------------------------

First, let me agree with [~brandon.williams] here. Having a huge number of tables (more than a couple hundred) is not a good idea in C*, and schema is not even the main reason behind that - the SlabAllocator reserving 1MB per table is.
That said, there is (or should be) a JIRA issue somewhere with a slightly similar, accepted idea - a form of CREATE TABLE that creates a duplicate of another table with a different name (e.g. CREATE TABLE new_table LIKE old_table). old_table and new_table wouldn't be linked, however, so future changes to old_table wouldn't alter new_table.

As for the patch itself - it's definitely too invasive for inclusion into 2.0, and it's too late at this point to include it into 2.1. And as for 3.0, the schema tables are going to be very different there - see CASSANDRA-6717 (which is almost complete). Besides that, I'm also planning to modify the schema propagation code to make both schema pulls and pushes a lot more efficient.

To sum it up:
- the earliest this *could* be included into mainline is 3.0
- the 3.0 schema tables, and the schema code in general, will be different enough that the patch would have to be rewritten from scratch
- between CREATE TABLE LIKE and the generally improved schema code in 3.0, the need for this particular feature is not as clear
- supporting large numbers of tables is not our explicit design goal (although the schema part of the issue will decrease a lot in 3.0)

was (Author: iamaleksey): [the earlier revision of this comment was identical, except "huge number of tables" read "huge number of columns"]

> Cassandra Schema Template
> --------------------------
>
>                  Key: CASSANDRA-7643
>                  URL: https://issues.apache.org/jira/browse/CASSANDRA-7643
>              Project: Cassandra
>           Issue Type: New Feature
>           Components: Core
>             Reporter: Cheng Ren
>             Priority: Minor
>          Attachments: patch.diff
>
>
> Cassandra schema changes are a performance pain point for us, since schema is global information shared across the entire cluster. Our production Cassandra cluster consists of many sets of column families - 1000 column families in total, with 38301 columns, summing to 3.2MB of schema.
> We have a data model where the primary key is split into two parts, K1 and K2. Say the cardinality of set K1 is small. We also have a constraint that we frequently want to scan all rows that belong to a particular value of K1.
> In this case Cassandra offers two possible solutions:
> 1) Create a single table with a composite key (K1, K2)
> 2) Create a table per K1, with primary key K2
> In option #1 the number of tables is only 1, but we lose the ability to easily scan all rows with K1 = X without paying the penalty of reading all rows in the table.
> Option #2 gives us the freedom to scan only a particular value of K1, but it leads to a significant, potentially unbounded increase in the number of tables.
> However, if the size of set K1 is relatively small, this is a feasible option with a cleaner data interface.
> An example of this data model is a set of merchants with products: K1 = merchant_id and K2 = product_id. The number of merchants is still very small compared to the number of products.
> Option #2 is our solution, since the size of set K1 is relatively small for us, but it also creates a fair number of tables per K1 value which have exactly the same columns and metadata. Whenever we need to add/drop one attribute for all of our per-K1 tables, it puts a lot of load on the entire cluster, and all backend pipelines are affected, or even have to be shut down, to accommodate the schema change.
> To reduce the load of this kind of schema change, we came up with a new feature called "template". We can create a template, and then create tables with that template, e.g.:
> {code}
> create template template_table ( block_id text, PRIMARY KEY (block_id));
> create table table_a, table_b, table_c with template_table;
> {code}
> This allows us to reduce the time spent on metadata gossip. Moreover, when we need to add one more attribute for all of our merchants, we just need to alter the template:
> {code}
> alter template template_table add foo text;
> {code}
> which also alters table_a, table_b, table_c.
> We changed the system keyspace a bit to accommodate the template feature:
> schema_columnfamilies only stores the metadata of templates and non-templated column families.
> schema_columns only stores the column info of templates and non-templated cfs.
> We also added a new table in the system keyspace called schema_columnfamilies_templated, which manages the mapping relationship between templates and templated cfs, like this:
> schema_columnfamilies_templated:
> keyspace, columnfamily_name, template_name
> XXX, table_a, template_table
> XXX, table_b, template_table
> XXX, table_c, template_table
> We already have some performance results from our 15-node cluster.
> Normally, creating 400 tables takes hours for all the migration-stage tasks to complete, but if we create 400 tables with templates, it takes just 1 to 2 seconds. It also works great for ALTER TABLE.
> We believe what we're proposing here can be very useful for other people in the Cassandra community as well. Attached is our proposed patch for the template schema feature. Is it possible for the community to consider accepting this patch into the main branch of the latest Cassandra? Or would you mind providing us feedback? Please let us know if you have any concerns or suggestions regarding the change.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
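[Editorial sketch] The two modeling options from the issue description can be written out in CQL as follows. The table and column names (products_by_merchant, products_merchant_a, price) are illustrative, following the merchant/product example above; only the K1/K2 key structure comes from the issue.

{code}
-- Option #1: one table, K1 (merchant_id) as the partition key,
-- K2 (product_id) as a clustering column
CREATE TABLE products_by_merchant (
    merchant_id text,      -- K1
    product_id  text,      -- K2
    price       decimal,   -- illustrative payload column
    PRIMARY KEY (merchant_id, product_id)
);

-- Option #2: one table per K1 value, primary key K2 only
CREATE TABLE products_merchant_a (
    product_id text PRIMARY KEY,   -- K2
    price      decimal
);
-- ...repeated for every merchant, hence the potentially
-- unbounded growth in the number of tables.
{code}

Option #2 is what the proposed template feature targets: all per-merchant tables share identical columns and metadata, so a single ALTER on the template would replace hundreds of individual schema migrations.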