cassandra-commits mailing list archives

From slebre...@apache.org
Subject svn commit: r1430925 - /cassandra/site/publish/doc/cql3/CQL.html
Date Wed, 09 Jan 2013 16:04:34 GMT
Author: slebresne
Date: Wed Jan  9 16:04:33 2013
New Revision: 1430925

URL: http://svn.apache.org/viewvc?rev=1430925&view=rev
Log:
Fix CQL3 ref doc on website

Modified:
    cassandra/site/publish/doc/cql3/CQL.html

Modified: cassandra/site/publish/doc/cql3/CQL.html
URL: http://svn.apache.org/viewvc/cassandra/site/publish/doc/cql3/CQL.html?rev=1430925&r1=1430924&r2=1430925&view=diff
==============================================================================
--- cassandra/site/publish/doc/cql3/CQL.html (original)
+++ cassandra/site/publish/doc/cql3/CQL.html Wed Jan  9 16:04:33 2013
@@ -87,7 +87,7 @@ CREATE TABLE timeline (
     other text,
     PRIMARY KEY (k)
 )
-</pre></pre><p>Moreover, a table must define at least one column that is
not part of the PRIMARY KEY as a row exists in Cassandra only if it contains at least one
value for one such column.</p><h4 id="createTablepartitionClustering">Partition
key and clustering</h4><p>In CQL, the order in which columns are defined for the
<code>PRIMARY KEY</code> matters. The first column of the key is called the <i>partition
key</i>. It has the property that all the rows sharing the same partition key (even
across tables, in fact) are stored on the same physical node. Also, insertions, updates and
deletions on rows sharing the same partition key for a given table are performed <i>atomically</i>
and in <i>isolation</i>. Note that it is possible to have a composite partition
key, i.e. a partition key formed of multiple columns, using an extra set of parentheses to
define which columns form the partition key.</p><p>The remaining columns of the
<code>PRIMARY KEY</code> definition, if any, are called <i>clustering keys</i>. On a given physical node, rows for a given partition key are stored
in the order induced by the clustering keys, making the retrieval of rows in that clustering
order particularly efficient (see <a href="#selectStmt"><tt>SELECT</tt></a>).</p><h4
id="createTableOptions"><code>&lt;option></code></h4><p>The
<code>CREATE TABLE</code> statement supports a number of options that control
the configuration of a new table. These options can be specified after the <code>WITH</code>
keyword.</p><p>The first of these options is <code>COMPACT STORAGE</code>.
This option is mainly targeted at backward compatibility with table definitions created
before CQL3. It also provides a slightly more compact layout of data on disk, though
at the price of flexibility and extensibility, and for that reason it is not recommended
except for backward compatibility. The restriction for tables with <code>COMPACT
STORAGE</code> is that they support one and only one column outside of those that are
part of the <code>PRIMARY KEY</code>. It also follows that columns cannot be added or
removed after creation. A table with <code>COMPACT STORAGE</code> must also define at
least one <a href="#createTablepartitionClustering">clustering key</a>.</p><p>Another
option is <code>CLUSTERING ORDER</code>. It allows defining the ordering of rows on
disk. It takes the list of the clustering key names with, for each of them, the on-disk
order (ascending or descending). Note that this option affects
<a href="#selectOrderBy">which <code>ORDER BY</code> clauses are allowed
during <code>SELECT</code></a>.</p><p>Table creation supports
the following other <code>&lt;property></code>:</p><table><tr><th>option
                   </th><th>kind   </th><th>default   </th><th>description</th></tr><tr><td><code>comment</code>
                   </td><td><em>simple</em> </td><td>none
       </td><td>A free-form, human-readable comment.</td></tr><tr><td><code>read_repair_chance</code>         </td><td><em>simple</em> </td><td>0.1
        </td><td>The probability with which to query extra nodes (e.g. more nodes
than required by the consistency level) for the purpose of read repairs.</td></tr><tr><td><code>dclocal_read_repair_chance</code>
</td><td><em>simple</em> </td><td>0           </td><td>The
probability with which to query extra nodes (e.g. more nodes than required by the consistency
level) belonging to the same data center as the read coordinator for the purpose of read
repairs.</td></tr><tr><td><code>gc_grace_seconds</code>
          </td><td><em>simple</em> </td><td>864000   
  </td><td>Time to wait before garbage collecting tombstones (deletion markers).</td></tr><tr><td><code>bloom_filter_fp_chance</code>
    </td><td><em>simple</em> </td><td>0.00075     </td><td>The
target probability of false positives for the sstable bloom filters. Said bloom filters will
be sized to provide that probability (thus lowering this value impacts the size of bloom
filters in-memory and on-disk)</td></tr><tr><td><code>compaction</code>
                </td><td><em>map</em>    </td><td><em>see
below</em> </td><td>The compaction options to use, see below.</td></tr><tr><td><code>compression</code>
               </td><td><em>map</em>    </td><td><em>see
below</em> </td><td>Compression options, see below. </td></tr><tr><td><code>replicate_on_write</code>
        </td><td><em>simple</em> </td><td>true       
</td><td>Whether to replicate data on write. This can only be set to false for
tables with counter values. Disabling this is dangerous and can result in random loss of
counters; don&#8217;t disable it unless you are sure you know what you are doing</td></tr><tr><td><code>caching</code>
                   </td><td><em>simple</em> </td><td>keys_only
  </td><td>Whether to cache keys (&#8220;key cache&#8221;) and/or rows
(&#8220;row cache&#8221;) for this table. Valid values are: <code>all</code>,
<code>keys_only</code>, <code>rows_only</code> and <code>none</code>. </td></tr></table><h4
id="compactionOptions"><code>compaction</code> options</h4><p>The
<code>compaction</code> property must at least define the <code>'class'</code>
sub-option, which defines the compaction strategy class to use. The default supported classes
are <code>'SizeTieredCompactionStrategy'</code> and <code>'LeveledCompactionStrategy'</code>.
A custom strategy can be provided by specifying the full class name as a <a href="#constants">string
constant</a>. The rest of the sub-options depend on the chosen class. The sub-options
supported by the default classes are:</p><table><tr><th>option   
                    </th><th>supported compaction strategy </th><th>default
</th><th>description </th></tr><tr><td><code>tombstone_threshold</code>
          </td><td><em>all</em>                           </td><td>0.2
      </td><td>A ratio such that if an sstable has more than this ratio of gcable
tombstones over all contained columns, the sstable will be compacted (with no other sstables)
for the purpose of purging those tombstones.
</td></tr><tr><td><code>tombstone_compaction_interval</code>
</td><td><em>all</em>                           </td><td>1
day     </td><td>The minimum time to wait after an sstable creation time before
considering it for &#8220;tombstone compaction&#8221;, where &#8220;tombstone
compaction&#8221; is the compaction triggered if the sstable has more gcable tombstones
than <code>tombstone_threshold</code>. </td></tr><tr><td><code>min_sstable_size</code>
             </td><td>SizeTieredCompactionStrategy    </td><td>50MB
     </td><td>The size tiered strategy groups SSTables to compact in buckets.
A bucket groups SSTables that differ by less than 50% in size.  However, for small sizes,
this would result in a bucketing that is too fine grained. <code>min_sstable_size</code>
defines a size threshold (in bytes) below which all SSTables belong to one unique bucket</td></tr><tr><td><code>min_compaction_threshold</code>      </td><td>SizeTieredCompactionStrategy  
 </td><td>4         </td><td>Minimum number of SSTables needed to
start a minor compaction.</td></tr><tr><td><code>max_compaction_threshold</code>
     </td><td>SizeTieredCompactionStrategy    </td><td>32        </td><td>Maximum
number of SSTables processed by one minor compaction.</td></tr><tr><td><code>bucket_low</code>
                   </td><td>SizeTieredCompactionStrategy    </td><td>0.5
      </td><td>The size tiered strategy considers sstables to be within the same bucket if their
size is within [average_size * <code>bucket_low</code>, average_size * <code>bucket_high</code>]
(i.e. the default groups sstables whose sizes diverge by at most 50%)</td></tr><tr><td><code>bucket_high</code>
                  </td><td>SizeTieredCompactionStrategy    </td><td>1.5
      </td><td>The size tiered strategy considers sstables to be within the same bucket if their
size is within [average_size * <code>bucket_low</code>, average_size * <code>bucket_high</code>]
(i.e. the default groups sstables whose sizes diverge by at most 50%).</td></tr><tr><td><code>sstable_size_in_mb</code>
           </td><td>LeveledCompactionStrategy       </td><td>5MB 
     </td><td>The target size (in MB) for sstables in the leveled strategy. Note
that while sstable sizes should stay less than or equal to <code>sstable_size_in_mb</code>,
it is possible to exceptionally have a larger sstable, as during compaction data for a given
partition key is never split into two sstables</td></tr></table><p>For
the <code>compression</code> property, the following default sub-options are available:</p><table><tr><th>option
             </th><th>default        </th><th>description </th></tr><tr><td><code>sstable_compression</code>
</td><td>SnappyCompressor </td><td>The compression algorithm to use.
Default compressors are: SnappyCompressor and DeflateCompressor. Use an empty string (<code>''</code>)
to disable compression. A custom compressor can be provided by specifying the full class
name as a <a href="#constants">string constant</a>.</td></tr><tr><td><code>chunk_length_kb</code>
    </td><td>64KB             </td><td>On disk SSTables are compressed
by block (to allow random reads). This defines the size (in KB) of said block. Bigger values
may improve the compression rate, but increase the minimum size of data to be read from disk
for a read </td></tr><tr><td><code>crc_check_chance</code>
   </td><td>1.0              </td><td>When compression is enabled,
each compressed block includes a checksum of that block for the purpose of detecting disk
bitrot and avoiding the propagation of corruption to other replicas. This option defines the
probability with which those checksums are checked during read. By default they are always
checked. Set to 0 to disable checksum checking and to 0.5 for instance to check them every
other read</td></tr></table><h4 id="Otherconsiderations">Other considerations:</h4><ul><li>When
<a href="#insertStmt">inserting</a> / <a href="#updateStmt">updating</a> a given row, not all
columns need to be defined (except for those part of the key), and missing columns occupy
no space on disk. Furthermore, adding new columns (see <a href="#alterStmt"><tt>ALTER
TABLE</tt></a>) is a constant time operation. There is thus no need to try to
anticipate future usage (or to cry when you haven&#8217;t) when creating a table.</li></ul><h3
id="alterTableStmt">ALTER TABLE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre>&lt;alter-table-stmt> ::= ALTER (TABLE | COLUMNFAMILY)
&lt;tablename> &lt;instruction>
+</pre></pre><p>Moreover, a table must define at least one column that is
not part of the PRIMARY KEY as a row exists in Cassandra only if it contains at least one
value for one such column.</p><h4 id="createTablepartitionClustering">Partition
key and clustering</h4><p>In CQL, the order in which columns are defined for the
<code>PRIMARY KEY</code> matters. The first column of the key is called the <i>partition
key</i>. It has the property that all the rows sharing the same partition key (even
across tables, in fact) are stored on the same physical node. Also, insertions, updates and
deletions on rows sharing the same partition key for a given table are performed <i>atomically</i>
and in <i>isolation</i>. Note that it is possible to have a composite partition
key, i.e. a partition key formed of multiple columns, using an extra set of parentheses to
define which columns form the partition key.</p><p>The remaining columns of the
<code>PRIMARY KEY</code> definition, if any, are called <i>clustering keys</i>. On a given physical node, rows for a given partition key are stored
in the order induced by the clustering keys, making the retrieval of rows in that clustering
order particularly efficient (see <a href="#selectStmt"><tt>SELECT</tt></a>).</p><h4
id="createTableOptions"><code>&lt;option></code></h4><p>The
<code>CREATE TABLE</code> statement supports a number of options that control
the configuration of a new table. These options can be specified after the <code>WITH</code>
keyword.</p><p>The first of these options is <code>COMPACT STORAGE</code>.
This option is mainly targeted at backward compatibility with table definitions created
before CQL3. It also provides a slightly more compact layout of data on disk, though
at the price of flexibility and extensibility, and for that reason it is not recommended
except for backward compatibility. The restriction for tables with <code>COMPACT
STORAGE</code> is that they support one and only one column outside of those that are
part of the <code>PRIMARY KEY</code>. It also follows that columns cannot be added or
removed after creation. A table with <code>COMPACT STORAGE</code> must also define at
least one <a href="#createTablepartitionClustering">clustering key</a>.</p><p>Another
option is <code>CLUSTERING ORDER</code>. It allows defining the ordering of rows on
disk. It takes the list of the clustering key names with, for each of them, the on-disk
order (ascending or descending). Note that this option affects
<a href="#selectOrderBy">which <code>ORDER BY</code> clauses are allowed
during <code>SELECT</code></a>.</p><p>Table creation supports
the following other <code>&lt;property></code>:</p><table><tr><th>option
                   </th><th>kind   </th><th>default   </th><th>description</th></tr><tr><td><code>comment</code>
                   </td><td><em>simple</em> </td><td>none
       </td><td>A free-form, human-readable comment.</td></tr><tr><td><code>read_repair_chance</code>         </td><td><em>simple</em> </td><td>0.1
        </td><td>The probability with which to query extra nodes (e.g. more nodes
than required by the consistency level) for the purpose of read repairs.</td></tr><tr><td><code>dclocal_read_repair_chance</code>
</td><td><em>simple</em> </td><td>0           </td><td>The
probability with which to query extra nodes (e.g. more nodes than required by the consistency
level) belonging to the same data center as the read coordinator for the purpose of read
repairs.</td></tr><tr><td><code>gc_grace_seconds</code>
          </td><td><em>simple</em> </td><td>864000   
  </td><td>Time to wait before garbage collecting tombstones (deletion markers).</td></tr><tr><td><code>bloom_filter_fp_chance</code>
    </td><td><em>simple</em> </td><td>0.00075     </td><td>The
target probability of false positives for the sstable bloom filters. Said bloom filters will
be sized to provide that probability (thus lowering this value impacts the size of bloom
filters in-memory and on-disk)</td></tr><tr><td><code>compaction</code>
                </td><td><em>map</em>    </td><td><em>see
below</em> </td><td>The compaction options to use, see below.</td></tr><tr><td><code>compression</code>
               </td><td><em>map</em>    </td><td><em>see
below</em> </td><td>Compression options, see below. </td></tr><tr><td><code>replicate_on_write</code>
        </td><td><em>simple</em> </td><td>true       
</td><td>Whether to replicate data on write. This can only be set to false for
tables with counter values. Disabling this is dangerous and can result in random loss of
counters; don&#8217;t disable it unless you are sure you know what you are doing</td></tr><tr><td><code>caching</code>
                   </td><td><em>simple</em> </td><td>keys_only
  </td><td>Whether to cache keys (&#8220;key cache&#8221;) and/or rows
(&#8220;row cache&#8221;) for this table. Valid values are: <code>all</code>,
<code>keys_only</code>, <code>rows_only</code> and <code>none</code>. </td></tr></table><h4
id="compactionOptions"><code>compaction</code> options</h4><p>The
<code>compaction</code> property must at least define the <code>'class'</code>
sub-option, which defines the compaction strategy class to use. The default supported classes
are <code>'SizeTieredCompactionStrategy'</code> and <code>'LeveledCompactionStrategy'</code>.
A custom strategy can be provided by specifying the full class name as a <a href="#constants">string
constant</a>. The rest of the sub-options depend on the chosen class. The sub-options
supported by the default classes are:</p><table><tr><th>option   
                    </th><th>supported compaction strategy </th><th>default
</th><th>description </th></tr><tr><td><code>tombstone_threshold</code>
          </td><td><em>all</em>                           </td><td>0.2
      </td><td>A ratio such that if an sstable has more than this ratio of gcable
tombstones over all contained columns, the sstable will be compacted (with no other sstables)
for the purpose of purging those tombstones.
</td></tr><tr><td><code>tombstone_compaction_interval</code>
</td><td><em>all</em>                           </td><td>1
day     </td><td>The minimum time to wait after an sstable creation time before
considering it for &#8220;tombstone compaction&#8221;, where &#8220;tombstone
compaction&#8221; is the compaction triggered if the sstable has more gcable tombstones
than <code>tombstone_threshold</code>. </td></tr><tr><td><code>min_sstable_size</code>
             </td><td>SizeTieredCompactionStrategy    </td><td>50MB
     </td><td>The size tiered strategy groups SSTables to compact in buckets.
A bucket groups SSTables that differ by less than 50% in size.  However, for small sizes,
this would result in a bucketing that is too fine grained. <code>min_sstable_size</code>
defines a size threshold (in bytes) below which all SSTables belong to one unique bucket</td></tr><tr><td><code>min_threshold</code>                 </td><td>SizeTieredCompactionStrategy  
 </td><td>4         </td><td>Minimum number of SSTables needed to
start a minor compaction.</td></tr><tr><td><code>max_threshold</code>
                </td><td>SizeTieredCompactionStrategy    </td><td>32
       </td><td>Maximum number of SSTables processed by one minor compaction.</td></tr><tr><td><code>bucket_low</code>
                   </td><td>SizeTieredCompactionStrategy    </td><td>0.5
      </td><td>The size tiered strategy considers sstables to be within the same bucket if their
size is within [average_size * <code>bucket_low</code>, average_size * <code>bucket_high</code>]
(i.e. the default groups sstables whose sizes diverge by at most 50%)</td></tr><tr><td><code>bucket_high</code>
                  </td><td>SizeTieredCompactionStrategy    </td><td>1.5
      </td><td>The size tiered strategy considers sstables to be within the same bucket if their
size is within [average_size * <code>bucket_low</code>, average_size * <code>bucket_high</code>]
(i.e. the default groups sstables whose sizes diverge by at most 50%).</td></tr><tr><td><code>sstable_size_in_mb</code>
           </td><td>LeveledCompactionStrategy       </td><td>5MB 
     </td><td>The target size (in MB) for sstables in the leveled strategy. Note
that while sstable sizes should stay less than or equal to <code>sstable_size_in_mb</code>,
it is possible to exceptionally have a larger sstable, as during compaction data for a given
partition key is never split into two sstables</td></tr></table><p>For
the <code>compression</code> property, the following default sub-options are available:</p><table><tr><th>option
             </th><th>default        </th><th>description </th></tr><tr><td><code>sstable_compression</code>
</td><td>SnappyCompressor </td><td>The compression algorithm to use.
Default compressors are: SnappyCompressor and DeflateCompressor. Use an empty string (<code>''</code>)
to disable compression. A custom compressor can be provided by specifying the full class
name as a <a href="#constants">string constant</a>.</td></tr><tr><td><code>chunk_length_kb</code>
    </td><td>64KB             </td><td>On disk SSTables are compressed
by block (to allow random reads). This defines the size (in KB) of said block. Bigger values
may improve the compression rate, but increase the minimum size of data to be read from disk
for a read </td></tr><tr><td><code>crc_check_chance</code>
   </td><td>1.0              </td><td>When compression is enabled,
each compressed block includes a checksum of that block for the purpose of detecting disk
bitrot and avoiding the propagation of corruption to other replicas. This option defines the
probability with which those checksums are checked during read. By default they are always
checked. Set to 0 to disable checksum checking and to 0.5 for instance to check them every
other read</td></tr></table><h4 id="Otherconsiderations">Other considerations:</h4><ul><li>When
<a href="#insertStmt">inserting</a> / <a href="#updateStmt">updating</a> a given row, not all
columns need to be defined (except for those part of the key), and missing columns occupy
no space on disk. Furthermore, adding new columns (see <a href="#alterStmt"><tt>ALTER
TABLE</tt></a>) is a constant time operation. There is thus no need to try to
anticipate future usage (or to cry when you haven&#8217;t) when creating a table.</li></ul><h3
id="alterTableStmt">ALTER TABLE</h3><p><i>Syntax:</i></p><pre
class="syntax"><pre>&lt;alter-table-stmt> ::= ALTER (TABLE | COLUMNFAMILY)
&lt;tablename> &lt;instruction>
 
 &lt;instruction> ::= ALTER &lt;identifier> TYPE &lt;type>
                 | ADD   &lt;identifier> &lt;type>
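As an illustration of the table options described in the committed reference text, the sketch below combines several of them in one CREATE TABLE statement. It is illustrative only: the table and column names are invented for this example, and the compaction map uses the renamed min_threshold/max_threshold sub-options this commit documents.

```
-- Hypothetical table; names are invented for illustration.
CREATE TABLE timeline (
    userid uuid,
    posted_at timestamp,
    body text,
    PRIMARY KEY (userid, posted_at)
) WITH CLUSTERING ORDER BY (posted_at DESC)
   AND comment = 'timeline of user posts'
   AND gc_grace_seconds = 864000
   AND compaction = { 'class' : 'SizeTieredCompactionStrategy',
                      'min_threshold' : 4, 'max_threshold' : 32 }
   AND compression = { 'sstable_compression' : 'SnappyCompressor',
                       'chunk_length_kb' : 64 };
```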


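The &lt;alter-table-stmt&gt; grammar fragment at the end of the added line corresponds to statements of the following shape; the table and column names are again invented for illustration.

```
-- Hypothetical examples of the ALTER and ADD instructions.
ALTER TABLE timeline ALTER body TYPE blob;   -- change a column's type
ALTER TABLE timeline ADD flagged boolean;    -- add a new column
```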
