cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From slebre...@apache.org
Subject [19/34] cassandra git commit: Reorganize document
Date Mon, 27 Jun 2016 18:34:14 GMT
http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/getting_started.rst
----------------------------------------------------------------------
diff --git a/doc/source/getting_started.rst b/doc/source/getting_started.rst
deleted file mode 100644
index b20376a..0000000
--- a/doc/source/getting_started.rst
+++ /dev/null
@@ -1,269 +0,0 @@
-.. Licensed to the Apache Software Foundation (ASF) under one
-.. or more contributor license agreements.  See the NOTICE file
-.. distributed with this work for additional information
-.. regarding copyright ownership.  The ASF licenses this file
-.. to you under the Apache License, Version 2.0 (the
-.. "License"); you may not use this file except in compliance
-.. with the License.  You may obtain a copy of the License at
-..
-..     http://www.apache.org/licenses/LICENSE-2.0
-..
-.. Unless required by applicable law or agreed to in writing, software
-.. distributed under the License is distributed on an "AS IS" BASIS,
-.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-.. See the License for the specific language governing permissions and
-.. limitations under the License.
-
-.. highlight:: none
-
-Getting Started
-===============
-
-Installing Cassandra
---------------------
-
-Prerequisites
-^^^^^^^^^^^^^
-
-- The latest version of Java 8, either the `Oracle Java Standard Edition 8
-  <http://www.oracle.com/technetwork/java/javase/downloads/index.html>`__ or `OpenJDK 8 <http://openjdk.java.net/>`__. To
-  verify that you have the correct version of java installed, type ``java -version``.
-
-- For using cqlsh, the latest version of `Python 2.7 <https://www.python.org/downloads/>`__. To verify that you have
-  the correct version of Python installed, type ``python --version``.
-
-Installation from binary tarball files
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-- Download the latest stable release from the `Apache Cassandra downloads website <http://cassandra.apache.org/download/>`__.
-
-- Untar the file somewhere, for example:
-
-::
-
-    tar -xvf apache-cassandra-3.6-bin.tar.gz cassandra
-
-The files will be extracted into ``apache-cassandra-3.6``, you need to substitute 3.6 with the release number that you
-have downloaded.
-
-- Optionally add ``apache-cassandra-3.6\bin`` to your path.
-- Start Cassandra in the foreground by invoking ``bin/cassandra -f`` from the command line. Press "Control-C" to stop
-  Cassandra. Start Cassandra in the background by invoking ``bin/cassandra`` from the command line. Invoke ``kill pid``
-  or ``pkill -f CassandraDaemon`` to stop Cassandra, where pid is the Cassandra process id, which you can find for
-  example by invoking ``pgrep -f CassandraDaemon``.
-- Verify that Cassandra is running by invoking ``bin/nodetool status`` from the command line.
-- Configuration files are located in the ``conf`` sub-directory.
-- Since Cassandra 2.1, log and data directories are located in the ``logs`` and ``data`` sub-directories respectively.
-  Older versions defaulted to ``/var/log/cassandra`` and ``/var/lib/cassandra``. Due to this, it is necessary to either
-  start Cassandra with root privileges or change ``conf/cassandra.yaml`` to use directories owned by the current user,
-  as explained below in the section on changing the location of directories.
-
-Installation from Debian packages
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-- Add the Apache repository of Cassandra to ``/etc/apt/sources.list.d/cassandra.sources.list``, for example for version
-  3.6:
-
-::
-
-    echo "deb http://www.apache.org/dist/cassandra/debian 36x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
-
-- Update the repositories:
-
-::
-
-    sudo apt-get update
-
-- If you encounter this error:
-
-::
-
-    GPG error: http://www.apache.org 36x InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 749D6EEC0353B12C
-
-Then add the public key 749D6EEC0353B12C as follows:
-
-::
-
-    gpg --keyserver pgp.mit.edu --recv-keys 749D6EEC0353B12C
-    gpg --export --armor 749D6EEC0353B12C | sudo apt-key add -
-
-and repeat ``sudo apt-get update``. The actual key may be different, you get it from the error message itself. For a
-full list of Apache contributors public keys, you can refer to `this link <https://www.apache.org/dist/cassandra/KEYS>`__.
-
-- Install Cassandra:
-
-::
-
-    sudo apt-get install cassandra
-
-- You can start Cassandra with ``sudo service cassandra start`` and stop it with ``sudo service cassandra stop``.
-  However, normally the service will start automatically. For this reason be sure to stop it if you need to make any
-  configuration changes.
-- Verify that Cassandra is running by invoking ``nodetool status`` from the command line.
-- The default location of configuration files is ``/etc/cassandra``.
-- The default location of log and data directories is ``/var/log/cassandra/`` and ``/var/lib/cassandra``.
-
-Configuring Cassandra
----------------------
-
-For running Cassandra on a single node, the steps above are enough, you don't really need to change any configuration.
-However, when you deploy a cluster of nodes, or use clients that are not on the same host, then there are some
-parameters that must be changed.
-
-The Cassandra configuration files can be found in the ``conf`` directory of tarballs. For packages, the configuration
-files will be located in ``/etc/cassandra``.
-
-Main runtime properties
-^^^^^^^^^^^^^^^^^^^^^^^
-
-Most of configuration in Cassandra is done via yaml properties that can be set in ``cassandra.yaml``. At a minimum you
-should consider setting the following properties:
-
-- ``cluster_name``: the name of your cluster.
-- ``seeds``: a comma separated list of the IP addresses of your cluster seeds.
-- ``storage_port``: you don't necessarily need to change this but make sure that there are no firewalls blocking this
-  port.
-- ``listen_address``: the IP address of your node, this is what allows other nodes to communicate with this node so it
-  is important that you change it. Alternatively, you can set ``listen_interface`` to tell Cassandra which interface to
-  use, and consecutively which address to use. Set only one, not both.
-- ``native_transport_port``: as for storage\_port, make sure this port is not blocked by firewalls as clients will
-  communicate with Cassandra on this port.
-
-Changing the location of directories
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The following yaml properties control the location of directories:
-
-- ``data_file_directories``: one or more directories where data files are located.
-- ``commitlog_directory``: the directory where commitlog files are located.
-- ``saved_caches_directory``: the directory where saved caches are located.
-- ``hints_directory``: the directory where hints are located.
-
-For performance reasons, if you have multiple disks, consider putting commitlog and data files on different disks.
-
-Environment variables
-^^^^^^^^^^^^^^^^^^^^^
-
-JVM-level settings such as heap size can be set in ``cassandra-env.sh``.  You can add any additional JVM command line
-argument to the ``JVM_OPTS`` environment variable; when Cassandra starts these arguments will be passed to the JVM.
-
-Logging
-^^^^^^^
-
-The logger in use is logback. You can change logging properties by editing ``logback.xml``. By default it will log at
-INFO level into a file called ``system.log`` and at debug level into a file called ``debug.log``. When running in the
-foreground, it will also log at INFO level to the console.
-
-
-cqlsh
------
-
-cqlsh is a command line shell for interacting with Cassandra through CQL (the Cassandra Query Language). It is shipped
-with every Cassandra package, and can be found in the bin/ directory alongside the cassandra executable. cqlsh utilizes
-the Python native protocol driver, and connects to the single node specified on the command line. For example::
-
-    $ bin/cqlsh localhost
-    Connected to Test Cluster at localhost:9042.
-    [cqlsh 5.0.1 | Cassandra 3.8 | CQL spec 3.4.2 | Native protocol v4]
-    Use HELP for help.
-    cqlsh> SELECT cluster_name, listen_address FROM system.local;
-
-     cluster_name | listen_address
-    --------------+----------------
-     Test Cluster |      127.0.0.1
-
-    (1 rows)
-    cqlsh>
-
-
-See the :ref:`cqlsh section <cqlsh>` for full documentation.
-
-Cassandra client drivers
-------------------------
-
-Here are known Cassandra client drivers organized by language. Before choosing a driver, you should verify the Cassandra
-version and functionality supported by a specific driver.
-
-Java
-^^^^
-
-- `Achilles <http://achilles.archinnov.info/>`__
-- `Astyanax <https://github.com/Netflix/astyanax/wiki/Getting-Started>`__
-- `Casser <https://github.com/noorq/casser>`__
-- `Datastax Java driver <https://github.com/datastax/java-driver>`__
-- `Kundera <https://github.com/impetus-opensource/Kundera>`__
-- `PlayORM <https://github.com/deanhiller/playorm>`__
-
-Python
-^^^^^^
-
-- `Datastax Python driver <https://github.com/datastax/python-driver>`__
-
-Ruby
-^^^^
-
-- `Datastax Ruby driver <https://github.com/datastax/ruby-driver>`__
-
-C# / .NET
-^^^^^^^^^
-
-- `Cassandra Sharp <https://github.com/pchalamet/cassandra-sharp>`__
-- `Datastax C# driver <https://github.com/datastax/csharp-driver>`__
-- `Fluent Cassandra <https://github.com/managedfusion/fluentcassandra>`__
-
-Nodejs
-^^^^^^
-
-- `Datastax Nodejs driver <https://github.com/datastax/nodejs-driver>`__
-- `Node-Cassandra-CQL <https://github.com/jorgebay/node-cassandra-cql>`__
-
-PHP
-^^^
-
-- `CQL \| PHP <http://code.google.com/a/apache-extras.org/p/cassandra-pdo>`__
-- `Datastax PHP driver <https://github.com/datastax/php-driver/>`__
-- `PHP-Cassandra <https://github.com/aparkhomenko/php-cassandra>`__
-- `PHP Library for Cassandra <http://evseevnn.github.io/php-cassandra-binary/>`__
-
-C++
-^^^
-
-- `Datastax C++ driver <https://github.com/datastax/cpp-driver>`__
-- `libQTCassandra <http://sourceforge.net/projects/libqtcassandra>`__
-
-Scala
-^^^^^
-
-- `Datastax Spark connector <https://github.com/datastax/spark-cassandra-connector>`__
-- `Phantom <https://github.com/newzly/phantom>`__
-- `Quill <https://github.com/getquill/quill>`__
-
-Clojure
-^^^^^^^
-
-- `Alia <https://github.com/mpenet/alia>`__
-- `Cassaforte <https://github.com/clojurewerkz/cassaforte>`__
-- `Hayt <https://github.com/mpenet/hayt>`__
-
-Erlang
-^^^^^^
-
-- `CQerl <https://github.com/matehat/cqerl>`__
-- `Erlcass <https://github.com/silviucpp/erlcass>`__
-
-Go
-^^
-
-- `CQLc <http://relops.com/cqlc/>`__
-- `Gocassa <https://github.com/hailocab/gocassa>`__
-- `GoCQL <https://github.com/gocql/gocql>`__
-
-Haskell
-^^^^^^^
-
-- `Cassy <https://github.com/ozataman/cassy>`__
-
-Rust
-^^^^
-
-- `Rust CQL <https://github.com/neich/rust-cql>`__

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/getting_started/configuring.rst
----------------------------------------------------------------------
diff --git a/doc/source/getting_started/configuring.rst b/doc/source/getting_started/configuring.rst
new file mode 100644
index 0000000..27fac78
--- /dev/null
+++ b/doc/source/getting_started/configuring.rst
@@ -0,0 +1,67 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+Configuring Cassandra
+---------------------
+
+For running Cassandra on a single node, the steps above are enough, you don't really need to change any configuration.
+However, when you deploy a cluster of nodes, or use clients that are not on the same host, then there are some
+parameters that must be changed.
+
+The Cassandra configuration files can be found in the ``conf`` directory of tarballs. For packages, the configuration
+files will be located in ``/etc/cassandra``.
+
+Main runtime properties
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Most of configuration in Cassandra is done via yaml properties that can be set in ``cassandra.yaml``. At a minimum you
+should consider setting the following properties:
+
+- ``cluster_name``: the name of your cluster.
+- ``seeds``: a comma separated list of the IP addresses of your cluster seeds.
+- ``storage_port``: you don't necessarily need to change this but make sure that there are no firewalls blocking this
+  port.
+- ``listen_address``: the IP address of your node, this is what allows other nodes to communicate with this node so it
+  is important that you change it. Alternatively, you can set ``listen_interface`` to tell Cassandra which interface to
+  use, and consecutively which address to use. Set only one, not both.
+- ``native_transport_port``: as for storage\_port, make sure this port is not blocked by firewalls as clients will
+  communicate with Cassandra on this port.
+
+Changing the location of directories
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The following yaml properties control the location of directories:
+
+- ``data_file_directories``: one or more directories where data files are located.
+- ``commitlog_directory``: the directory where commitlog files are located.
+- ``saved_caches_directory``: the directory where saved caches are located.
+- ``hints_directory``: the directory where hints are located.
+
+For performance reasons, if you have multiple disks, consider putting commitlog and data files on different disks.
+
+Environment variables
+^^^^^^^^^^^^^^^^^^^^^
+
+JVM-level settings such as heap size can be set in ``cassandra-env.sh``.  You can add any additional JVM command line
+argument to the ``JVM_OPTS`` environment variable; when Cassandra starts these arguments will be passed to the JVM.
+
+Logging
+^^^^^^^
+
+The logger in use is logback. You can change logging properties by editing ``logback.xml``. By default it will log at
+INFO level into a file called ``system.log`` and at debug level into a file called ``debug.log``. When running in the
+foreground, it will also log at INFO level to the console.
+

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/getting_started/drivers.rst
----------------------------------------------------------------------
diff --git a/doc/source/getting_started/drivers.rst b/doc/source/getting_started/drivers.rst
new file mode 100644
index 0000000..d637a70
--- /dev/null
+++ b/doc/source/getting_started/drivers.rst
@@ -0,0 +1,105 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+Client drivers
+--------------
+
+Here are known Cassandra client drivers organized by language. Before choosing a driver, you should verify the Cassandra
+version and functionality supported by a specific driver.
+
+Java
+^^^^
+
+- `Achilles <http://achilles.archinnov.info/>`__
+- `Astyanax <https://github.com/Netflix/astyanax/wiki/Getting-Started>`__
+- `Casser <https://github.com/noorq/casser>`__
+- `Datastax Java driver <https://github.com/datastax/java-driver>`__
+- `Kundera <https://github.com/impetus-opensource/Kundera>`__
+- `PlayORM <https://github.com/deanhiller/playorm>`__
+
+Python
+^^^^^^
+
+- `Datastax Python driver <https://github.com/datastax/python-driver>`__
+
+Ruby
+^^^^
+
+- `Datastax Ruby driver <https://github.com/datastax/ruby-driver>`__
+
+C# / .NET
+^^^^^^^^^
+
+- `Cassandra Sharp <https://github.com/pchalamet/cassandra-sharp>`__
+- `Datastax C# driver <https://github.com/datastax/csharp-driver>`__
+- `Fluent Cassandra <https://github.com/managedfusion/fluentcassandra>`__
+
+Nodejs
+^^^^^^
+
+- `Datastax Nodejs driver <https://github.com/datastax/nodejs-driver>`__
+- `Node-Cassandra-CQL <https://github.com/jorgebay/node-cassandra-cql>`__
+
+PHP
+^^^
+
+- `CQL \| PHP <http://code.google.com/a/apache-extras.org/p/cassandra-pdo>`__
+- `Datastax PHP driver <https://github.com/datastax/php-driver/>`__
+- `PHP-Cassandra <https://github.com/aparkhomenko/php-cassandra>`__
+- `PHP Library for Cassandra <http://evseevnn.github.io/php-cassandra-binary/>`__
+
+C++
+^^^
+
+- `Datastax C++ driver <https://github.com/datastax/cpp-driver>`__
+- `libQTCassandra <http://sourceforge.net/projects/libqtcassandra>`__
+
+Scala
+^^^^^
+
+- `Datastax Spark connector <https://github.com/datastax/spark-cassandra-connector>`__
+- `Phantom <https://github.com/newzly/phantom>`__
+- `Quill <https://github.com/getquill/quill>`__
+
+Clojure
+^^^^^^^
+
+- `Alia <https://github.com/mpenet/alia>`__
+- `Cassaforte <https://github.com/clojurewerkz/cassaforte>`__
+- `Hayt <https://github.com/mpenet/hayt>`__
+
+Erlang
+^^^^^^
+
+- `CQerl <https://github.com/matehat/cqerl>`__
+- `Erlcass <https://github.com/silviucpp/erlcass>`__
+
+Go
+^^
+
+- `CQLc <http://relops.com/cqlc/>`__
+- `Gocassa <https://github.com/hailocab/gocassa>`__
+- `GoCQL <https://github.com/gocql/gocql>`__
+
+Haskell
+^^^^^^^
+
+- `Cassy <https://github.com/ozataman/cassy>`__
+
+Rust
+^^^^
+
+- `Rust CQL <https://github.com/neich/rust-cql>`__

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/getting_started/index.rst
----------------------------------------------------------------------
diff --git a/doc/source/getting_started/index.rst b/doc/source/getting_started/index.rst
new file mode 100644
index 0000000..4ca9c4d
--- /dev/null
+++ b/doc/source/getting_started/index.rst
@@ -0,0 +1,33 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Getting Started
+===============
+
+This section covers how to get started using Apache Cassandra and should be the first thing to read if you are new to
+Cassandra.
+
+.. toctree::
+   :maxdepth: 2
+
+   installing
+   configuring
+   querying
+   drivers
+
+

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/getting_started/installing.rst
----------------------------------------------------------------------
diff --git a/doc/source/getting_started/installing.rst b/doc/source/getting_started/installing.rst
new file mode 100644
index 0000000..ad0a1e8
--- /dev/null
+++ b/doc/source/getting_started/installing.rst
@@ -0,0 +1,99 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+Installing Cassandra
+--------------------
+
+Prerequisites
+^^^^^^^^^^^^^
+
+- The latest version of Java 8, either the `Oracle Java Standard Edition 8
+  <http://www.oracle.com/technetwork/java/javase/downloads/index.html>`__ or `OpenJDK 8 <http://openjdk.java.net/>`__. To
+  verify that you have the correct version of java installed, type ``java -version``.
+
+- For using cqlsh, the latest version of `Python 2.7 <https://www.python.org/downloads/>`__. To verify that you have
+  the correct version of Python installed, type ``python --version``.
+
+Installation from binary tarball files
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- Download the latest stable release from the `Apache Cassandra downloads website <http://cassandra.apache.org/download/>`__.
+
+- Untar the file somewhere, for example:
+
+::
+
+    tar -xvf apache-cassandra-3.6-bin.tar.gz cassandra
+
+The files will be extracted into ``apache-cassandra-3.6``, you need to substitute 3.6 with the release number that you
+have downloaded.
+
+- Optionally add ``apache-cassandra-3.6\bin`` to your path.
+- Start Cassandra in the foreground by invoking ``bin/cassandra -f`` from the command line. Press "Control-C" to stop
+  Cassandra. Start Cassandra in the background by invoking ``bin/cassandra`` from the command line. Invoke ``kill pid``
+  or ``pkill -f CassandraDaemon`` to stop Cassandra, where pid is the Cassandra process id, which you can find for
+  example by invoking ``pgrep -f CassandraDaemon``.
+- Verify that Cassandra is running by invoking ``bin/nodetool status`` from the command line.
+- Configuration files are located in the ``conf`` sub-directory.
+- Since Cassandra 2.1, log and data directories are located in the ``logs`` and ``data`` sub-directories respectively.
+  Older versions defaulted to ``/var/log/cassandra`` and ``/var/lib/cassandra``. Due to this, it is necessary to either
+  start Cassandra with root privileges or change ``conf/cassandra.yaml`` to use directories owned by the current user,
+  as explained below in the section on changing the location of directories.
+
+Installation from Debian packages
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- Add the Apache repository of Cassandra to ``/etc/apt/sources.list.d/cassandra.sources.list``, for example for version
+  3.6:
+
+::
+
+    echo "deb http://www.apache.org/dist/cassandra/debian 36x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
+
+- Update the repositories:
+
+::
+
+    sudo apt-get update
+
+- If you encounter this error:
+
+::
+
+    GPG error: http://www.apache.org 36x InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 749D6EEC0353B12C
+
+Then add the public key 749D6EEC0353B12C as follows:
+
+::
+
+    gpg --keyserver pgp.mit.edu --recv-keys 749D6EEC0353B12C
+    gpg --export --armor 749D6EEC0353B12C | sudo apt-key add -
+
+and repeat ``sudo apt-get update``. The actual key may be different, you get it from the error message itself. For a
+full list of Apache contributors public keys, you can refer to `this link <https://www.apache.org/dist/cassandra/KEYS>`__.
+
+- Install Cassandra:
+
+::
+
+    sudo apt-get install cassandra
+
+- You can start Cassandra with ``sudo service cassandra start`` and stop it with ``sudo service cassandra stop``.
+  However, normally the service will start automatically. For this reason be sure to stop it if you need to make any
+  configuration changes.
+- Verify that Cassandra is running by invoking ``nodetool status`` from the command line.
+- The default location of configuration files is ``/etc/cassandra``.
+- The default location of log and data directories is ``/var/log/cassandra/`` and ``/var/lib/cassandra``.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/getting_started/querying.rst
----------------------------------------------------------------------
diff --git a/doc/source/getting_started/querying.rst b/doc/source/getting_started/querying.rst
new file mode 100644
index 0000000..f73c106
--- /dev/null
+++ b/doc/source/getting_started/querying.rst
@@ -0,0 +1,38 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+Inserting and querying
+----------------------
+
+cqlsh is a command line shell for interacting with Cassandra through CQL (the Cassandra Query Language). It is shipped
+with every Cassandra package, and can be found in the bin/ directory alongside the cassandra executable. cqlsh utilizes
+the Python native protocol driver, and connects to the single node specified on the command line. For example::
+
+    $ bin/cqlsh localhost
+    Connected to Test Cluster at localhost:9042.
+    [cqlsh 5.0.1 | Cassandra 3.8 | CQL spec 3.4.2 | Native protocol v4]
+    Use HELP for help.
+    cqlsh> SELECT cluster_name, listen_address FROM system.local;
+
+     cluster_name | listen_address
+    --------------+----------------
+     Test Cluster |      127.0.0.1
+
+    (1 rows)
+    cqlsh>
+
+
+See the :ref:`cqlsh section <cqlsh>` for full documentation.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/index.rst
----------------------------------------------------------------------
diff --git a/doc/source/index.rst b/doc/source/index.rst
index 16f1323..ec27f5a 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -26,12 +26,15 @@ Contents:
 .. toctree::
    :maxdepth: 2
 
-   getting_started
-   architecture
-   cql
-   cqlsh
-   operations
-   cassandra_config_file
-   troubleshooting
-   faq
+   getting_started/index
+   architecture/index
+   data_modeling/index
+   cql/index
+   configuration/index
+   operating/index
+   tools/index
+   troubleshooting/index
+   faq/index
+
+   bugs
    contactus

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/backups.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/backups.rst b/doc/source/operating/backups.rst
new file mode 100644
index 0000000..c071e83
--- /dev/null
+++ b/doc/source/operating/backups.rst
@@ -0,0 +1,22 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Backups
+=======
+
+.. todo:: TODO

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/bloom_filters.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/bloom_filters.rst b/doc/source/operating/bloom_filters.rst
new file mode 100644
index 0000000..0b37c18
--- /dev/null
+++ b/doc/source/operating/bloom_filters.rst
@@ -0,0 +1,65 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Bloom Filters
+-------------
+
+In the read path, Cassandra merges data on disk (in SSTables) with data in RAM (in memtables). To avoid checking every
+SSTable data file for the partition being requested, Cassandra employs a data structure known as a bloom filter.
+
+Bloom filters are a probabilistic data structure that allows Cassandra to determine one of two possible states: - The
+data definitely does not exist in the given file, or - The data probably exists in the given file.
+
+While bloom filters can not guarantee that the data exists in a given SSTable, bloom filters can be made more accurate
+by allowing them to consume more RAM. Operators have the opportunity to tune this behavior per table by adjusting the
+the ``bloom_filter_fp_chance`` to a float between 0 and 1.
+
+The default value for ``bloom_filter_fp_chance`` is 0.1 for tables using LeveledCompactionStrategy and 0.01 for all
+other cases.
+
+Bloom filters are stored in RAM, but are stored offheap, so operators should not consider bloom filters when selecting
+the maximum heap size.  As accuracy improves (as the ``bloom_filter_fp_chance`` gets closer to 0), memory usage
+increases non-linearly - the bloom filter for ``bloom_filter_fp_chance = 0.01`` will require about three times as much
+memory as the same table with ``bloom_filter_fp_chance = 0.1``.
+
+Typical values for ``bloom_filter_fp_chance`` are usually between 0.01 (1%) to 0.1 (10%) false-positive chance, where
+Cassandra may scan an SSTable for a row, only to find that it does not exist on the disk. The parameter should be tuned
+by use case:
+
+- Users with more RAM and slower disks may benefit from setting the ``bloom_filter_fp_chance`` to a numerically lower
+  number (such as 0.01) to avoid excess IO operations
+- Users with less RAM, more dense nodes, or very fast disks may tolerate a higher ``bloom_filter_fp_chance`` in order to
+  save RAM at the expense of excess IO operations
+- In workloads that rarely read, or that only perform reads by scanning the entire data set (such as analytics
+  workloads), setting the ``bloom_filter_fp_chance`` to a much higher number is acceptable.
+
+Changing
+^^^^^^^^
+
+The bloom filter false positive chance is visible in the ``DESCRIBE TABLE`` output as the field
+``bloom_filter_fp_chance``. Operators can change the value with an ``ALTER TABLE`` statement:
+::
+
+    ALTER TABLE keyspace.table WITH bloom_filter_fp_chance=0.01
+
+Operators should be aware, however, that this change is not immediate: the bloom filter is calculated when the file is
+written, and persisted on disk as the Filter component of the SSTable. Upon issuing an ``ALTER TABLE`` statement, new
+files on disk will be written with the new ``bloom_filter_fp_chance``, but existing sstables will not be modified until
+they are compacted - if an operator needs a change to ``bloom_filter_fp_chance`` to take effect, they can trigger an
+SSTable rewrite using ``nodetool scrub`` or ``nodetool upgradesstables -a``, both of which will rebuild the sstables on
+disk, regenerating the bloom filters in the progress.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/cdc.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/cdc.rst b/doc/source/operating/cdc.rst
new file mode 100644
index 0000000..192f62a
--- /dev/null
+++ b/doc/source/operating/cdc.rst
@@ -0,0 +1,89 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Change Data Capture
+-------------------
+
+Overview
+^^^^^^^^
+
+Change data capture (CDC) provides a mechanism to flag specific tables for archival as well as rejecting writes to those
+tables once a configurable size-on-disk for the combined flushed and unflushed CDC-log is reached. An operator can
+enable CDC on a table by setting the table property ``cdc=true`` (either when :ref:`creating the table
+<create-table-statement>` or :ref:`altering it <alter-table-statement>`), after which any CommitLogSegments containing
+data for a CDC-enabled table are moved to the directory specified in ``cassandra.yaml`` on segment discard. A threshold
+of total disk space allowed is specified in the yaml at which time newly allocated CommitLogSegments will not allow CDC
+data until a consumer parses and removes data from the destination archival directory.
+
+Configuration
+^^^^^^^^^^^^^
+
+Enabling or disable CDC on a table
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+CDC is enable or disable through the `cdc` table property, for instance::
+
+    CREATE TABLE foo (a int, b text, PRIMARY KEY(a)) WITH cdc=true;
+
+    ALTER TABLE foo WITH cdc=true;
+
+    ALTER TABLE foo WITH cdc=false;
+
+cassandra.yaml parameters
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following `cassandra.yaml` are available for CDC:
+
+``cdc_enabled`` (default: false)
+   Enable or disable CDC operations node-wide.
+``cdc_raw_directory`` (default: ``$CASSANDRA_HOME/data/cdc_raw``)
+   Destination for CommitLogSegments to be moved after all corresponding memtables are flushed.
+``cdc_free_space_in_mb``: (default: min of 4096 and 1/8th volume space)
+   Calculated as sum of all active CommitLogSegments that permit CDC + all flushed CDC segments in
+   ``cdc_raw_directory``.
+``cdc_free_space_check_interval_ms`` (default: 250)
+   When at capacity, we limit the frequency with which we re-calculate the space taken up by ``cdc_raw_directory`` to
+   prevent burning CPU cycles unnecessarily. Default is to check 4 times per second.
+
+.. _reading-commitlogsegments:
+
+Reading CommitLogSegments
+^^^^^^^^^^^^^^^^^^^^^^^^^
+This implementation included a refactor of CommitLogReplayer into `CommitLogReader.java
+<https://github.com/apache/cassandra/blob/e31e216234c6b57a531cae607e0355666007deb2/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java>`__.
+Usage is `fairly straightforward
+<https://github.com/apache/cassandra/blob/e31e216234c6b57a531cae607e0355666007deb2/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L132-L140>`__
+with a `variety of signatures
+<https://github.com/apache/cassandra/blob/e31e216234c6b57a531cae607e0355666007deb2/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java#L71-L103>`__
+available for use. In order to handle mutations read from disk, implement `CommitLogReadHandler
+<https://github.com/apache/cassandra/blob/e31e216234c6b57a531cae607e0355666007deb2/src/java/org/apache/cassandra/db/commitlog/CommitLogReadHandler.java>`__.
+
+Warnings
+^^^^^^^^
+
+**Do not enable CDC without some kind of consumption process in-place.**
+
+The initial implementation of Change Data Capture does not include a parser (see :ref:`reading-commitlogsegments` above)
+so, if CDC is enabled on a node and then on a table, the ``cdc_free_space_in_mb`` will fill up and then writes to
+CDC-enabled tables will be rejected unless some consumption process is in place.
+
+Further Reading
+^^^^^^^^^^^^^^^
+
+- `Design doc <https://docs.google.com/document/d/1ZxCWYkeZTquxsvf5hdPc0fiUnUHna8POvgt6TIzML4Y/edit>`__
+- `JIRA ticket <https://issues.apache.org/jira/browse/CASSANDRA-8844>`__

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/compaction.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/compaction.rst b/doc/source/operating/compaction.rst
new file mode 100644
index 0000000..1ce804f
--- /dev/null
+++ b/doc/source/operating/compaction.rst
@@ -0,0 +1,426 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+.. _compaction:
+
+Compaction
+----------
+
+Types of compaction
+^^^^^^^^^^^^^^^^^^^
+
+The concept of compaction is used for different kinds of operations in Cassandra, the common thing about these
+operations is that it takes one or more sstables and output new sstables. The types of compactions are;
+
+Minor compaction
+    triggered automatically in Cassandra.
+Major compaction
+    a user executes a compaction over all sstables on the node.
+User defined compaction
+    a user triggers a compaction on a given set of sstables.
+Scrub
+    try to fix any broken sstables. This can actually remove valid data if that data is corrupted, if that happens you
+    will need to run a full repair on the node.
+Upgradesstables
+    upgrade sstables to the latest version. Run this after upgrading to a new major version.
+Cleanup
+    remove any ranges this node does not own anymore, typically triggered on neighbouring nodes after a node has been
+    bootstrapped since that node will take ownership of some ranges from those nodes.
+Secondary index rebuild
+    rebuild the secondary indexes on the node.
+Anticompaction
+    after repair the ranges that were actually repaired are split out of the sstables that existed when repair started.
+
+When is a minor compaction triggered?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+#  When an sstable is added to the node through flushing/streaming etc.
+#  When autocompaction is enabled after being disabled (``nodetool enableautocompaction``)
+#  When compaction adds new sstables.
+#  A check for new minor compactions every 5 minutes.
+
+Merging sstables
+^^^^^^^^^^^^^^^^
+
+Compaction is about merging sstables, since partitions in sstables are sorted based on the hash of the partition key it
+is possible to efficiently merge separate sstables. Content of each partition is also sorted so each partition can be
+merged efficiently.
+
+Tombstones and Garbage Collection (GC) Grace
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Why Tombstones
+~~~~~~~~~~~~~~
+
+When a delete request is received by Cassandra it does not actually remove the data from the underlying store. Instead
+it writes a special piece of data known as a tombstone. The Tombstone represents the delete and causes all values which
+occurred before the tombstone to not appear in queries to the database. This approach is used instead of removing values
+because of the distributed nature of Cassandra.
+
+Deletes without tombstones
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Imagine a three node cluster which has the value [A] replicated to every node.::
+
+    [A], [A], [A]
+
+If one of the nodes fails and and our delete operation only removes existing values we can end up with a cluster that
+looks like::
+
+    [], [], [A]
+
+Then a repair operation would replace the value of [A] back onto the two
+nodes which are missing the value.::
+
+    [A], [A], [A]
+
+This would cause our data to be resurrected even though it had been
+deleted.
+
+Deletes with Tombstones
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Starting again with a three node cluster which has the value [A] replicated to every node.::
+
+    [A], [A], [A]
+
+If instead of removing data we add a tombstone record, our single node failure situation will look like this.::
+
+    [A, Tombstone[A]], [A, Tombstone[A]], [A]
+
+Now when we issue a repair the Tombstone will be copied to the replica, rather than the deleted data being
+resurrected.::
+
+    [A, Tombstone[A]], [A, Tombstone[A]], [A, Tombstone[A]]
+
+Our repair operation will correctly put the state of the system to what we expect with the record [A] marked as deleted
+on all nodes. This does mean we will end up accruing Tombstones which will permanently accumulate disk space. To avoid
+keeping tombstones forever we have a parameter known as ``gc_grace_seconds`` for every table in Cassandra.
+
+The gc_grace_seconds parameter and Tombstone Removal
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The table level ``gc_grace_seconds`` parameter controls how long Cassandra will retain tombstones through compaction
+events before finally removing them. This duration should directly reflect the amount of time a user expects to allow
+before recovering a failed node. After ``gc_grace_seconds`` has expired the tombstone may be removed (meaning there will
+no longer be any record that a certain piece of data was deleted), but as a tombstone can live in one sstable and the
+data it covers in another, a compaction must also include both sstable for a tombstone to be removed. More precisely, to
+be able to drop an actual tombstone the following needs to be true;
+
+- The tombstone must be older than ``gc_grace_seconds``
+- If partition X contains the tombstone, the sstable containing the partition plus all sstables containing data older
+  than the tombstone containing X must be included in the same compaction. We don't need to care if the partition is in
+  an sstable if we can guarantee that all data in that sstable is newer than the tombstone. If the tombstone is older
+  than the data it cannot shadow that data.
+- If the option ``only_purge_repaired_tombstones`` is enabled, tombstones are only removed if the data has also been
+  repaired.
+
+If a node remains down or disconnected for longer than ``gc_grace_seconds`` it's deleted data will be repaired back to
+the other nodes and re-appear in the cluster. This is basically the same as in the "Deletes without Tombstones" section.
+Note that tombstones will not be removed until a compaction event even if ``gc_grace_seconds`` has elapsed.
+
+The default value for ``gc_grace_seconds`` is 864000 which is equivalent to 10 days. This can be set when creating or
+altering a table using ``WITH gc_grace_seconds``.
+
+TTL
+^^^
+
+Data in Cassandra can have an additional property called time to live - this is used to automatically drop data that has
+expired once the time is reached. Once the TTL has expired the data is converted to a tombstone which stays around for
+at least ``gc_grace_seconds``. Note that if you mix data with TTL and data without TTL (or just different length of the
+TTL) Cassandra will have a hard time dropping the tombstones created since the partition might span many sstables and
+not all are compacted at once.
+
+Fully expired sstables
+^^^^^^^^^^^^^^^^^^^^^^
+
+If an sstable contains only tombstones and it is guaranteed that that sstable is not shadowing data in any other sstable
+compaction can drop that sstable. If you see sstables with only tombstones (note that TTL:ed data is considered
+tombstones once the time to live has expired) but it is not being dropped by compaction, it is likely that other
+sstables contain older data. There is a tool called ``sstableexpiredblockers`` that will list which sstables are
+droppable and which are blocking them from being dropped. This is especially useful for time series compaction with
+``TimeWindowCompactionStrategy`` (and the deprecated ``DateTieredCompactionStrategy``).
+
+Repaired/unrepaired data
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+With incremental repairs Cassandra must keep track of what data is repaired and what data is unrepaired. With
+anticompaction repaired data is split out into repaired and unrepaired sstables. To avoid mixing up the data again
+separate compaction strategy instances are run on the two sets of data, each instance only knowing about either the
+repaired or the unrepaired sstables. This means that if you only run incremental repair once and then never again, you
+might have very old data in the repaired sstables that block compaction from dropping tombstones in the unrepaired
+(probably newer) sstables.
+
+Data directories
+^^^^^^^^^^^^^^^^
+
+Since tombstones and data can live in different sstables it is important to realize that losing an sstable might lead to
+data becoming live again - the most common way of losing sstables is to have a hard drive break down. To avoid making
+data live tombstones and actual data are always in the same data directory. This way, if a disk is lost, all versions of
+a partition are lost and no data can get undeleted. To achieve this a compaction strategy instance per data directory is
+run in addition to the compaction strategy instances containing repaired/unrepaired data, this means that if you have 4
+data directories there will be 8 compaction strategy instances running. This has a few more benefits than just avoiding
+data getting undeleted:
+
+- It is possible to run more compactions in parallel - leveled compaction will have several totally separate levelings
+  and each one can run compactions independently from the others.
+- Users can backup and restore a single data directory.
+- Note though that currently all data directories are considered equal, so if you have a tiny disk and a big disk
+  backing two data directories, the big one will be limited the by the small one. One work around to this is to create
+  more data directories backed by the big disk.
+
+Single sstable tombstone compaction
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+When an sstable is written a histogram with the tombstone expiry times is created and this is used to try to find
+sstables with very many tombstones and run single sstable compaction on that sstable in hope of being able to drop
+tombstones in that sstable. Before starting this it is also checked how likely it is that any tombstones will actually
+will be able to be dropped how much this sstable overlaps with other sstables. To avoid most of these checks the
+compaction option ``unchecked_tombstone_compaction`` can be enabled.
+
+.. _compaction-options:
+
+Common options
+^^^^^^^^^^^^^^
+
+There is a number of common options for all the compaction strategies;
+
+``enabled`` (default: true)
+    Whether minor compactions should run. Note that you can have 'enabled': true as a compaction option and then do
+    'nodetool enableautocompaction' to start running compactions.
+``tombstone_threshold`` (default: 0.2)
+    How much of the sstable should be tombstones for us to consider doing a single sstable compaction of that sstable.
+``tombstone_compaction_interval`` (default: 86400s (1 day))
+    Since it might not be possible to drop any tombstones when doing a single sstable compaction we need to make sure
+    that one sstable is not constantly getting recompacted - this option states how often we should try for a given
+    sstable. 
+``log_all`` (default: false)
+    New detailed compaction logging, see :ref:`below <detailed-compaction-logging>`.
+``unchecked_tombstone_compaction`` (default: false)
+    The single sstable compaction has quite strict checks for whether it should be started, this option disables those
+    checks and for some usecases this might be needed.  Note that this does not change anything for the actual
+    compaction, tombstones are only dropped if it is safe to do so - it might just rewrite an sstable without being able
+    to drop any tombstones.
+``only_purge_repaired_tombstone`` (default: false)
+    Option to enable the extra safety of making sure that tombstones are only dropped if the data has been repaired.
+``min_threshold`` (default: 4)
+    Lower limit of number of sstables before a compaction is triggered. Not used for ``LeveledCompactionStrategy``.
+``max_threshold`` (default: 32)
+    Upper limit of number of sstables before a compaction is triggered. Not used for ``LeveledCompactionStrategy``.
+
+Compaction nodetool commands
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The :ref:`nodetool <nodetool>` utility provides a number of commands related to compaction:
+
+``enableautocompaction``
+    Enable compaction.
+``disableautocompaction``
+    Disable compaction.
+``setcompactionthroughput``
+    How fast compaction should run at most - defaults to 16MB/s, but note that it is likely not possible to reach this
+    throughput.
+``compactionstats``
+    Statistics about current and pending compactions.
+``compactionhistory``
+    List details about the last compactions.
+``setcompactionthreshold``
+    Set the min/max sstable count for when to trigger compaction, defaults to 4/32.
+
+Switching the compaction strategy and options using JMX
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+It is possible to switch compaction strategies and its options on just a single node using JMX, this is a great way to
+experiment with settings without affecting the whole cluster. The mbean is::
+
+    org.apache.cassandra.db:type=ColumnFamilies,keyspace=<keyspace_name>,columnfamily=<table_name>
+
+and the attribute to change is ``CompactionParameters`` or ``CompactionParametersJson`` if you use jconsole or jmc. The
+syntax for the json version is the same as you would use in an :ref:`ALTER TABLE <alter-table-statement>` statement -
+for example::
+
+    { 'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 123 }
+
+The setting is kept until someone executes an :ref:`ALTER TABLE <alter-table-statement>` that touches the compaction
+settings or restarts the node.
+
+.. _detailed-compaction-logging:
+
+More detailed compaction logging
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Enable with the compaction option ``log_all`` and a more detailed compaction log file will be produced in your log
+directory.
+
+Size Tiered Compaction Strategy
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The basic idea of ``SizeTieredCompactionStrategy`` (STCS) is to merge sstables of approximately the same size. All
+sstables are put in different buckets depending on their size. An sstable is added to the bucket if size of the sstable
+is within ``bucket_low`` and ``bucket_high`` of the current average size of the sstables already in the bucket. This
+will create several buckets and the most interesting of those buckets will be compacted. The most interesting one is
+decided by figuring out which bucket's sstables takes the most reads.
+
+Major compaction
+~~~~~~~~~~~~~~~~
+
+When running a major compaction with STCS you will end up with two sstables per data directory (one for repaired data
+and one for unrepaired data). There is also an option (-s) to do a major compaction that splits the output into several
+sstables. The sizes of the sstables are approximately 50%, 25%, 12.5%... of the total size.
+
+.. _stcs-options:
+
+STCS options
+~~~~~~~~~~~~
+
+``min_sstable_size`` (default: 50MB)
+    Sstables smaller than this are put in the same bucket.
+``bucket_low`` (default: 0.5)
+    How much smaller than the average size of a bucket a sstable should be before not being included in the bucket. That
+    is, if ``bucket_low * avg_bucket_size < sstable_size`` (and the ``bucket_high`` condition holds, see below), then
+    the sstable is added to the bucket.
+``bucket_high`` (default: 1.5)
+    How much bigger than the average size of a bucket a sstable should be before not being included in the bucket. That
+    is, if ``sstable_size < bucket_high * avg_bucket_size`` (and the ``bucket_low`` condition holds, see above), then
+    the sstable is added to the bucket.
+
+Defragmentation
+~~~~~~~~~~~~~~~
+
+Defragmentation is done when many sstables are touched during a read.  The result of the read is put in to the memtable
+so that the next read will not have to touch as many sstables. This can cause writes on a read-only-cluster.
+
+Leveled Compaction Strategy
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The idea of ``LeveledCompactionStrategy`` (LCS) is that all sstables are put into different levels where we guarantee
+that no overlapping sstables are in the same level. By overlapping we mean that the first/last token of a single sstable
+are never overlapping with other sstables. This means that for a SELECT we will only have to look for the partition key
+in a single sstable per level. Each level is 10x the size of the previous one and each sstable is 160MB by default. L0
+is where sstables are streamed/flushed - no overlap guarantees are given here.
+
+When picking compaction candidates we have to make sure that the compaction does not create overlap in the target level.
+This is done by always including all overlapping sstables in the next level. For example if we select an sstable in L3,
+we need to guarantee that we pick all overlapping sstables in L4 and make sure that no currently ongoing compactions
+will create overlap if we start that compaction. We can start many parallel compactions in a level if we guarantee that
+we wont create overlap. For L0 -> L1 compactions we almost always need to include all L1 sstables since most L0 sstables
+cover the full range. We also can't compact all L0 sstables with all L1 sstables in a single compaction since that can
+use too much memory.
+
+When deciding which level to compact LCS checks the higher levels first (with LCS, a "higher" level is one with a higher
+number, L0 being the lowest one) and if the level is behind a compaction will be started in that level.
+
+Major compaction
+~~~~~~~~~~~~~~~~
+
+It is possible to do a major compaction with LCS - it will currently start by filling out L1 and then once L1 is full,
+it continues with L2 etc. This is sub optimal and will change to create all the sstables in a high level instead,
+CASSANDRA-11817.
+
+Bootstrapping
+~~~~~~~~~~~~~
+
+During bootstrap sstables are streamed from other nodes. The level of the remote sstable is kept to avoid many
+compactions after the bootstrap is done. During bootstrap the new node also takes writes while it is streaming the data
+from a remote node - these writes are flushed to L0 like all other writes and to avoid those sstables blocking the
+remote sstables from going to the correct level, we only do STCS in L0 until the bootstrap is done.
+
+STCS in L0
+~~~~~~~~~~
+
+If LCS gets very many L0 sstables reads are going to hit all (or most) of the L0 sstables since they are likely to be
+overlapping. To more quickly remedy this LCS does STCS compactions in L0 if there are more than 32 sstables there. This
+should improve read performance more quickly compared to letting LCS do its L0 -> L1 compactions. If you keep getting
+too many sstables in L0 it is likely that LCS is not the best fit for your workload and STCS could work out better.
+
+Starved sstables
+~~~~~~~~~~~~~~~~
+
+If a node ends up with a leveling where there are a few very high level sstables that are not getting compacted they
+might make it impossible for lower levels to drop tombstones etc. For example, if there are sstables in L6 but there is
+only enough data to actually get a L4 on the node the left over sstables in L6 will get starved and not compacted.  This
+can happen if a user changes sstable\_size\_in\_mb from 5MB to 160MB for example. To avoid this LCS tries to include
+those starved high level sstables in other compactions if there has been 25 compaction rounds where the highest level
+has not been involved.
+
+.. _lcs-options:
+
+LCS options
+~~~~~~~~~~~
+
+``sstable_size_in_mb`` (default: 160MB)
+    The target compressed (if using compression) sstable size - the sstables can end up being larger if there are very
+    large partitions on the node.
+
+LCS also support the ``cassandra.disable_stcs_in_l0`` startup option (``-Dcassandra.disable_stcs_in_l0=true``) to avoid
+doing STCS in L0.
+
+.. _twcs:
+
+Time Window CompactionStrategy
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``TimeWindowCompactionStrategy`` (TWCS) is designed specifically for workloads where it's beneficial to have data on
+disk grouped by the timestamp of the data, a common goal when the workload is time-series in nature or when all data is
+written with a TTL. In an expiring/TTL workload, the contents of an entire SSTable likely expire at approximately the
+same time, allowing them to be dropped completely, and space reclaimed much more reliably than when using
+``SizeTieredCompactionStrategy`` or ``LeveledCompactionStrategy``. The basic concept is that
+``TimeWindowCompactionStrategy`` will create 1 sstable per file for a given window, where a window is simply calculated
+as the combination of two primary options:
+
+``compaction_window_unit`` (default: DAYS)
+    A Java TimeUnit (MINUTES, HOURS, or DAYS).
+``compaction_window_size`` (default: 1)
+    The number of units that make up a window.
+
+Taken together, the operator can specify windows of virtually any size, and `TimeWindowCompactionStrategy` will work to
+create a single sstable for writes within that window. For efficiency during writing, the newest window will be
+compacted using `SizeTieredCompactionStrategy`.
+
+Ideally, operators should select a ``compaction_window_unit`` and ``compaction_window_size`` pair that produces
+approximately 20-30 windows - if writing with a 90 day TTL, for example, a 3 Day window would be a reasonable choice
+(``'compaction_window_unit':'DAYS','compaction_window_size':3``).
+
+TimeWindowCompactionStrategy Operational Concerns
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The primary motivation for TWCS is to separate data on disk by timestamp and to allow fully expired SSTables to drop
+more efficiently. One potential way this optimal behavior can be subverted is if data is written to SSTables out of
+order, with new data and old data in the same SSTable. Out of order data can appear in two ways:
+
+- If the user mixes old data and new data in the traditional write path, the data will be comingled in the memtables
+  and flushed into the same SSTable, where it will remain comingled.
+- If the user's read requests for old data cause read repairs that pull old data into the current memtable, that data
+  will be comingled and flushed into the same SSTable.
+
+While TWCS tries to minimize the impact of comingled data, users should attempt to avoid this behavior.  Specifically,
+users should avoid queries that explicitly set the timestamp via CQL ``USING TIMESTAMP``. Additionally, users should run
+frequent repairs (which streams data in such a way that it does not become comingled), and disable background read
+repair by setting the table's ``read_repair_chance`` and ``dclocal_read_repair_chance`` to 0.
+
+Changing TimeWindowCompactionStrategy Options
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Operators wishing to enable ``TimeWindowCompactionStrategy`` on existing data should consider running a major compaction
+first, placing all existing data into a single (old) window. Subsequent newer writes will then create typical SSTables
+as expected.
+
+Operators wishing to change ``compaction_window_unit`` or ``compaction_window_size`` can do so, but may trigger
+additional compactions as adjacent windows are joined together. If the window size is decrease d (for example, from 24
+hours to 12 hours), then the existing SSTables will not be modified - TWCS can not split existing SSTables into multiple
+windows.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/compression.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/compression.rst b/doc/source/operating/compression.rst
new file mode 100644
index 0000000..5876214
--- /dev/null
+++ b/doc/source/operating/compression.rst
@@ -0,0 +1,94 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Compression
+-----------
+
+Cassandra offers operators the ability to configure compression on a per-table basis. Compression reduces the size of
+data on disk by compressing the SSTable in user-configurable compression ``chunk_length_in_kb``. Because Cassandra
+SSTables are immutable, the CPU cost of compressing is only necessary when the SSTable is written - subsequent updates
+to data will land in different SSTables, so Cassandra will not need to decompress, overwrite, and recompress data when
+UPDATE commands are issued. On reads, Cassandra will locate the relevant compressed chunks on disk, decompress the full
+chunk, and then proceed with the remainder of the read path (merging data from disks and memtables, read repair, and so
+on).
+
+Configuring Compression
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Compression is configured on a per-table basis as an optional argument to ``CREATE TABLE`` or ``ALTER TABLE``. By
+default, three options are relevant:
+
+- ``class`` specifies the compression class - Cassandra provides three classes (``LZ4Compressor``,
+  ``SnappyCompressor``, and ``DeflateCompressor`` ). The default is ``SnappyCompressor``.
+- ``chunk_length_in_kb`` specifies the number of kilobytes of data per compression chunk. The default is 64KB.
+- ``crc_check_chance`` determines how likely Cassandra is to verify the checksum on each compression chunk during
+  reads. The default is 1.0.
+
+Users can set compression using the following syntax:
+
+::
+
+    CREATE TABLE keyspace.table (id int PRIMARY KEY) WITH compression = {'class': 'LZ4Compressor'};
+
+Or
+
+::
+
+    ALTER TABLE keyspace.table WITH compression = {'class': 'SnappyCompressor', 'chunk_length_in_kb': 128, 'crc_check_chance': 0.5};
+
+Once enabled, compression can be disabled with ``ALTER TABLE`` setting ``enabled`` to ``false``:
+
+::
+
+    ALTER TABLE keyspace.table WITH compression = {'enabled':'false'};
+
+Operators should be aware, however, that changing compression is not immediate. The data is compressed when the SSTable
+is written, and as SSTables are immutable, the compression will not be modified until the table is compacted. Upon
+issuing a change to the compression options via ``ALTER TABLE``, the existing SSTables will not be modified until they
+are compacted - if an operator needs compression changes to take effect immediately, the operator can trigger an SSTable
+rewrite using ``nodetool scrub`` or ``nodetool upgradesstables -a``, both of which will rebuild the SSTables on disk,
+re-compressing the data in the process.
+
+Benefits and Uses
+^^^^^^^^^^^^^^^^^
+
+Compression's primary benefit is that it reduces the amount of data written to disk. Not only does the reduced size save
+in storage requirements, it often increases read and write throughput, as the CPU overhead of compressing data is faster
+than the time it would take to read or write the larger volume of uncompressed data from disk.
+
+Compression is most useful in tables comprised of many rows, where the rows are similar in nature. Tables containing
+similar text columns (such as repeated JSON blobs) often compress very well.
+
+Operational Impact
+^^^^^^^^^^^^^^^^^^
+
+- Compression metadata is stored off-heap and scales with data on disk.  This often requires 1-3GB of off-heap RAM per
+  terabyte of data on disk, though the exact usage varies with ``chunk_length_in_kb`` and compression ratios.
+
+- Streaming operations involve compressing and decompressing data on compressed tables - in some code paths (such as
+  non-vnode bootstrap), the CPU overhead of compression can be a limiting factor.
+
+- The compression path checksums data to ensure correctness - while the traditional Cassandra read path does not have a
+  way to ensure correctness of data on disk, compressed tables allow the user to set ``crc_check_chance`` (a float from
+  0.0 to 1.0) to allow Cassandra to probabilistically validate chunks on read to verify bits on disk are not corrupt.
+
+Advanced Use
+^^^^^^^^^^^^
+
+Advanced users can provide their own compression class by implementing the interface at
+``org.apache.cassandra.io.compress.ICompressor``.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/hardware.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/hardware.rst b/doc/source/operating/hardware.rst
new file mode 100644
index 0000000..ad3aa8d
--- /dev/null
+++ b/doc/source/operating/hardware.rst
@@ -0,0 +1,87 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+Hardware Choices
+----------------
+
+Like most databases, Cassandra throughput improves with more CPU cores, more RAM, and faster disks. While Cassandra can
+be made to run on small servers for testing or development environments (including Raspberry Pis), a minimal production
+server requires at least 2 cores, and at least 8GB of RAM. Typical production servers have 8 or more cores and at least
+32GB of RAM.
+
+CPU
+^^^
+Cassandra is highly concurrent, handling many simultaneous requests (both read and write) using multiple threads running
+on as many CPU cores as possible. The Cassandra write path tends to be heavily optimized (writing to the commitlog and
+then inserting the data into the memtable), so writes, in particular, tend to be CPU bound. Consequently, adding
+additional CPU cores often increases throughput of both reads and writes.
+
+Memory
+^^^^^^
+Cassandra runs within a Java VM, which will pre-allocate a fixed size heap (java's Xmx system parameter). In addition to
+the heap, Cassandra will use significant amounts of RAM offheap for compression metadata, bloom filters, row, key, and
+counter caches, and an in process page cache. Finally, Cassandra will take advantage of the operating system's page
+cache, storing recently accessed portions files in RAM for rapid re-use.
+
+For optimal performance, operators should benchmark and tune their clusters based on their individual workload. However,
+basic guidelines suggest:
+
+-  ECC RAM should always be used, as Cassandra has few internal safeguards to protect against bit level corruption
+-  The Cassandra heap should be no less than 2GB, and no more than 50% of your system RAM
+-  Heaps smaller than 12GB should consider ParNew/ConcurrentMarkSweep garbage collection
+-  Heaps larger than 12GB should consider G1GC
+
+Disks
+^^^^^
+Cassandra persists data to disk for two very different purposes. The first is to the commitlog when a new write is made
+so that it can be replayed after a crash or system shutdown. The second is to the data directory when thresholds are
+exceeded and memtables are flushed to disk as SSTables.
+
+Commitlogs receive every write made to a Cassandra node and have the potential to block client operations, but they are
+only ever read on node start-up. SSTable (data file) writes on the other hand occur asynchronously, but are read to
+satisfy client look-ups. SSTables are also periodically merged and rewritten in a process called compaction.  The data
+held in the commitlog directory is data that has not been permanently saved to the SSTable data directories - it will be
+periodically purged once it is flushed to the SSTable data files.
+
+Cassandra performs very well on both spinning hard drives and solid state disks. In both cases, Cassandra's sorted
+immutable SSTables allow for linear reads, few seeks, and few overwrites, maximizing throughput for HDDs and lifespan of
+SSDs by avoiding write amplification. However, when using spinning disks, it's important that the commitlog
+(``commitlog_directory``) be on one physical disk (not simply a partition, but a physical disk), and the data files
+(``data_file_directories``) be set to a separate physical disk. By separating the commitlog from the data directory,
+writes can benefit from sequential appends to the commitlog without having to seek around the platter as reads request
+data from various SSTables on disk.
+
+In most cases, Cassandra is designed to provide redundancy via multiple independent, inexpensive servers. For this
+reason, using NFS or a SAN for data directories is an antipattern and should typically be avoided.  Similarly, servers
+with multiple disks are often better served by using RAID0 or JBOD than RAID1 or RAID5 - replication provided by
+Cassandra obsoletes the need for replication at the disk layer, so it's typically recommended that operators take
+advantage of the additional throughput of RAID0 rather than protecting against failures with RAID1 or RAID5.
+
+Common Cloud Choices
+^^^^^^^^^^^^^^^^^^^^
+
+Many large users of Cassandra run in various clouds, including AWS, Azure, and GCE - Cassandra will happily run in any
+of these environments. Users should choose similar hardware to what would be needed in physical space. In EC2, popular
+options include:
+
+- m1.xlarge instances, which provide 1.6TB of local ephemeral spinning storage and sufficient RAM to run moderate
+  workloads
+- i2 instances, which provide both a high RAM:CPU ratio and local ephemeral SSDs
+- m4.2xlarge / c4.4xlarge instances, which provide modern CPUs, enhanced networking and work well with EBS GP2 (SSD)
+  storage
+
+Generally, disk and network performance increases with instance size and generation, so newer generations of instances
+and larger instance types within each family often perform better than their smaller or older alternatives.

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/hints.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/hints.rst b/doc/source/operating/hints.rst
new file mode 100644
index 0000000..f79f18a
--- /dev/null
+++ b/doc/source/operating/hints.rst
@@ -0,0 +1,22 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Hints
+-----
+
+.. todo:: todo

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/index.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/index.rst b/doc/source/operating/index.rst
new file mode 100644
index 0000000..6fc27c8
--- /dev/null
+++ b/doc/source/operating/index.rst
@@ -0,0 +1,38 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Operating Cassandra
+===================
+
+.. toctree::
+   :maxdepth: 2
+
+   snitch
+   topo_changes
+   repair
+   read_repair
+   hints
+   compaction
+   bloom_filters
+   compression
+   cdc
+   backups
+   metrics
+   security
+   hardware
+


Mime
View raw message