From: enis@apache.org To: commits@hbase.apache.org Date: Wed, 28 Jan 2015 03:30:47 -0000 Message-Id: <38a3636b904844aca5d59030ffcea4f5@git.apache.org> Subject: [02/34] hbase git commit: HBASE-12918 Backport asciidoc changes (apurtell and enis) http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/docbkx/zookeeper.xml ---------------------------------------------------------------------- diff --git a/src/main/docbkx/zookeeper.xml b/src/main/docbkx/zookeeper.xml deleted file mode 100644 index 206ccf5..0000000 --- a/src/main/docbkx/zookeeper.xml +++ /dev/null @@ -1,513 +0,0 @@ - - - - - ZooKeeper<indexterm> - <primary>ZooKeeper</primary> - </indexterm> - - A distributed Apache HBase installation depends on a running ZooKeeper 
cluster. All - participating nodes and clients need to be able to access the running ZooKeeper ensemble. Apache - HBase by default manages a ZooKeeper "cluster" for you. It will start and stop the ZooKeeper - ensemble as part of the HBase start/stop process. You can also manage the ZooKeeper ensemble - independent of HBase and just point HBase at the cluster it should use. To toggle HBase - management of ZooKeeper, use the HBASE_MANAGES_ZK variable in - conf/hbase-env.sh. This variable, which defaults to - true, tells HBase whether to start/stop the ZooKeeper ensemble servers as - part of HBase start/stop. - - When HBase manages the ZooKeeper ensemble, you can specify ZooKeeper configuration using its - native zoo.cfg file, or, the easier option is to just specify ZooKeeper - options directly in conf/hbase-site.xml. A ZooKeeper configuration option - can be set as a property in the HBase hbase-site.xml XML configuration file - by prefacing the ZooKeeper option name with hbase.zookeeper.property. For - example, the clientPort setting in ZooKeeper can be changed by setting the - hbase.zookeeper.property.clientPort property. For all default values used - by HBase, including ZooKeeper configuration, see . Look for the - hbase.zookeeper.property prefix. For the full list of ZooKeeper configurations, see ZooKeeper's - zoo.cfg. HBase does not ship with a zoo.cfg so - you will need to browse the conf directory in an appropriate ZooKeeper - download. - - You must at least list the ensemble servers in hbase-site.xml using the - hbase.zookeeper.quorum property. This property defaults to a single - ensemble member at localhost which is not suitable for a fully distributed - HBase. (It binds to the local machine only and remote clients will not be able to connect). - - How many ZooKeepers should I run? 
- - You can run a ZooKeeper ensemble that comprises 1 node only but in production it is - recommended that you run a ZooKeeper ensemble of 3, 5 or 7 machines; the more members an - ensemble has, the more tolerant the ensemble is of host failures. Also, run an odd number of - machines. In ZooKeeper, an even number of peers is supported, but it is normally not used - because an even sized ensemble requires, proportionally, more peers to form a quorum than an - odd sized ensemble requires. For example, an ensemble with 4 peers requires 3 to form a - quorum, while an ensemble with 5 also requires 3 to form a quorum. Thus, an ensemble of 5 - allows 2 peers to fail, and thus is more fault tolerant than the ensemble of 4, which allows - only 1 down peer. - Give each ZooKeeper server around 1GB of RAM, and if possible, its own dedicated disk (A - dedicated disk is the best thing you can do to ensure a performant ZooKeeper ensemble). For - very heavily loaded clusters, run ZooKeeper servers on separate machines from RegionServers - (DataNodes and TaskTrackers). - - - For example, to have HBase manage a ZooKeeper quorum on nodes - rs{1,2,3,4,5}.example.com, bound to port 2222 (the default is 2181) - ensure HBASE_MANAGES_ZK is commented out or set to true in - conf/hbase-env.sh and then edit conf/hbase-site.xml - and set hbase.zookeeper.property.clientPort and - hbase.zookeeper.quorum. You should also set - hbase.zookeeper.property.dataDir to other than the default as the default - has ZooKeeper persist data under /tmp which is often cleared on system - restart. In the example below we have ZooKeeper persist to - /usr/local/zookeeper. - - ... - - hbase.zookeeper.property.clientPort - 2222 - Property from ZooKeeper's config zoo.cfg. - The port at which the clients will connect. - - - - hbase.zookeeper.quorum - rs1.example.com,rs2.example.com,rs3.example.com,rs4.example.com,rs5.example.com - Comma separated list of servers in the ZooKeeper Quorum. 
- For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". - By default this is set to localhost for local and pseudo-distributed modes - of operation. For a fully-distributed setup, this should be set to a full - list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh - this is the list of servers which we will start/stop ZooKeeper on. - - - - hbase.zookeeper.property.dataDir - /usr/local/zookeeper - Property from ZooKeeper's config zoo.cfg. - The directory where the snapshot is stored. - - - ... - ]]> - - What version of ZooKeeper should I use? - The newer the version, the better. For example, some folks have been bitten by ZOOKEEPER-1277. If - running zookeeper 3.5+, you can ask hbase to make use of the new multi operation by enabling " in your hbase-site.xml. - - - ZooKeeper Maintenance - Be sure to set up the data dir cleaner described under Zookeeper - Maintenance else you could have 'interesting' problems a couple of months in; i.e. - zookeeper could start dropping sessions if it has to run through a directory of hundreds of - thousands of logs, which it is wont to do around leader reelection time -- a process rare but run - on occasion whether because a machine is dropped or happens to hiccup. - - 
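The quorum arithmetic in the sizing note above (4 peers need 3, 5 peers also need 3) can be checked with a few lines of shell. This is a minimal sketch, assuming the usual majority rule of "more than half of the peers"; the function names `quorum_size` and `tolerated_failures` are illustrative only and are not part of HBase or ZooKeeper.

```shell
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names, not an HBase or ZooKeeper API).

# Smallest majority of an ensemble with N peers: floor(N/2) + 1.
quorum_size() {
  local peers=$1
  echo $(( peers / 2 + 1 ))
}

# How many peers may be down while a majority can still be formed.
tolerated_failures() {
  local peers=$1
  echo $(( peers - (peers / 2 + 1) ))
}

# Compare the ensemble sizes discussed in the note.
for n in 3 4 5 7; do
  echo "ensemble=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Going from 4 to 5 peers leaves the quorum size unchanged while raising the number of tolerated failures, which is why odd ensemble sizes are preferred.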
- Using existing ZooKeeper ensemble - - To point HBase at an existing ZooKeeper cluster, one that is not managed by HBase, set - HBASE_MANAGES_ZK in conf/hbase-env.sh to - false - - ... - # Tell HBase whether it should manage its own instance of Zookeeper or not. - export HBASE_MANAGES_ZK=false - Next set ensemble locations and client port, if non-standard, in - hbase-site.xml, or add a suitably configured - zoo.cfg to HBase's CLASSPATH. HBase will prefer - the configuration found in zoo.cfg over any settings in - hbase-site.xml. - - When HBase manages ZooKeeper, it will start/stop the ZooKeeper servers as a part of the - regular start/stop scripts. If you would like to run ZooKeeper yourself, independent of HBase - start/stop, you would do the following - - -${HBASE_HOME}/bin/hbase-daemons.sh {start,stop} zookeeper - - - Note that you can use HBase in this manner to spin up a ZooKeeper cluster, unrelated to - HBase. Just make sure to set HBASE_MANAGES_ZK to false - if you want it to stay up across HBase restarts so that when HBase shuts down, it doesn't take - ZooKeeper down with it. - - For more information about running a distinct ZooKeeper cluster, see the ZooKeeper Getting - Started Guide. Additionally, see the ZooKeeper Wiki or the ZooKeeper - documentation for more information on ZooKeeper sizing. -
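When HBase is pointed at an existing ensemble, the value of hbase.zookeeper.quorum is the list of ensemble hostnames joined by commas. The following is a small sketch, assuming bash; the helper name `join_quorum` and the hostnames are hypothetical and only show the expected format of the value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join hostnames with commas, the format used for the
# hbase.zookeeper.quorum property in hbase-site.xml.
join_quorum() {
  local IFS=,   # "$*" joins arguments with the first character of IFS
  echo "$*"
}

# Example hostnames (assumptions, not real machines).
join_quorum zk1.example.com zk2.example.com zk3.example.com
```

The printed string can be pasted as the property value; a non-default client port is still set separately, as described above.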
- - -
- SASL Authentication with ZooKeeper - Newer releases of Apache HBase (>= 0.92) will support connecting to a ZooKeeper Quorum - that supports SASL authentication (which is available in Zookeeper versions 3.4.0 or - later). - - This describes how to set up HBase to mutually authenticate with a ZooKeeper Quorum. - ZooKeeper/HBase mutual authentication (HBASE-2418) is required - as part of a complete secure HBase configuration (HBASE-3025). For - simplicity of explication, this section ignores additional configuration required (Secure HDFS - and Coprocessor configuration). It's recommended to begin with an HBase-managed Zookeeper - configuration (as opposed to a standalone Zookeeper quorum) for ease of learning. - -
- Operating System Prerequisites - - You need to have a working Kerberos KDC setup. For each $HOST that will - run a ZooKeeper server, you should have a principal zookeeper/$HOST. For each - such host, add a service key (using the kadmin or kadmin.local - tool's ktadd command) for zookeeper/$HOST and copy this file to - $HOST, and make it readable only to the user that will run zookeeper on - $HOST. Note the location of this file, which we will use below as - $PATH_TO_ZOOKEEPER_KEYTAB. - - Similarly, for each $HOST that will run an HBase server (master or - regionserver), you should have a principal: hbase/$HOST. For each host, add a - keytab file called hbase.keytab containing a service key for - hbase/$HOST, copy this file to $HOST, and make it readable only - to the user that will run an HBase service on $HOST. Note the location of this - file, which we will use below as $PATH_TO_HBASE_KEYTAB. - - Each user who will be an HBase client should also be given a Kerberos principal. This - principal should usually have a password assigned to it (as opposed to, as with the HBase - servers, a keytab file) which only this user knows. The client's principal's - maxrenewlife should be set so that it can be renewed enough so that the user - can complete their HBase client processes. For example, if a user runs a long-running HBase - client process that takes at most 3 days, we might create this user's principal within - kadmin with: addprinc -maxrenewlife 3days. The Zookeeper client - and server libraries manage their own ticket refreshment by running threads that wake up - periodically to do the refreshment. - - On each host that will run an HBase client (e.g. hbase shell), add the - following file to the HBase home directory's conf directory: - - -Client { - com.sun.security.auth.module.Krb5LoginModule required - useKeyTab=false - useTicketCache=true; -}; - - - We'll refer to this JAAS configuration file as $CLIENT_CONF - below. - 
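The keytab steps described earlier in this section can be sketched as a dry-run helper. This is only an illustration, assuming MIT Kerberos and its `kadmin.local` tool run on the KDC host; the `-randkey` option, the helper name `print_keytab_commands`, and the example host and keytab path are assumptions that do not come from the text above. The function prints the commands rather than executing them, so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints (does not run) the commands that create
# a service principal, export its key to a keytab, and restrict the file mode.
print_keytab_commands() {
  local service=$1 host=$2 keytab=$3
  # Create the service principal with a random key (assumption: MIT Kerberos).
  printf 'kadmin.local -q "addprinc -randkey %s/%s"\n' "$service" "$host"
  # Export the service key into the keytab file.
  printf 'kadmin.local -q "ktadd -k %s %s/%s"\n' "$keytab" "$service" "$host"
  # Readable only by the owner; the owner should be the service user.
  printf 'chmod 400 %s\n' "$keytab"
}

# Example values (assumptions): one ZooKeeper host and one keytab location.
print_keytab_commands zookeeper zk1.example.com /etc/zookeeper/zookeeper.keytab
```

The same helper can be called with `hbase` as the service name to review the commands for hbase.keytab; copying the file to $HOST and changing its owner remain manual steps.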
-
- HBase-managed Zookeeper Configuration - - On each node that will run a zookeeper, a master, or a regionserver, create a JAAS - configuration file in the conf directory of the node's HBASE_HOME - directory that looks like the following: - - -Server { - com.sun.security.auth.module.Krb5LoginModule required - useKeyTab=true - keyTab="$PATH_TO_ZOOKEEPER_KEYTAB" - storeKey=true - useTicketCache=false - principal="zookeeper/$HOST"; -}; -Client { - com.sun.security.auth.module.Krb5LoginModule required - useKeyTab=true - useTicketCache=false - keyTab="$PATH_TO_HBASE_KEYTAB" - principal="hbase/$HOST"; -}; - - - where the $PATH_TO_HBASE_KEYTAB and - $PATH_TO_ZOOKEEPER_KEYTAB files are what you created above, and - $HOST is the hostname for that node. - - The Server section will be used by the Zookeeper quorum server, while the - Client section will be used by the HBase master and regionservers. The path - to this file should be substituted for the text $HBASE_SERVER_CONF in - the hbase-env.sh listing below. - - The path to this file should be substituted for the text - $CLIENT_CONF in the hbase-env.sh listing below. - - Modify your hbase-env.sh to include the following: - - -export HBASE_OPTS="-Djava.security.auth.login.config=$CLIENT_CONF" -export HBASE_MANAGES_ZK=true -export HBASE_ZOOKEEPER_OPTS="-Djava.security.auth.login.config=$HBASE_SERVER_CONF" -export HBASE_MASTER_OPTS="-Djava.security.auth.login.config=$HBASE_SERVER_CONF" -export HBASE_REGIONSERVER_OPTS="-Djava.security.auth.login.config=$HBASE_SERVER_CONF" - - - where $HBASE_SERVER_CONF and $CLIENT_CONF are - the full paths to the JAAS configuration files created above. 
- - Modify your hbase-site.xml on each node that will run zookeeper, - master or regionserver to contain: - - - - hbase.zookeeper.quorum - $ZK_NODES - - - hbase.cluster.distributed - true - - - hbase.zookeeper.property.authProvider.1 - org.apache.zookeeper.server.auth.SASLAuthenticationProvider - - - hbase.zookeeper.property.kerberos.removeHostFromPrincipal - true - - - hbase.zookeeper.property.kerberos.removeRealmFromPrincipal - true - - - ]]> - - where $ZK_NODES is the comma-separated list of hostnames of the Zookeeper - Quorum hosts. - - Start your hbase cluster by running one or more of the following set of commands on the - appropriate hosts: - - -bin/hbase zookeeper start -bin/hbase master start -bin/hbase regionserver start - - -
- -
- External Zookeeper Configuration - Add a JAAS configuration file that looks like: - -Client { - com.sun.security.auth.module.Krb5LoginModule required - useKeyTab=true - useTicketCache=false - keyTab="$PATH_TO_HBASE_KEYTAB" - principal="hbase/$HOST"; -}; - - where the $PATH_TO_HBASE_KEYTAB is the keytab created above for - HBase services to run on this host, and $HOST is the hostname for that node. - Put this in the HBase home's configuration directory. We'll refer to this file's full - pathname as $HBASE_SERVER_CONF below. - - Modify your hbase-env.sh to include the following: - - -export HBASE_OPTS="-Djava.security.auth.login.config=$CLIENT_CONF" -export HBASE_MANAGES_ZK=false -export HBASE_MASTER_OPTS="-Djava.security.auth.login.config=$HBASE_SERVER_CONF" -export HBASE_REGIONSERVER_OPTS="-Djava.security.auth.login.config=$HBASE_SERVER_CONF" - - - - Modify your hbase-site.xml on each node that will run a master or - regionserver to contain: - - - - hbase.zookeeper.quorum - $ZK_NODES - - - hbase.cluster.distributed - true - - - ]]> - - - where $ZK_NODES is the comma-separated list of hostnames of the Zookeeper - Quorum hosts. - - Add a zoo.cfg for each Zookeeper Quorum host containing: - -authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider -kerberos.removeHostFromPrincipal=true -kerberos.removeRealmFromPrincipal=true - - Also on each of these hosts, create a JAAS configuration file containing: - -Server { - com.sun.security.auth.module.Krb5LoginModule required - useKeyTab=true - keyTab="$PATH_TO_ZOOKEEPER_KEYTAB" - storeKey=true - useTicketCache=false - principal="zookeeper/$HOST"; -}; - - where $HOST is the hostname of each Quorum host. We will refer to the full - pathname of this file as $ZK_SERVER_CONF below. 
- - Start your Zookeepers on each Zookeeper Quorum host with: - -SERVER_JVMFLAGS="-Djava.security.auth.login.config=$ZK_SERVER_CONF" bin/zkServer.sh start - - - Start your HBase cluster by running one or more of the following set of commands on the - appropriate nodes: - - -bin/hbase master start -bin/hbase regionserver start - - - - 
- -
- Zookeeper Server Authentication Log Output - If the configuration above is successful, you should see something similar to the - following in your Zookeeper server logs: - -11/12/05 22:43:39 INFO zookeeper.Login: successfully logged in. -11/12/05 22:43:39 INFO server.NIOServerCnxnFactory: binding to port 0.0.0.0/0.0.0.0:2181 -11/12/05 22:43:39 INFO zookeeper.Login: TGT refresh thread started. -11/12/05 22:43:39 INFO zookeeper.Login: TGT valid starting at: Mon Dec 05 22:43:39 UTC 2011 -11/12/05 22:43:39 INFO zookeeper.Login: TGT expires: Tue Dec 06 22:43:39 UTC 2011 -11/12/05 22:43:39 INFO zookeeper.Login: TGT refresh sleeping until: Tue Dec 06 18:36:42 UTC 2011 -.. -11/12/05 22:43:59 INFO auth.SaslServerCallbackHandler: - Successfully authenticated client: authenticationID=hbase/ip-10-166-175-249.us-west-1.compute.internal@HADOOP.LOCALDOMAIN; - authorizationID=hbase/ip-10-166-175-249.us-west-1.compute.internal@HADOOP.LOCALDOMAIN. -11/12/05 22:43:59 INFO auth.SaslServerCallbackHandler: Setting authorizedID: hbase -11/12/05 22:43:59 INFO server.ZooKeeperServer: adding SASL authorization for authorizationID: hbase - - -
- -
- Zookeeper Client Authentication Log Output - On the Zookeeper client side (HBase master or regionserver), you should see something - similar to the following: - -11/12/05 22:43:59 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=ip-10-166-175-249.us-west-1.compute.internal:2181 sessionTimeout=180000 watcher=master:60000 -11/12/05 22:43:59 INFO zookeeper.ClientCnxn: Opening socket connection to server /10.166.175.249:2181 -11/12/05 22:43:59 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 14851@ip-10-166-175-249 -11/12/05 22:43:59 INFO zookeeper.Login: successfully logged in. -11/12/05 22:43:59 INFO client.ZooKeeperSaslClient: Client will use GSSAPI as SASL mechanism. -11/12/05 22:43:59 INFO zookeeper.Login: TGT refresh thread started. -11/12/05 22:43:59 INFO zookeeper.ClientCnxn: Socket connection established to ip-10-166-175-249.us-west-1.compute.internal/10.166.175.249:2181, initiating session -11/12/05 22:43:59 INFO zookeeper.Login: TGT valid starting at: Mon Dec 05 22:43:59 UTC 2011 -11/12/05 22:43:59 INFO zookeeper.Login: TGT expires: Tue Dec 06 22:43:59 UTC 2011 -11/12/05 22:43:59 INFO zookeeper.Login: TGT refresh sleeping until: Tue Dec 06 18:30:37 UTC 2011 -11/12/05 22:43:59 INFO zookeeper.ClientCnxn: Session establishment complete on server ip-10-166-175-249.us-west-1.compute.internal/10.166.175.249:2181, sessionid = 0x134106594320000, negotiated timeout = 180000 - -
- -
- Configuration from Scratch - - This has been tested on the current standard Amazon Linux AMI. First set up the KDC and - principals as described above. Next, check out the code and run a sanity check. - - -git clone git://git.apache.org/hbase.git -cd hbase -mvn clean test -Dtest=TestZooKeeperACL - - - Then configure HBase as described above. Manually edit target/cached_classpath.txt (see - below): - -bin/hbase zookeeper & -bin/hbase master & -bin/hbase regionserver & - - 
- - -
- Future improvements - -
- Fix target/cached_classpath.txt - You must override the standard hadoop-core jar file from the - target/cached_classpath.txt file with the version containing the - HADOOP-7070 fix. You can use the following script to do this: - -echo `find ~/.m2 -name "*hadoop-core*7070*SNAPSHOT.jar"` ':' `cat target/cached_classpath.txt` | sed 's/ //g' > target/tmp.txt -mv target/tmp.txt target/cached_classpath.txt - -
- -
- Set JAAS configuration programmatically - - - This would avoid the need for a separate Hadoop jar that fixes HADOOP-7070. - -
- -
- Elimination of <code>kerberos.removeHostFromPrincipal</code> and - <code>kerberos.removeRealmFromPrincipal</code> - -
- -
- - -
- - - - - -
http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/acid-semantics.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/acid-semantics.adoc b/src/main/site/asciidoc/acid-semantics.adoc new file mode 100644 index 0000000..2df559a --- /dev/null +++ b/src/main/site/asciidoc/acid-semantics.adoc @@ -0,0 +1,114 @@ +//// + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +//// + += Apache HBase (TM) ACID Properties + +== About this Document + +Apache HBase (TM) is not an ACID compliant database. However, it does guarantee certain specific properties. + +This specification enumerates the ACID properties of HBase. + +== Definitions + +For the sake of common vocabulary, we define the following terms: +Atomicity:: + An operation is atomic if it either completes entirely or not at all. + +Consistency:: + All actions cause the table to transition from one valid state directly to another (eg a row will not disappear during an update, etc). + +Isolation:: + an operation is isolated if it appears to complete independently of any other concurrent transaction. + +Durability:: + Any update that reports "successful" to the client will not be lost. + +Visibility:: + An update is considered visible if any subsequent read will see the update as having been committed. + + +The terms _must_ and _may_ are used as specified by link:[RFC 2119]. 
+ +In short, the word "must" implies that, if some case exists where the statement is not true, it is a bug. The word _may_ implies that, even if the guarantee is provided in a current release, users should not rely on it. + +== APIs to Consider +- Read APIs +* get +* scan +- Write APIs +* put +* batch put +* delete +- Combination (read-modify-write) APIs +* incrementColumnValue +* checkAndPut + +== Guarantees Provided + +.Atomicity +. All mutations are atomic within a row. Any put will either wholly succeed or wholly fail.footnoteref[Puts will either wholly succeed or wholly fail, provided that they are actually sent to the RegionServer. If the writebuffer is used, Puts will not be sent until the writebuffer is filled or it is explicitly flushed.] +.. An operation that returns a _success_ code has completely succeeded. +.. An operation that returns a _failure_ code has completely failed. +.. An operation that times out may have succeeded and may have failed. However, it will not have partially succeeded or failed. +. This is true even if the mutation crosses multiple column families within a row. +. APIs that mutate several rows will _not_ be atomic across the multiple rows. For example, a multiput that operates on rows 'a','b', and 'c' may return having mutated some but not all of the rows. In such cases, these APIs will return a list of success codes, each of which may be succeeded, failed, or timed out as described above. +. The checkAndPut API happens atomically like the typical _compareAndSet (CAS)_ operation found in many hardware architectures. +. The order of mutations is seen to happen in a well-defined order for each row, with no interleaving. For example, if one writer issues the mutation `a=1,b=1,c=1` and another writer issues the mutation `a=2,b=2,c=2`, the row must either be `a=1,b=1,c=1` or `a=2,b=2,c=2` and must *not* be something like `a=1,b=2,c=1`. + +NOTE: This is not true _across rows_ for multirow batch mutations. 
+ +== Consistency and Isolation +. All rows returned via any access API will consist of a complete row that existed at some point in the table's history. +. This is true across column families - i.e. a get of a full row that occurs concurrent with some mutations 1,2,3,4,5 will return a complete row that existed at some point in time between mutation i and i+1 for some i between 1 and 5. +. The state of a row will only move forward through the history of edits to it. + +== Consistency of Scans +A scan is *not* a consistent view of a table. Scans do *not* exhibit _snapshot isolation_. + +Rather, scans have the following properties: +. Any row returned by the scan will be a consistent view (i.e. that version of the complete row existed at some point in time)footnoteref[consistency,A consistent view is not guaranteed intra-row scanning -- i.e. fetching a portion of a row in one RPC then going back to fetch another portion of the row in a subsequent RPC. Intra-row scanning happens when you set a limit on how many values to return per Scan#next (See link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setBatch(int)[Scan#setBatch(int)]).] +. A scan will always reflect a view of the data _at least as new as_ the beginning of the scan. This satisfies the visibility guarantees enumerated below. +.. For example, if client A writes data X and then communicates via a side channel to client B, any scans started by client B will contain data at least as new as X. +.. A scan _must_ reflect all mutations committed prior to the construction of the scanner, and _may_ reflect some mutations committed subsequent to the construction of the scanner. +.. Scans must include _all_ data written prior to the scan (except in the case where data is subsequently mutated, in which case it _may_ reflect the mutation) + +Those familiar with relational databases will recognize this isolation level as "read committed". 
+ +NOTE: The guarantees listed above regarding scanner consistency are referring to "transaction commit time", not the "timestamp" field of each cell. That is to say, a scanner started at time _t_ may see edits with a timestamp value greater than _t_, if those edits were committed with a "forward dated" timestamp before the scanner was constructed. + +== Visibility + +. When a client receives a "success" response for any mutation, that mutation is immediately visible to both that client and any client with whom it later communicates through side channels.footnoteref[consistency] +. A row must never exhibit so-called "time-travel" properties. That is to say, if a series of mutations moves a row sequentially through a series of states, any sequence of concurrent reads will return a subsequence of those states. + +For example, if a row's cells are mutated using the `incrementColumnValue` API, a client must never see the value of any cell decrease. + +This is true regardless of which read API is used to read back the mutation. +. Any version of a cell that has been returned to a read operation is guaranteed to be durably stored. + +== Durability +. All visible data is also durable data. That is to say, a read will never return data that has not been made durable on disk.footnoteref[durability,In the context of Apache HBase, _durably on disk_ implies an `hflush()` call on the transaction log. This does not actually imply an `fsync()` to magnetic media, but rather just that the data has been written to the OS cache on all replicas of the log. In the case of a full datacenter power loss, it is possible that the edits are not truly durable.] +. Any operation that returns a "success" code (e.g. does not throw an exception) will be made durable.footnoteref[durability] +. Any operation that returns a "failure" code will not be made durable (subject to the Atomicity guarantees above). +. All reasonable failure scenarios will not affect any of the guarantees of this document. 
+ +== Tunability + +All of the above guarantees must be possible within Apache HBase. For users who would like to trade off some guarantees for performance, HBase may offer several tuning options. For example: + +* Visibility may be tuned on a per-read basis to allow stale reads or time travel. +* Durability may be tuned to only flush data to disk on a periodic basis. + +== More Information + +For more information, see the link:book.html#client[client architecture] and link:book.html#datamodel[data model] sections in the Apache HBase Reference Guide. http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/bulk-loads.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/bulk-loads.adoc b/src/main/site/asciidoc/bulk-loads.adoc new file mode 100644 index 0000000..4dc463d --- /dev/null +++ b/src/main/site/asciidoc/bulk-loads.adoc @@ -0,0 +1,19 @@ +//// + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +//// + += Bulk Loads in Apache HBase (TM) + +This page has been retired. The contents have been moved to the link:book.html#arch.bulk.load[Bulk Loading] section in the Reference Guide. 
+ http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/cygwin.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/cygwin.adoc b/src/main/site/asciidoc/cygwin.adoc new file mode 100644 index 0000000..bbb4b34 --- /dev/null +++ b/src/main/site/asciidoc/cygwin.adoc @@ -0,0 +1,193 @@ +//// + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +//// + + +== Installing Apache HBase (TM) on Windows using Cygwin + +== Introduction + +link:http://hbase.apache.org[Apache HBase (TM)] is a distributed, column-oriented store, modeled after Google's link:http://research.google.com/archive/bigtable.html[BigTable]. Apache HBase is built on top of link:http://hadoop.apache.org[Hadoop] for its link:http://hadoop.apache.org/mapreduce[MapReduce] and link:http://hadoop.apache.org/hdfs[distributed file system] implementations. All these projects are open-source and part of the link:http://www.apache.org[Apache Software Foundation]. + +== Purpose + +This document explains the *intricacies* of running Apache HBase on Windows using Cygwin as an all-in-one single-node installation for testing and development. The HBase link:http://hbase.apache.org/apidocs/overview-summary.html#overview_description[Overview] and link:book.html#getting_started[QuickStart] guides on the other hand go a long way in explaining how to set up link:http://hadoop.apache.org/hbase[HBase] in more complex deployment scenarios. 
+ +== Installation + +For running Apache HBase on Windows, 3 technologies are required: +* Java +* Cygwin +* SSH + +The following paragraphs detail the installation of each of the aforementioned technologies. + +=== Java + +HBase depends on the link:http://java.sun.com/javase/6/[Java Platform, Standard Edition, 6 Release]. So the target system has to be provided with at least the Java Runtime Environment (JRE); however if the system will also be used for development, the Java Development Kit (JDK) is preferred. You can download the latest versions for both from link:http://java.sun.com/javase/downloads/index.jsp[Sun's download page]. Installation is a simple GUI wizard that guides you through the process. + +=== Cygwin + +Cygwin is probably the oddest technology in this solution stack. It provides a dynamic link library that emulates most of a *nix environment on Windows. On top of that a whole bunch of the most common *nix tools are supplied. Combined, the DLL with the tools form a very *nix-alike environment on Windows. + +For installation, Cygwin provides the link:http://cygwin.com/setup.exe[`setup.exe` utility] that tracks the versions of all installed components on the target system and provides the mechanism for installing or updating everything from the mirror sites of Cygwin. + +To support installation, the `setup.exe` utility uses 2 directories on the target system: the *Root* directory for Cygwin (defaults to _C:\cygwin_), which will become _/_ within the eventual Cygwin installation; and the *Local Package* directory (e.g. _C:\cygsetup_), which is the cache where `setup.exe` stores the packages before they are installed. The cache must not be the same folder as the Cygwin root. + +Perform the following steps to install Cygwin, which are elaborately detailed in the link:http://cygwin.com/cygwin-ug-net/setup-net.html[2nd chapter] of the link:http://cygwin.com/cygwin-ug-net/cygwin-ug-net.html[Cygwin User's Guide]. + +. 
Make sure you have `Administrator` privileges on the target system. +. Choose and create your *Root* and *Local Package* directories. A good suggestion is to use `C:\cygwin\root` and `C:\cygwin\setup` folders. +. Download the `setup.exe` utility and save it to the *Local Package* directory. Run the `setup.exe` utility. +.. Choose the `Install from Internet` option. +.. Choose your *Root* and *Local Package* folders. +.. Select an appropriate mirror. +.. Don't select any additional packages yet, as we only want to install Cygwin for now. +.. Wait for download and install. +.. Finish the installation. +. Optionally, you can now also add a shortcut to your Start menu pointing to the `setup.exe` utility in the *Local Package* folder. +. Add a `CYGWIN_HOME` system-wide environment variable that points to your *Root* directory. +. Add `%CYGWIN_HOME%\bin` to the end of your `PATH` environment variable. +. Reboot the system after making changes to the environment variables; otherwise the OS will not be able to find the Cygwin utilities. +. Test your installation by running your freshly created shortcuts or the `Cygwin.bat` command in the *Root* folder. You should end up in a terminal window that is running a link:http://www.gnu.org/software/bash/manual/bashref.html[Bash shell]. Test the shell by issuing the following commands: +.. `cd /` should take you to the *Root* directory in Cygwin. +.. The `ls` command should list all files and folders in the current directory. +.. Use the `exit` command to end the terminal. +. When needed, to *uninstall* Cygwin you can simply delete the *Root* and *Local Package* directories, and the *shortcuts* that were created during installation. + +=== SSH + +HBase (and Hadoop) rely on link:http://nl.wikipedia.org/wiki/Secure_Shell[*SSH*] for interprocess/-node *communication* and launching *remote commands*. SSH will be provisioned on the target system via Cygwin, which supports running Cygwin programs as *Windows services*! + +. 
Rerun the `*setup.exe*` *utility*. +. Leave all parameters as is, skipping through the wizard using the `Next` button until the `Select Packages` panel is shown. +. Maximize the window and click the `View` button to toggle to the list view, which is ordered alphabetically on `Package`, making it easier to find the packages we'll need. +. Select the following packages by clicking the status word (normally `Skip`) so it's marked for installation. Use the `Next` button to download and install the packages. +.. `OpenSSH` +.. `tcp_wrappers` +.. `diffutils` +.. `zlib` +. Wait for the install to complete and finish the installation. + +=== HBase + +Download the *latest release* of Apache HBase from link:http://www.apache.org/dyn/closer.cgi/hbase/. As the Apache HBase distributable is just a zipped archive, installation is as simple as unpacking the archive so it ends up in its final *installation* directory. Notice that HBase has to be installed in Cygwin and a good directory suggestion is to use `/usr/local/` (or [`*Root* directory]\usr\local` in Windows slang). You should end up with a `/usr/local/hbase-_versi` installation in Cygwin. + +This finishes installation. We go on with the configuration. + +== Configuration + +There are 3 parts left to configure: *Java, SSH and HBase* itself. The following paragraphs explain each topic in detail. + +=== Java + +One important thing to remember in shell scripting in general (i.e. *nix and Windows) is that managing, manipulating and assembling path names that contain spaces can be very hard, due to the need to escape and quote those characters and strings. So we try to stay away from spaces in path names. *nix environments can help us out here very easily by using *symbolic links*. + +. Create a link in `/usr/local` to the Java home directory by using the following command and substituting the name of your chosen Java environment: + +---- +ln -s /cygdrive/c/Program\ Files/Java/*_jre name_* /usr/local/*_jre name_* +---- +. 
Test your Java installation by changing directories to your Java folder with `cd /usr/local/_jre name_` and issuing the command `./bin/java -version`. This should output your version of the chosen JRE. + +=== SSH + +Configuring *SSH* is quite elaborate, but primarily a question of launching it by default as a *Windows service*. + +. On Windows Vista and above make sure you run the Cygwin shell with *elevated privileges*, by right-clicking on the shortcut and using `Run as Administrator`. +. First of all, we have to make sure the *rights on some crucial files* are correct. Use the commands underneath. You can verify all rights by using the `ls -l` command on the different files. Also, notice the auto-completion feature in the shell using `TAB` is extremely handy in these situations. +.. `chmod +r /etc/passwd` to make the passwords file readable for all +.. `chmod u+w /etc/passwd` to make the passwords file writable for the owner +.. `chmod +r /etc/group` to make the groups file readable for all +.. `chmod u+w /etc/group` to make the groups file writable for the owner +.. `chmod 755 /var` to make the var folder writable to owner and readable and executable to all +. Edit the */etc/hosts.allow* file using your favorite editor (why not VI in the shell!) and make sure the following two lines are in there before the `PARANOID` line: + +---- +ALL : localhost 127.0.0.1/32 : allow +ALL : [::1]/128 : allow +---- +. Next we have to *configure SSH* by using the script `ssh-host-config`. +.. If this script asks to overwrite an existing `/etc/ssh_config`, answer `yes`. +.. If this script asks to overwrite an existing `/etc/sshd_config`, answer `yes`. +.. If this script asks to use privilege separation, answer `yes`. +.. If this script asks to install `sshd` as a service, answer `yes`. Make sure you started your shell as Administrator! +.. If this script asks for the CYGWIN value, just press `enter` as the default is `ntsec`. +.. 
If this script asks to create the `sshd` account, answer `yes`. +.. If this script asks to use a different user name as service account, answer `no` as the default will suffice. +.. If this script asks to create the `cyg_server` account, answer `yes`. Enter a password for the account. +. *Start the SSH service* using `net start sshd` or `cygrunsrv --start sshd`. Notice that `cygrunsrv` is the utility that makes the process run as a Windows service. Confirm that you see a message stating that `the CYGWIN sshd service was started successfully.` +. Harmonize Windows and Cygwin *user accounts* by using the commands: + +---- +mkpasswd -cl > /etc/passwd +mkgroup --local > /etc/group +---- +. *Test* the installation of SSH: +.. Open a new Cygwin terminal. +.. Use the command `whoami` to verify your userID. +.. Issue an `ssh localhost` to connect to the system itself. +.. Answer `yes` when presented with the server's fingerprint. +.. Issue your password when prompted. +.. Test a few commands in the remote session +.. The `exit` command should take you back to your first shell in Cygwin. +. `Exit` should terminate the Cygwin shell. + +=== HBase + +If all previous configurations are working properly, we just need some tinkering with the *HBase config* files to properly resolve on Windows/Cygwin. All files and paths referenced here start from the HBase `[*installation* directory]` as working directory. + +. HBase uses the `./conf/*hbase-env.sh*` to configure its dependencies on the runtime environment. Copy and uncomment the following lines just underneath their original, and change them to fit your environment. They should read something like: + +---- +export JAVA_HOME=/usr/local/_jre name_ +export HBASE_IDENT_STRING=$HOSTNAME +---- +. HBase uses the _./conf/`*hbase-default.xml*`_ file for configuration. Some properties do not resolve to existing directories because the JVM runs on Windows. 
This is the major issue to keep in mind when working with Cygwin: within the shell all paths are *nix-alike, hence relative to the root `/`. However, every parameter that is to be consumed within the Windows processes themselves needs to be a Windows setting, hence `C:\`-alike. Change the following properties in the configuration file, adjusting paths where necessary to conform with your own installation: +.. `hbase.rootdir` must read e.g. `file:///C:/cygwin/root/tmp/hbase/data` +.. `hbase.tmp.dir` must read `C:/cygwin/root/tmp/hbase/tmp` +.. `hbase.zookeeper.quorum` must read `127.0.0.1` because for some reason `localhost` doesn't seem to resolve properly on Cygwin. +. Make sure the configured `hbase.rootdir` and `hbase.tmp.dir` *directories exist* and have the proper *rights* set up e.g. by issuing a `chmod 777` on them. + +== Testing + +This should conclude the installation and configuration of Apache HBase on Windows using Cygwin. So it's time *to test it*. + +. Start a Cygwin *terminal*, if you haven't already. +. Change directory to the HBase *installation* using `cd /usr/local/hbase-_version_`, preferably using auto-completion. +. *Start HBase* using the command `./bin/start-hbase.sh` +.. When prompted to accept the SSH fingerprint, answer `yes`. +.. When prompted, provide your password. Maybe multiple times. +.. When the command completes, the HBase server should have started. +.. However, to be absolutely certain, check the logs in the `./logs` directory for any exceptions. +. Next we *start the HBase shell* using the command `./bin/hbase shell` +. We run some simple *test commands* +.. Create a simple table using command `create 'test', 'data'` +.. Verify the table exists using the command `list` +.. Insert data into the table using e.g. + +---- +put 'test', 'row1', 'data:1', 'value1' +put 'test', 'row2', 'data:2', 'value2' +put 'test', 'row3', 'data:3', 'value3' +---- +.. 
List all rows in the table using the command `scan 'test'`, which should list all the rows previously inserted. Notice how 3 new columns were added without changing the schema! +.. Finally we get rid of the table by issuing `disable 'test'` followed by `drop 'test'` and verified by `list` which should give an empty listing. +. *Leave the shell* by `exit` +. To *stop the HBase server* issue the `./bin/stop-hbase.sh` command. And wait for it to complete!!! Killing the process might corrupt your data on disk. +. In case of *problems*, +.. Verify the HBase logs in the `./logs` directory. +.. Try to fix the problem +.. Get help on the forums or IRC (`#hbase@freenode.net`). People are very active and keen to help out! +.. Stop and retest the server. + +== Conclusion + +Now your *HBase* server is running, *start coding* and build that next killer app on this particular, but scalable datastore! + http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/export_control.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/export_control.adoc b/src/main/site/asciidoc/export_control.adoc new file mode 100644 index 0000000..4a4b2ae --- /dev/null +++ b/src/main/site/asciidoc/export_control.adoc @@ -0,0 +1,40 @@ +//// + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +//// + + += Export Control + +This distribution uses or includes cryptographic software. 
The country in +which you currently reside may have restrictions on the import, possession, +use, and/or re-export to another country, of encryption software. BEFORE +using any encryption software, please check your country's laws, regulations +and policies concerning the import, possession, or use, and re-export of +encryption software, to see if this is permitted. See the +link:http://www.wassenaar.org/[Wassenaar Arrangement] for more +information. + +The U.S. Government Department of Commerce, Bureau of Industry and Security +(BIS), has classified this software as Export Commodity Control Number (ECCN) +5D002.C.1, which includes information security software using or performing +cryptographic functions with asymmetric algorithms. The form and manner of this +Apache Software Foundation distribution makes it eligible for export under the +License Exception ENC Technology Software Unrestricted (TSU) exception (see the +BIS Export Administration Regulations, Section 740.13) for both object code and +source code. + +Apache HBase uses the built-in java cryptography libraries. See Oracle's +information regarding +link:http://www.oracle.com/us/products/export/export-regulations-345813.html[Java cryptographic export regulations] +for more details. \ No newline at end of file http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/index.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/index.adoc b/src/main/site/asciidoc/index.adoc new file mode 100644 index 0000000..ac50fc8 --- /dev/null +++ b/src/main/site/asciidoc/index.adoc @@ -0,0 +1,71 @@ +//// + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. 
+ You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +//// + += Apache HBase™ Home + +.Welcome to Apache HBase(TM) +link:http://www.apache.org/[Apache HBase(TM)] is the link:http://hadoop.apache.org[Hadoop] database, a distributed, scalable, big data store. + +.When Would I Use Apache HBase? +Use Apache HBase when you need random, realtime read/write access to your Big Data. + +This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. + +Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's link:http://research.google.com/archive/bigtable.html[Bigtable: A Distributed Storage System for Structured Data] by Chang et al. + +Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS. + +.Features +- Linear and modular scalability. +- Strictly consistent reads and writes. +- Automatic and configurable sharding of tables +- Automatic failover support between RegionServers. +- Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables. +- Easy to use Java API for client access. +- Block cache and Bloom Filters for real-time queries. +- Query predicate push down via server side Filters +- Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options +- Extensible jruby-based (JIRB) shell +- Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX + +.Where Can I Get More Information? 
+See the link:book.html#arch.overview[Architecture Overview], the link:book.html#faq[FAQ] and the other documentation links at the top! + +.Export Control +The HBase distribution includes cryptographic software. See the link:export_control.html[export control notice]. + +== News +Feb 17, 2015:: link:http://www.meetup.com/hbaseusergroup/events/219260093/[HBase meetup around Strata+Hadoop World] in San Jose + +January 15th, 2015:: link:http://www.meetup.com/hbaseusergroup/events/218744798/[HBase meetup @ AppDynamics] in San Francisco + +November 20th, 2014:: link:http://www.meetup.com/hbaseusergroup/events/205219992/[HBase meetup @ WANdisco] in San Ramon + +October 27th, 2014:: link:http://www.meetup.com/hbaseusergroup/events/207386102/[HBase Meetup @ Apple] in Cupertino + +October 15th, 2014:: link:http://www.meetup.com/HBase-NYC/events/207655552[HBase Meetup @ Google] on the night before Strata/HW in NYC + +September 25th, 2014:: link:http://www.meetup.com/hbaseusergroup/events/203173692/[HBase Meetup @ Continuuity] in Palo Alto + +August 28th, 2014:: link:http://www.meetup.com/hbaseusergroup/events/197773762/[HBase Meetup @ Sift Science] in San Francisco + +July 17th, 2014:: link:http://www.meetup.com/hbaseusergroup/events/190994082/[HBase Meetup @ HP] in Sunnyvale + +June 5th, 2014:: link:http://www.meetup.com/Hadoop-Summit-Community-San-Jose/events/179081342/[HBase BOF at Hadoop Summit], San Jose Convention Center + +May 5th, 2014:: link:http://www.hbasecon.com[HBaseCon2014] at the Hilton San Francisco on Union Square + +March 12th, 2014:: link:http://www.meetup.com/hbaseusergroup/events/160757912/[HBase Meetup @ Ancestry.com] in San Francisco + +View link:old_news.html[Old News] http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/metrics.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/metrics.adoc b/src/main/site/asciidoc/metrics.adoc new file mode 100644 index 
0000000..91b26af --- /dev/null +++ b/src/main/site/asciidoc/metrics.adoc @@ -0,0 +1,97 @@ +//// + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +//// + += Apache HBase (TM) Metrics + +== Introduction +Apache HBase (TM) emits Hadoop link:http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/metrics/package-summary.html[metrics]. + +== Setup + +First read up on Hadoop link:http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/metrics/package-summary.html[metrics]. + +If you are using ganglia, the link:http://wiki.apache.org/hadoop/GangliaMetrics[GangliaMetrics] wiki page is a useful read. + +To have HBase emit metrics, edit `$HBASE_HOME/conf/hadoop-metrics.properties` and enable metric 'contexts' per plugin. As of this writing, hadoop supports *file* and *ganglia* plugins. Yes, the hbase metrics file is named hadoop-metrics rather than _hbase-metrics_ because currently at least the hadoop metrics system has the properties filename hardcoded. Per metrics _context_, comment out the NullContext and enable one or more plugins instead. + +If you enable the _hbase_ context, on regionservers you'll see total requests since last +metric emission, count of regions and storefiles as well as a count of memstore size. +On the master, you'll see a count of the cluster's requests. + +Enabling the _rpc_ context is good if you are interested in seeing +metrics on each hbase rpc method invocation (counts and time taken). 
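The per-context edit described under Setup can be sketched as follows. This is a minimal illustration, not the shipped configuration: it assumes the Hadoop metrics (v1) `FileContext` class with its `period` and `fileName` properties, and it writes to a scratch file rather than to the real `$HBASE_HOME/conf/hadoop-metrics.properties`. The 10-second period and the `/tmp/metrics_hbase.log` path are placeholder values.

```shell
# Scratch file standing in for $HBASE_HOME/conf/hadoop-metrics.properties (assumption).
conf_file="$(mktemp)"

cat > "$conf_file" <<'EOF'
# "hbase" context: NullContext commented out, file plugin enabled instead.
# hbase.class=org.apache.hadoop.metrics.spi.NullContext
hbase.class=org.apache.hadoop.metrics.file.FileContext
# Emission interval in seconds (placeholder value).
hbase.period=10
# Output file for the emitted metrics (placeholder path).
hbase.fileName=/tmp/metrics_hbase.log
EOF

# Show the active (uncommented) settings for the context.
grep '^hbase\.' "$conf_file"
```

Under the same assumptions, the _jvm_ and _rpc_ contexts would take the same three keys with their own prefix. Check the template shipped with your HBase version before copying class names.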
+ +The _jvm_ context is useful for long-term stats on running hbase jvms -- memory used, thread counts, etc. As of this writing, if more than one jvm is running emitting metrics, at least in ganglia, the stats are aggregated rather than reported per instance. + +== Using with JMX + +In addition to the standard output contexts supported by the Hadoop +metrics package, you can also export HBase metrics via Java Management +Extensions (JMX). This will allow viewing HBase stats in JConsole or +any other JMX client. + +=== Enable HBase stats collection + +To enable JMX support in HBase, first edit `$HBASE_HOME/conf/hadoop-metrics.properties` to support metrics refreshing. (If you're running 0.94.1 and above, or have already configured `hadoop-metrics.properties` for another output context, you can skip this step). +[source,bash] +---- +# Configuration of the "hbase" context for null +hbase.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread +hbase.period=60 + +# Configuration of the "jvm" context for null +jvm.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread +jvm.period=60 + +# Configuration of the "rpc" context for null +rpc.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread +rpc.period=60 +---- + +=== Setup JMX Remote Access + +For remote access, you will need to configure JMX remote passwords and access profiles. 
Create the files: +`$HBASE_HOME/conf/jmxremote.passwd` (set permissions + to 600):: + +---- +monitorRole monitorpass +controlRole controlpass +---- + +`$HBASE_HOME/conf/jmxremote.access`:: + +---- +monitorRole readonly +controlRole readwrite +---- + +=== Configure JMX in HBase startup + +Finally, edit the `$HBASE_HOME/conf/hbase-env.sh` script to add JMX support: +[source,bash] +---- +HBASE_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false" +HBASE_JMX_OPTS="$HBASE_JMX_OPTS -Dcom.sun.management.jmxremote.password.file=$HBASE_HOME/conf/jmxremote.passwd" +HBASE_JMX_OPTS="$HBASE_JMX_OPTS -Dcom.sun.management.jmxremote.access.file=$HBASE_HOME/conf/jmxremote.access" + +export HBASE_MASTER_OPTS="$HBASE_JMX_OPTS -Dcom.sun.management.jmxremote.port=10101" +export HBASE_REGIONSERVER_OPTS="$HBASE_JMX_OPTS -Dcom.sun.management.jmxremote.port=10102" +---- + +After restarting the processes you want to monitor, you should now be able to run JConsole (included with the JDK since JDK 5.0) to view the statistics via JMX. HBase MBeans are exported under the *`hadoop`* domain in JMX. + + +== Understanding HBase Metrics + +For more information on understanding HBase metrics, see the link:book.html#hbase_metrics[metrics section] in the Apache HBase Reference Guide. + http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/old_news.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/old_news.adoc b/src/main/site/asciidoc/old_news.adoc new file mode 100644 index 0000000..36ff510 --- /dev/null +++ b/src/main/site/asciidoc/old_news.adoc @@ -0,0 +1,117 @@ +//// + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. 
+ You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +//// + += Old Apache HBase (TM) News + +February 10th, 2014:: link:http://www.meetup.com/hbaseusergroup/events/163139322/[HBase Meetup @ Continuuity] in Palo Alto + +January 30th, 2014:: link:http://www.meetup.com/hbaseusergroup/events/158491762/[HBase Meetup @ Apple] in Cupertino + +January 30th, 2014:: link:http://www.meetup.com/Los-Angeles-HBase-User-group/events/160560282/[Los Angeles HBase User Group] in El Segundo + +October 24th, 2013:: link:http://www.meetup.com/hbaseusergroup/events/140759692/[HBase User] and link:http://www.meetup.com/hackathon/events/144366512/[Developer] Meetup at HortonWorks in Palo Alto + +September 26, 2013:: link:http://www.meetup.com/hbaseusergroup/events/135862292/[HBase Meetup at Arista Networks] in San Francisco + +August 20th, 2013:: link:http://www.meetup.com/hbaseusergroup/events/120534362/[HBase Meetup at Flurry] in San Francisco + +July 16th, 2013:: link:http://www.meetup.com/hbaseusergroup/events/119929152/[HBase Meetup at Twitter] in San Francisco + +June 25th, 2013:: link:http://www.meetup.com/hbaseusergroup/events/119154442/[Hadoop Summit Meetup] at San Jose Convention Center + +June 14th, 2013:: link:http://kijicon.eventbrite.com/[KijiCon: Building Big Data Apps] in San Francisco. + +June 13th, 2013:: link:http://www.hbasecon.com/[HBaseCon2013] in San Francisco. Submit an Abstract! + +June 12th, 2013:: link:http://www.meetup.com/hackathon/events/123403802/[HBaseConHackAthon] at the Cloudera office in San Francisco. 
+ +April 11th, 2013:: link:http://www.meetup.com/hbaseusergroup/events/103587852/[HBase Meetup at AdRoll] in San Francisco + +February 28th, 2013:: link:http://www.meetup.com/hbaseusergroup/events/96584102/[HBase Meetup at Intel Mission Campus] + +February 19th, 2013:: link:http://www.meetup.com/hackathon/events/103633042/[Developers PowWow] at HortonWorks' new digs + +January 23rd, 2013:: link:http://www.meetup.com/hbaseusergroup/events/91381312/[HBase Meetup at WibiData World HQ!] + +December 4th, 2012:: link:http://www.meetup.com/hackathon/events/90536432/[0.96 Bug Squashing and Testing Hackathon] at Cloudera, SF. + +October 29th, 2012:: link:http://www.meetup.com/hbaseusergroup/events/82791572/[HBase User Group Meetup] at Wize Commerce in San Mateo. + +October 25th, 2012:: link:http://www.meetup.com/HBase-NYC/events/81728932/[Strata/Hadoop World HBase Meetup.] in NYC + +September 11th, 2012:: link:http://www.meetup.com/hbaseusergroup/events/80621872/[Contributor's Pow-Wow at HortonWorks HQ.] + +August 8th, 2012:: link:http://www.apache.org/dyn/closer.cgi/hbase/[Apache HBase 0.94.1 is available for download] + +June 15th, 2012:: link:http://www.meetup.com/hbaseusergroup/events/59829652/[Birds-of-a-feather] in San Jose, day after link:http://hadoopsummit.org[Hadoop Summit] + +May 23rd, 2012:: link:http://www.meetup.com/hackathon/events/58953522/[HackConAthon] in Palo Alto + +May 22nd, 2012:: link:http://www.hbasecon.com[HBaseCon2012] in San Francisco + +March 27th, 2012:: link:http://www.meetup.com/hbaseusergroup/events/56021562/[Meetup @ StumbleUpon] in San Francisco + +January 19th, 2012:: link:http://www.meetup.com/hbaseusergroup/events/46702842/[Meetup @ EBay] + +January 23rd, 2012:: Apache HBase 0.92.0 released. link:http://www.apache.org/dyn/closer.cgi/hbase/[Download it!] + +December 23rd, 2011:: Apache HBase 0.90.5 released. link:http://www.apache.org/dyn/closer.cgi/hbase/[Download it!] 
+ +November 29th, 2011:: link:http://www.meetup.com/hackathon/events/41025972/[Developer Pow-Wow in SF] at Salesforce HQ + +November 7th, 2011:: link:http://www.meetup.com/hbaseusergroup/events/35682812/[HBase Meetup in NYC (6PM)] at the AppNexus office + +August 22nd, 2011:: link:http://www.meetup.com/hbaseusergroup/events/28518471/[HBase Hackathon (11AM) and Meetup (6PM)] at FB in PA + +June 30th, 2011:: link:http://www.meetup.com/hbaseusergroup/events/20572251/[HBase Contributor Day], the day after the link:http://developer.yahoo.com/events/hadoopsummit2011/[Hadoop Summit] hosted by Y! + +June 8th, 2011:: link:http://berlinbuzzwords.de/wiki/hbase-workshop-and-hackathon[HBase Hackathon] in Berlin to coincide with link:http://berlinbuzzwords.de/[Berlin Buzzwords] + +May 19th, 2011:: Apache HBase 0.90.3 released. link:http://www.apache.org/dyn/closer.cgi/hbase/[Download it!] + +April 12th, 2011:: Apache HBase 0.90.2 released. link:http://www.apache.org/dyn/closer.cgi/hbase/[Download it!] 
+ +March 21st, 2011:: link:http://www.meetup.com/hackathon/events/16770852/[HBase 0.92 Hackathon at StumbleUpon, SF] +February 22nd, 2011:: link:http://www.meetup.com/hbaseusergroup/events/16492913/[HUG12: February HBase User Group at StumbleUpon SF] +December 13th, 2010:: link:http://www.meetup.com/hackathon/calendar/15597555/[HBase Hackathon: Coprocessor Edition] +November 19th, 2010:: link:http://huguk.org/[Hadoop HUG in London] is all about Apache HBase +November 15-19th, 2010:: link:http://www.devoxx.com/display/Devoxx2K10/Home[Devoxx] features HBase Training and multiple HBase presentations + +October 12th, 2010:: HBase-related presentations by core contributors and users at link:http://www.cloudera.com/company/press-center/hadoop-world-nyc/[Hadoop World 2010] + +October 11th, 2010:: link:http://www.meetup.com/hbaseusergroup/calendar/14606174/[HUG-NYC: HBase User Group NYC Edition] (Night before Hadoop World) +June 30th, 2010:: link:http://www.meetup.com/hbaseusergroup/calendar/13562846/[Apache HBase Contributor Workshop] (Day after Hadoop Summit) +May 10th, 2010:: Apache HBase graduates from Hadoop sub-project to Apache Top Level Project + +April 19, 2010:: Sign up for link:http://www.meetup.com/hbaseusergroup/calendar/12689490/[HBase User Group Meeting, HUG10] hosted by Trend Micro + +March 10th, 2010:: link:http://www.meetup.com/hbaseusergroup/calendar/12689351/[HBase User Group Meeting, HUG9] hosted by Mozilla + +January 27th, 2010:: Sign up for the link:http://www.meetup.com/hbaseusergroup/calendar/12241393/[HBase User Group Meeting, HUG8], at StumbleUpon in SF + +September 8th, 2010:: Apache HBase 0.20.0 is faster, stronger, slimmer, and sweeter tasting than any previous Apache HBase release. Get it off the link:http://www.apache.org/dyn/closer.cgi/hbase/[Releases] page. + +November 2-6th, 2009:: link:http://dev.us.apachecon.com/c/acus2009/[ApacheCon] in Oakland. 
The Apache Foundation will be celebrating its 10th anniversary in beautiful Oakland by the Bay. Lots of good talks and meetups including an HBase presentation by a couple of the lads. + +October 2nd, 2009:: HBase at Hadoop World in NYC. A few of us will be talking on Practical HBase out east at link:http://www.cloudera.com/hadoop-world-nyc[Hadoop World: NYC]. + +August 7th-9th, 2009:: HUG7 and HBase Hackathon at StumbleUpon in SF: Sign up for the link:http://www.meetup.com/hbaseusergroup/calendar/10950511/[HBase User Group Meeting, HUG7] or for the link:http://www.meetup.com/hackathon/calendar/10951718/[Hackathon] or for both (all are welcome!). + +June, 2009:: HBase at HadoopSummit2009 and at NOSQL: See the link:http://wiki.apache.org/hadoop/HBase/HBasePresentations[presentations] + +March 3rd, 2009:: HUG6 -- link:http://www.meetup.com/hbaseusergroup/calendar/9764004/[HBase User Group 6] + +January 30th, 2009:: LA Hbackathon: link:http://www.meetup.com/hbasela/calendar/9450876/[HBase January Hackathon Los Angeles] at link:http://streamy.com[Streamy] in Manhattan Beach + http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/pseudo-distributed.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/pseudo-distributed.adoc b/src/main/site/asciidoc/pseudo-distributed.adoc new file mode 100644 index 0000000..1eef753 --- /dev/null +++ b/src/main/site/asciidoc/pseudo-distributed.adoc @@ -0,0 +1,19 @@ +//// + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ See the License for the specific language governing permissions and + limitations under the License. +//// + + += Running Apache HBase (TM) in pseudo-distributed mode +This page has been retired. The contents have been moved to the link:book.html#distributed[Distributed Operation: Pseudo- and Fully-distributed modes] section in the Reference Guide. + http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/replication.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/replication.adoc b/src/main/site/asciidoc/replication.adoc new file mode 100644 index 0000000..0f41839 --- /dev/null +++ b/src/main/site/asciidoc/replication.adoc @@ -0,0 +1,18 @@ +//// + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +//// + += Apache HBase (TM) Replication + +This information has been moved to link:book.html#cluster_replication[the Cluster Replication] section of the link:book.html[Apache HBase Reference Guide]. 
\ No newline at end of file http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/resources.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/resources.adoc b/src/main/site/asciidoc/resources.adoc new file mode 100644 index 0000000..55af99e --- /dev/null +++ b/src/main/site/asciidoc/resources.adoc @@ -0,0 +1,22 @@ +//// + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +//// += Other Apache HBase (TM) Resources + +== Books +HBase: The Definitive Guide:: link:http://shop.oreilly.com/product/0636920014348.do[HBase: The Definitive Guide, _Random Access to Your Planet-Size Data_] by Lars George. Publisher: O'Reilly Media, Released: August 2011, Pages: 556. + +HBase In Action:: link:http://www.manning.com/dimidukkhurana[HBase In Action] By Nick Dimiduk and Amandeep Khurana. Publisher: Manning, MEAP Began: January 2012, Softbound print: Fall 2012, Pages: 350. + +HBase Administration Cookbook:: link:http://www.packtpub.com/hbase-administration-for-optimum-database-performance-cookbook/book[HBase Administration Cookbook] by Yifeng Jiang. Publisher: PACKT Publishing, Release: Expected August 2012, Pages: 335. 
+ http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/asciidoc/sponsors.adoc ---------------------------------------------------------------------- diff --git a/src/main/site/asciidoc/sponsors.adoc b/src/main/site/asciidoc/sponsors.adoc new file mode 100644 index 0000000..56046d8 --- /dev/null +++ b/src/main/site/asciidoc/sponsors.adoc @@ -0,0 +1,30 @@ +//// + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +//// + += Apache HBase(TM) Sponsors + +First off, thanks to link:http://www.apache.org/foundation/thanks.html[all who sponsor] our parent, the Apache Software Foundation. + +The below companies have been gracious enough to provide their commercial tool offerings free of charge to the Apache HBase(TM) project. + +* The crew at link:http://www.ej-technologies.com/[ej-technologies] have been letting us use link:http://www.ej-technologies.com/products/jprofiler/overview.html[JProfiler] for years now. + +* The lads at link:http://headwaysoftware.com/[headway software] have given us a license for link:http://headwaysoftware.com/products/?code=Restructure101[Restructure101] so we can untangle our interdependency mess. + +* link:http://www.yourkit.com[YourKit] allows us to use their link:http://www.yourkit.com/overview/index.jsp[Java Profiler]. +* Some of us use link:http://www.jetbrains.com/idea[IntelliJ IDEA] thanks to link:http://www.jetbrains.com/[JetBrains]. 
+ +== Sponsoring the Apache Software Foundation +To contribute to the Apache Software Foundation, a good idea in our opinion, see the link:http://www.apache.org/foundation/sponsorship.html[ASF Sponsorship] page. + http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/resources/.htaccess ---------------------------------------------------------------------- diff --git a/src/main/site/resources/.htaccess b/src/main/site/resources/.htaccess new file mode 100644 index 0000000..20bf651 --- /dev/null +++ b/src/main/site/resources/.htaccess @@ -0,0 +1,8 @@ + +# Redirect replication URL to the right section of the book +# Rule added 2015-1-12 -- can be removed in 6 months +Redirect permanent /replication.html /book.html#_cluster_replication + +# Redirect old page-per-chapter book sections to new single file. +RedirectMatch permanent ^/book/(.*)\.html$ /book.html#$1 +RedirectMatch permanent ^/book/$ /book.html http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/resources/book/.empty ---------------------------------------------------------------------- diff --git a/src/main/site/resources/book/.empty b/src/main/site/resources/book/.empty new file mode 100644 index 0000000..5513814 --- /dev/null +++ b/src/main/site/resources/book/.empty @@ -0,0 +1 @@ +# This directory is here so that we can have rewrite rules in our .htaccess to maintain old links. Otherwise we fall under some top-level niceness redirects because we have a file named book.html. 
http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/resources/css/site.css ---------------------------------------------------------------------- diff --git a/src/main/site/resources/css/site.css b/src/main/site/resources/css/site.css index f26d03c..17f0ff0 100644 --- a/src/main/site/resources/css/site.css +++ b/src/main/site/resources/css/site.css @@ -72,8 +72,10 @@ h4 { #banner { background: none; + padding: 10px; } +/* #banner img { padding: 10px; margin: auto; @@ -82,6 +84,7 @@ h4 { float: center; height:; } + */ #breadcrumbs { background-image: url(); http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/site.xml ---------------------------------------------------------------------- diff --git a/src/main/site/site.xml b/src/main/site/site.xml index 0d60e00..81c9315 100644 --- a/src/main/site/site.xml +++ b/src/main/site/site.xml @@ -19,18 +19,34 @@ */ --> - + + lt.velykis.maven.skins + reflow-maven-skin + 1.1.1 + + + + bootswatch-spacelab + + Apache HBase Project + ^Documentation + 0.94 Documentation|ASF + + + Apache HBase images/hbase_logo.png http://hbase.apache.org/ - - - + + Apache HBase Orca + images/hbasecon2015.30percent.png + http://hbasecon.com/ + @@ -47,35 +63,30 @@ - + + - - - + - - - + + + - - - + + + - - - + + + - - org.apache.maven.skins - maven-stylus-skin - http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/site/xdoc/index.xml ---------------------------------------------------------------------- diff --git a/src/main/site/xdoc/index.xml b/src/main/site/xdoc/index.xml index 964d887..a40ab4b 100644 --- a/src/main/site/xdoc/index.xml +++ b/src/main/site/xdoc/index.xml @@ -68,6 +68,14 @@ Apache HBase is an open-source, distributed, versioned, non-relational database

+ May 7th, 2015 HBaseCon2015 in San Francisco
+ February 17th, 2015 HBase meetup around Strata+Hadoop World in San Jose
+ January 15th, 2015 HBase meetup @ AppDynamics in San Francisco
+ November 20th, 2014 HBase meetup @ WANdisco in San Ramon
+ October 27th, 2014 HBase Meetup @ Apple in Cupertino
+ October 15th, 2014 HBase Meetup @ Google on the night before Strata/HW in NYC
+ September 25th, 2014 HBase Meetup @ Continuuity in Palo Alto
+ August 28th, 2014 HBase Meetup @ Sift Science in San Francisco
  July 17th, 2014 HBase Meetup @ HP in Sunnyvale
  June 5th, 2014 HBase BOF at Hadoop Summit, San Jose Convention Center
  May 5th, 2014 HBaseCon2014 at the Hilton San Francisco on Union Square