Date: Sat, 2 Mar 2013 09:38:48 -0800 (PST)
From: lars hofhansl
Reply-To: user@hbase.apache.org
Subject: Re: HBase Thrift inserts bottlenecked somewhere -- but where?
To: "user@hbase.apache.org"

"That's only true from the HDFS perspective, right? Any given region is "owned" by 1 of the 6 regionservers at any given time, and writes are buffered to memory before being persisted to HDFS, right?"

Only if you disabled the WAL; otherwise each change is written to the WAL first and then committed to the memstore. So in that sense it's even worse: each edit is written twice to the filesystem and replicated 3 times, and all that on only 6 data nodes. 20k writes does seem a bit low.

-- Lars

________________________________
From: Dan Crosta
To: "user@hbase.apache.org"
Sent: Saturday, March 2, 2013 9:12 AM
Subject: Re: HBase Thrift inserts bottlenecked somewhere -- but where?

On Mar 1, 2013, at 10:42 PM, lars hofhansl wrote:
> What performance profile do you expect?

That's a good question. Our configuration is actually already exceeding our minimum and desired performance thresholds, so I'm not too worried about it. My concern is more that I develop an understanding of where the bottlenecks are (e.g. it doesn't appear to be disk-, CPU-, or network-bound at the moment), and develop an intuition for working with HBase in case we are ever under the gun.

> Where does it top out (i.e. how many ops/sec)?

We're doing about 20,000 writes per second sustained across 4 tables and 6 CFs. Does this sound ballpark right for 6x EC2 m1.xlarges?

> Also note that each data item is replicated to three nodes (by HDFS). So in a 6 machine cluster each machine would get 50% of the writes.
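The replication arithmetic in that quoted point can be sketched as a back-of-the-envelope calculation; the figures (replication factor 3, 6 nodes, 20k writes/sec) come from the thread, and the WAL doubling follows Lars's "written twice to the FS" remark:

```python
# Back-of-the-envelope write amplification on a small HDFS cluster,
# using the figures discussed in this thread.

replication = 3          # HDFS default replication factor
nodes = 6                # datanodes in the cluster
cluster_writes = 20_000  # client writes/sec across the whole cluster

# Each block lands on `replication` of the `nodes` datanodes, so each
# node sees this fraction of all writes.
fraction_per_node = replication / nodes            # 3/6 = 0.5, i.e. "50% of the writes"

writes_per_node = cluster_writes * fraction_per_node

# With the WAL enabled, each edit hits the filesystem twice
# (WAL append, then memstore flush), roughly doubling file-write volume.
fs_writes_per_node = writes_per_node * 2

print(fraction_per_node, writes_per_node, fs_writes_per_node)
```

This is why a larger cluster amortizes replication cost: with 12 nodes instead of 6, the same replication factor puts only 25% of the writes on each node.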
> If you are looking for performance you really need a larger cluster to amortize this replication cost across more machines.

That's only true from the HDFS perspective, right? Any given region is "owned" by 1 of the 6 regionservers at any given time, and writes are buffered to memory before being persisted to HDFS, right?

In any event, there doesn't seem to be any disk contention to speak of -- we average around 10% disk utilization at this level of load (each machine has 4 spindles of local storage; we are not using EBS).

One setting no one has mentioned yet is the DataNode handler count (dfs.datanode.handler.count), which is currently set to its default of 3. Should we experiment with raising that?

> The other issue to watch out for is whether your keys are generated such that a single regionserver is hot spotted (you can look at the operation count on the master page).

All of our keys are hashes or UUIDs, so the key distribution is very smooth, and this is confirmed by the "Region Servers" table on the master node's web UI.

Thanks,
- Dan
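For reference, raising the DataNode handler count is a single-property change in hdfs-site.xml on each datanode (it requires a datanode restart). A minimal sketch; the value 8 here is purely illustrative, not a tested recommendation for this cluster:

```xml
<!-- hdfs-site.xml: number of server threads handling client/datanode RPC.
     Default was 3 in Hadoop releases of this era; 8 is an illustrative value. -->
<property>
  <name>dfs.datanode.handler.count</name>
  <value>8</value>
</property>
```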