From: Chen Bangzhong
To: hbase-user@hadoop.apache.org
Subject: Re: hbase performance
Date: Fri, 2 Apr 2010 17:04:09 +0800
In-Reply-To: <4BB5B1BE.7090106@ninja.co.jp>
References: <4BB5B1BE.7090106@ninja.co.jp>

On April 2, 2010 at 4:58 PM, Juhani Connolly wrote:

> Your results seem very low, but your system specs are also quite
> moderate.
>
> On 04/02/2010 04:46 PM, Chen Bangzhong wrote:
> > Hi, All
> >
> > I am benchmarking hbase. My HDFS cluster includes 4 servers (Dell 860,
> > with 2 GB RAM): one NameNode, one JobTracker, 2 DataNodes.
> >
> > My HBase cluster also comprises 4 servers: one Master, 2 region
> > servers and one ZooKeeper. (Dell 860, with 2 GB RAM)
> >
> While I'm far from being an authority on the matter, running
> datanodes+regionservers together should help performance.
> Try making your 2 datanodes + 2 regionservers into 4 servers running
> both data/region.

I will try to run datanode and region server on the same server.

> > I ran org.apache.hadoop.hbase.PerformanceEvaluation on the ZooKeeper
> > server. ROW_LENGTH was changed from 1000 to ROW_LENGTH = 100*1024,
> > so each value will be 100k in size.
> >
> > hadoop version is 0.20.2, hbase version is 0.20.3. dfs.replication
> > is set to 1.
> >
> Setting replication to 1 isn't going to give results that are very
> indicative of a "real" application, making it questionable as a
> benchmark. If you intend to run on a single replica at release, you'll
> be at high risk of data loss.

Since I have only 2 data nodes, I set replication to 1. In production,
it will be set to 3.

> > The following is the command line:
> >
> > bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred
> > --rows=10000 randomWrite 20
> >
> > It took about one hour to complete the test (3468628 ms), about 60
> > writes per second. The performance seems disappointing.
> >
> > Is there anything I can do to make hbase perform better under 100k
> > size? I didn't try the methods mentioned in the performance wiki yet,
> > because I thought 60 writes/sec is too low.
> >
> Do you mean *over* 100k size?
> 2GB ram is pretty low and you'd likely get significantly better
> performance with more of it, though on this scale it probably isn't a
> significant problem.

The data size is exactly 100k.

> > If the value size is 1k, hbase performs much better. 200000
> > sequentialWrite took about 16 seconds, about 12500 writes per second.
> >
> Comparing sequentialWrite performance with randomWrite isn't a helpful
> indicator. Do you have randomWrite results for 1k values? The way your
> performance degrades with the size of the records suggests you may
> have a bottleneck at network transfer. What's rack locality like and how
> much bandwidth do you have between the servers?
>
> > Now I am trying to benchmark using two clients on 2 servers, no result
> > yet.
> >

For 1k data size, the sequentialWrite and randomWrite performance is
about the same. All my servers are under one switch; I don't know the
switch bandwidth yet.

> You're already running 20 clients on your first server with the
> PerformanceEvaluation. Do you mean you intend to run 20 on each?

In fact, it is 20 threads on one machine.
>
> Hopefully someone with better knowledge can give a better answer, but my
> guess is that you have a network transfer bottleneck. Try doing further
> tests with randomWrite and decreasing value sizes and see if the time
> correlates to the total amount of data written.
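
To make the "does time correlate with total data written" check concrete, here is a minimal Python sketch that converts the two runs quoted in this thread into writes/sec and MB/sec. It is only rough arithmetic on the reported figures, not part of PerformanceEvaluation. The helper name `throughput` is made up for illustration, and it assumes that --rows counts rows per client, so 20 clients x 10000 rows = 200000 writes (which is consistent with the roughly 60 writes/sec reported). The 16-second figure for the 1k run is the rounded number from the thread.

```python
# Rough throughput arithmetic for the two runs quoted in this thread.
# Assumption: --rows is per client, so 20 clients x 10000 rows = 200000 writes.

def throughput(total_writes, value_bytes, elapsed_ms):
    """Return (writes per second, megabytes per second) for one run."""
    seconds = elapsed_ms / 1000.0
    writes_per_sec = total_writes / seconds
    mb_per_sec = writes_per_sec * value_bytes / 1e6
    return writes_per_sec, mb_per_sec

# 100k values: ROW_LENGTH = 100*1024, 3468628 ms reported for the whole test.
big = throughput(20 * 10000, 100 * 1024, 3468628)

# 1k values: 200000 writes in about 16 seconds (rounded figure from the thread).
small = throughput(200000, 1000, 16000)

for label, (wps, mbps) in (("100k", big), ("1k", small)):
    print(f"{label}: {wps:.1f} writes/sec, "
          f"{mbps:.2f} MB/sec, {mbps * 8:.1f} Mbit/sec")
```

Repeating the same calculation for a few intermediate value sizes would show whether MB/sec stays roughly constant as the value size changes. If it does, that would point toward a bandwidth limit; if it varies a lot, something other than raw transfer is probably involved.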