hbase-user mailing list archives

From Jeff Hammerbacher <ham...@cloudera.com>
Subject Re: HBase in a real world application
Date Mon, 17 Aug 2009 23:54:35 GMT
Hey Stack,

I notice that the patch for this issue doesn't include any sort of tests
that might have caught this regression. Do you guys have an HBaseBench,
HBaseMix, or similarly named tool for catching performance regressions?

Thanks,
Jeff

On Mon, Aug 17, 2009 at 4:51 PM, stack <stack@duboce.net> wrote:

> Our writes were off by a factor of 7 or 8.  Writes should be better now
> (HBASE-1771).
> Thanks,
> St.Ack
>
>
> On Thu, Aug 13, 2009 at 4:53 PM, stack <stack@duboce.net> wrote:
>
> > I just tried it.  It seems slow to me writing too.  Let me take a
> look....
> > St.Ack
> >
> >
> > On Thu, Aug 13, 2009 at 10:06 AM, llpind <sonny_heer@hotmail.com> wrote:
> >
> >>
> >> Okay, I changed replication to 2 and removed "-XX:NewSize=6m
> >> -XX:MaxNewSize=6m".
> >>
> >> Here are the results for randomWrite with 3 clients:
> >>
> >>
> >>
> >> RandomWrite =================================================
> >>
> >> hadoop-0.20.0/bin/hadoop jar hbase-0.20.0/hbase-0.20.0-test.jar \
> >>   --nomapred randomWrite 3
> >>
> >>
> >> 09/08/13 09:51:15 INFO hbase.PerformanceEvaluation: client-0 Start
> >> randomWrite at offset 0 for 1048576 rows
> >> 09/08/13 09:51:15 INFO hbase.PerformanceEvaluation: client-1 Start
> >> randomWrite at offset 1048576 for 1048576 rows
> >> 09/08/13 09:51:15 INFO hbase.PerformanceEvaluation: client-2 Start
> >> randomWrite at offset 2097152 for 1048576 rows
> >> 09/08/13 09:51:47 INFO hbase.PerformanceEvaluation: client-0
> >> 0/104857/1048576
> >> 09/08/13 09:51:48 INFO hbase.PerformanceEvaluation: client-1
> >> 1048576/1153427/2097152
> >> 09/08/13 09:51:48 INFO hbase.PerformanceEvaluation: client-2
> >> 2097152/2201997/3145728
> >> 09/08/13 09:52:22 INFO hbase.PerformanceEvaluation: client-1
> >> 1048576/1258284/2097152
> >> 09/08/13 09:52:23 INFO hbase.PerformanceEvaluation: client-0
> >> 0/209714/1048576
> >> 09/08/13 09:52:24 INFO hbase.PerformanceEvaluation: client-2
> >> 2097152/2306854/3145728
> >> 09/08/13 09:52:47 INFO hbase.PerformanceEvaluation: client-1
> >> 1048576/1363141/2097152
> >> 09/08/13 09:52:58 INFO hbase.PerformanceEvaluation: client-0
> >> 0/314571/1048576
> >> 09/08/13 09:52:58 INFO hbase.PerformanceEvaluation: client-2
> >> 2097152/2411711/3145728
> >> 09/08/13 09:53:24 INFO hbase.PerformanceEvaluation: client-1
> >> 1048576/1467998/2097152
> >> 09/08/13 09:53:27 INFO hbase.PerformanceEvaluation: client-0
> >> 0/419428/1048576
> >> 09/08/13 09:53:27 INFO hbase.PerformanceEvaluation: client-2
> >> 2097152/2516568/3145728
> >> 09/08/13 09:53:48 INFO hbase.PerformanceEvaluation: client-1
> >> 1048576/1572855/2097152
> >> 09/08/13 09:54:08 INFO hbase.PerformanceEvaluation: client-2
> >> 2097152/2621425/3145728
> >> 09/08/13 09:54:10 INFO hbase.PerformanceEvaluation: client-0
> >> 0/524285/1048576
> >> 09/08/13 09:54:40 INFO hbase.PerformanceEvaluation: client-1
> >> 1048576/1677712/2097152
> >> 09/08/13 09:54:49 INFO hbase.PerformanceEvaluation: client-2
> >> 2097152/2726282/3145728
> >> 09/08/13 09:54:52 INFO hbase.PerformanceEvaluation: client-0
> >> 0/629142/1048576
> >> 09/08/13 09:55:57 INFO hbase.PerformanceEvaluation: client-1
> >> 1048576/1782569/2097152
> >> 09/08/13 09:56:21 INFO hbase.PerformanceEvaluation: client-2
> >> 2097152/2831139/3145728
> >> 09/08/13 09:56:41 INFO hbase.PerformanceEvaluation: client-0
> >> 0/733999/1048576
> >> 09/08/13 09:57:23 INFO hbase.PerformanceEvaluation: client-1
> >> 1048576/1887426/2097152
> >> 09/08/13 09:58:40 INFO hbase.PerformanceEvaluation: client-2
> >> 2097152/2935996/3145728
> >> 09/08/13 09:58:54 INFO hbase.PerformanceEvaluation: client-0
> >> 0/838856/1048576
> >> 09/08/13 10:00:29 INFO hbase.PerformanceEvaluation: client-1
> >> 1048576/1992283/2097152
> >> 09/08/13 10:01:01 INFO hbase.PerformanceEvaluation: client-2
> >> 2097152/3040853/3145728
> >> 09/08/13 10:01:24 INFO hbase.PerformanceEvaluation: client-0
> >> 0/943713/1048576
> >> 09/08/13 10:02:36 INFO hbase.PerformanceEvaluation: client-1
> >> 1048576/2097140/2097152
> >> 09/08/13 10:02:37 INFO hbase.PerformanceEvaluation: client-1 Finished
> >> randomWrite in 680674ms at offset 1048576 for 1048576 rows
> >> 09/08/13 10:02:37 INFO hbase.PerformanceEvaluation: Finished 1 in
> >> 680674ms writing 1048576 rows
> >> 09/08/13 10:03:19 INFO hbase.PerformanceEvaluation: client-2
> >> 2097152/3145710/3145728
> >> 09/08/13 10:03:20 INFO hbase.PerformanceEvaluation: client-2 Finished
> >> randomWrite in 723771ms at offset 2097152 for 1048576 rows
> >> 09/08/13 10:03:20 INFO hbase.PerformanceEvaluation: Finished 2 in
> >> 723771ms writing 1048576 rows
> >> 09/08/13 10:03:41 INFO hbase.PerformanceEvaluation: client-0
> >> 0/1048570/1048576
> >> 09/08/13 10:03:42 INFO hbase.PerformanceEvaluation: client-0 Finished
> >> randomWrite in 746054ms at offset 0 for 1048576 rows
> >> 09/08/13 10:03:42 INFO hbase.PerformanceEvaluation: Finished 0 in
> >> 746054ms writing 1048576 rows
> >>
> >>
> >>
> >> ============================================================
> >>
> >> Still pretty slow.  Any other ideas?  I'm running the client from the
> >> master box, but it's not running any regionServers or datanodes.
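> >>
> >> As a rough back-of-the-envelope from the run above (just a sketch; 746054ms
> >> is client-0's elapsed time from the log, and bc only does the division):
> >>
> >> # ~1048576 rows in ~746s is about 1400 rows/sec per client, ~4200/sec total
> >> echo "1048576 / (746054 / 1000)" | bc
> >>
> >> That's still well short of the 5-10k writes/sec the wiki numbers suggest.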
> >>
> >> stack-3 wrote:
> >> >
> >> > Your config. looks fine.
> >> >
> >> > Only thing that gives me pause is:
> >> >
> >> > "-XX:NewSize=6m -XX:MaxNewSize=6m"
> >> >
> >> > Any reason for the above?
> >> >
> >> > If you study your GC logs, lots of pauses?
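> >> >
> >> > Something like this against the GC log is a quick way to eyeball it (a
> >> > sketch; the log path is the one set in your hbase-env.sh below):
> >> >
> >> > # count/inspect the most recent full collections
> >> > grep "Full GC" /home/hadoop/hbase-0.20.0/logs/gc-hbase.log | tail -20
> >> >
> >> > Frequent or long pauses there would point at the tiny new generation.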
> >> >
> >> > Oh, and this: replication is set to 6.  Why 6?  Each write must commit
> >> > to 6 datanodes before it completes.  In the tests posted on the wiki,
> >> > we replicate to 3 nodes.
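> >> >
> >> > If you want to drop it without reloading the data, something along these
> >> > lines should do it (a sketch; /hbase matches your hbase.rootdir, and new
> >> > files pick up whatever dfs.replication says in hdfs-site.xml):
> >> >
> >> > # re-replicate what is already in HDFS down to 3 copies
> >> > hadoop-0.20.0/bin/hadoop fs -setrep -R 3 /hbase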
> >> >
> >> > At the end of this message you say you are doing gets?  The numbers you
> >> > posted were for writes?
> >> >
> >> > St.Ack
> >> >
> >> >
> >> > On Wed, Aug 12, 2009 at 1:15 PM, llpind <sonny_heer@hotmail.com>
> wrote:
> >> >
> >> >>
> >> >> Not sure why my performance is so slow.  Here is my configuration:
> >> >>
> >> >> box1:
> >> >> 10395 SecondaryNameNode
> >> >> 11628 Jps
> >> >> 10131 NameNode
> >> >> 10638 HQuorumPeer
> >> >> 10705 HMaster
> >> >>
> >> >> box 2-5:
> >> >> 6741 HQuorumPeer
> >> >> 6841 HRegionServer
> >> >> 7881 Jps
> >> >> 6610 DataNode
> >> >>
> >> >>
> >> >> hbase site: =======================
> >> >> <?xml version="1.0"?>
> >> >> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> >> >> <!--
> >> >> /**
> >> >>  * Copyright 2007 The Apache Software Foundation
> >> >>  *
> >> >>  * Licensed to the Apache Software Foundation (ASF) under one
> >> >>  * or more contributor license agreements.  See the NOTICE file
> >> >>  * distributed with this work for additional information
> >> >>  * regarding copyright ownership.  The ASF licenses this file
> >> >>  * to you under the Apache License, Version 2.0 (the
> >> >>  * "License"); you may not use this file except in compliance
> >> >>  * with the License.  You may obtain a copy of the License at
> >> >>  *
> >> >>  *     http://www.apache.org/licenses/LICENSE-2.0
> >> >>  *
> >> >>  * Unless required by applicable law or agreed to in writing,
> software
> >> >>  * distributed under the License is distributed on an "AS IS" BASIS,
> >> >>  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
> >> >> implied.
> >> >>  * See the License for the specific language governing permissions
> and
> >> >>  * limitations under the License.
> >> >>  */
> >> >> -->
> >> >> <configuration>
> >> >>  <property>
> >> >>    <name>hbase.rootdir</name>
> >> >>    <value>hdfs://box1:9000/hbase</value>
> >> >>    <description>The directory shared by region servers.
> >> >>    </description>
> >> >>  </property>
> >> >>  <property>
> >> >>    <name>hbase.master.port</name>
> >> >>    <value>60000</value>
> >> >>    <description>The port that the HBase master runs at.
> >> >>    </description>
> >> >>  </property>
> >> >>  <property>
> >> >>    <name>hbase.cluster.distributed</name>
> >> >>    <value>true</value>
> >> >>    <description>The mode the cluster will be in. Possible values
are
> >> >>      false: standalone and pseudo-distributed setups with managed
> >> >> Zookeeper
> >> >>      true: fully-distributed with unmanaged Zookeeper Quorum (see
> >> >> hbase-env.sh)
> >> >>    </description>
> >> >>  </property>
> >> >>  <property>
> >> >>    <name>hbase.regionserver.lease.period</name>
> >> >>    <value>120000</value>
> >> >>    <description>HRegion server lease period in milliseconds.
Default
> is
> >> >>    60 seconds. Clients must report in within this period else they
> are
> >> >>    considered dead.</description>
> >> >>  </property>
> >> >>
> >> >>  <property>
> >> >>      <name>hbase.zookeeper.property.clientPort</name>
> >> >>      <value>2222</value>
> >> >>      <description>Property from ZooKeeper's config zoo.cfg.
> >> >>      The port at which the clients will connect.
> >> >>      </description>
> >> >>  </property>
> >> >>  <property>
> >> >>      <name>hbase.zookeeper.property.dataDir</name>
> >> >>      <value>/home/hadoop/zookeeper</value>
> >> >>  </property>
> >> >>  <property>
> >> >>      <name>hbase.zookeeper.property.syncLimit</name>
> >> >>      <value>5</value>
> >> >>  </property>
> >> >>  <property>
> >> >>      <name>hbase.zookeeper.property.tickTime</name>
> >> >>      <value>2000</value>
> >> >>  </property>
> >> >>  <property>
> >> >>      <name>hbase.zookeeper.property.initLimit</name>
> >> >>      <value>10</value>
> >> >>  </property>
> >> >>  <property>
> >> >>      <name>hbase.zookeeper.quorum</name>
> >> >>      <value>box1,box2,box3,box4</value>
> >> >>      <description>Comma separated list of servers in the ZooKeeper
> >> >> Quorum.
> >> >>      For example,
> >> >> "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
> >> >>      By default this is set to localhost for local and
> >> pseudo-distributed
> >> >> modes
> >> >>      of operation. For a fully-distributed setup, this should be set
> to
> >> a
> >> >> full
> >> >>      list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in
> >> >> hbase-env.sh
> >> >>      this is the list of servers which we will start/stop ZooKeeper
> on.
> >> >>      </description>
> >> >>  </property>
> >> >>  <property>
> >> >>    <name>hfile.block.cache.size</name>
> >> >>    <value>.5</value>
> >> >>    <description>text</description>
> >> >>  </property>
> >> >>
> >> >> </configuration>
> >> >>
> >> >>
> >> >> hbase env:====================================================
> >> >>
> >> >> export HBASE_CLASSPATH=${HADOOP_CONF_DIR}
> >> >>
> >> >> export HBASE_HEAPSIZE=3000
> >> >>
> >> >> export HBASE_OPTS="-XX:NewSize=6m -XX:MaxNewSize=6m
> >> >> -XX:+UseConcMarkSweepGC
> >> >> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> >> >> -XX:+CMSIncrementalMode
> >> >> -Xloggc:/home/hadoop/hbase-0.20.0/logs/gc-hbase.log"
> >> >>
> >> >> export HBASE_MANAGES_ZK=true
> >> >>
> >> >> Hadoop core site ===========================================================
> >> >>
> >> >> <?xml version="1.0"?>
> >> >> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> >> >>
> >> >> <!-- Put site-specific property overrides in this file. -->
> >> >>
> >> >> <configuration>
> >> >> <property>
> >> >>   <name>fs.default.name</name>
> >> >>   <value>hdfs://box1:9000</value>
> >> >>   <description>The name of the default file system.  A URI whose
> >> >>   scheme and authority determine the FileSystem implementation.  The
> >> >>   uri's scheme determines the config property (fs.SCHEME.impl) naming
> >> >>   the FileSystem implementation class.  The uri's authority is used
> to
> >> >>   determine the host, port, etc. for a filesystem.</description>
> >> >> </property>
> >> >> <property>
> >> >>  <name>hadoop.tmp.dir</name>
> >> >>  <value>/data/hadoop-0.20.0-${user.name}</value>
> >> >>  <description>A base for other temporary directories.</description>
> >> >> </property>
> >> >> </configuration>
> >> >>
> >> >> ==============
> >> >>
> >> >> replication is set to 6.
> >> >>
> >> >> hadoop env=================
> >> >>
> >> >> export HADOOP_HEAPSIZE=3000
> >> >> export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote
> >> >> $HADOOP_NAMENODE_OPTS"
> >> >> export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote
> >> >> $HADOOP_SECONDARYNAMENODE_OPTS"
> >> >> export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote
> >> >> $HADOOP_DATANODE_OPTS"
> >> >> export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote
> >> >> $HADOOP_BALANCER_OPTS"
> >> >> export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote
> >> >> $HADOOP_JOBTRACKER_OPTS"
> >> >>  ==================
> >> >>
> >> >>
> >> >> Very basic setup.  Then I start the cluster and do simple random Get
> >> >> operations on a tall table (~60 M rows):
> >> >>
> >> >> {NAME => 'tallTable', FAMILIES => [{NAME => 'family1', COMPRESSION =>
> >> >> 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536',
> >> >> IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
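> >> >>
> >> >> A single one of those gets, done from the shell for reference, looks
> >> >> roughly like this ('row12345' is a made-up key, just to show the shape):
> >> >>
> >> >> echo "get 'tallTable', 'row12345'" | hbase-0.20.0/bin/hbase shell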
> >> >>
> >> >> Are these fairly normal speeds?  I'm unsure if this is a result of
> >> >> having a small cluster?  Please advise...
> >> >>
> >> >> stack-3 wrote:
> >> >> >
> >> >> > Yeah, seems slow.  In old hbase, it could do 5-10k writes a second
> >> >> > going by the performance eval page up on the wiki.  SequentialWrite
> >> >> > was about the same as RandomWrite.  Check out the stats on hw up on
> >> >> > that page and the description of how the test was set up.  Can you
> >> >> > figure out where it's slow?
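> >> >> >
> >> >> > The web UIs are a quick first place to look (a sketch; 60010 and 60030
> >> >> > are the default info ports, adjust if you have changed them):
> >> >> >
> >> >> > # master UI: region counts and request rates per region server
> >> >> > http://box1:60010/
> >> >> > # each region server's own status page
> >> >> > http://box2:60030/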
> >> >> >
> >> >> > St.Ack
> >> >> >
> >> >> > On Wed, Aug 12, 2009 at 10:10 AM, llpind <sonny_heer@hotmail.com>
> >> >> wrote:
> >> >> >
> >> >> >>
> >> >> >> Thanks Stack.
> >> >> >>
> >> >> >> I will try mapred with more clients.  I tried it without mapred,
> >> >> >> using 3 clients doing Random Write operations; here was the output:
> >> >> >>
> >> >> >> 09/08/12 09:22:52 INFO hbase.PerformanceEvaluation: client-0 Start
> >> >> >> randomWrite at offset 0 for 1048576 rows
> >> >> >> 09/08/12 09:22:52 INFO hbase.PerformanceEvaluation: client-1 Start
> >> >> >> randomWrite at offset 1048576 for 1048576 rows
> >> >> >> 09/08/12 09:22:52 INFO hbase.PerformanceEvaluation: client-2 Start
> >> >> >> randomWrite at offset 2097152 for 1048576 rows
> >> >> >> 09/08/12 09:24:23 INFO hbase.PerformanceEvaluation: client-1
> >> >> >> 1048576/1153427/2097152
> >> >> >> 09/08/12 09:24:23 INFO hbase.PerformanceEvaluation: client-2
> >> >> >> 2097152/2201997/3145728
> >> >> >> 09/08/12 09:24:25 INFO hbase.PerformanceEvaluation: client-0
> >> >> >> 0/104857/1048576
> >> >> >> 09/08/12 09:27:42 INFO hbase.PerformanceEvaluation: client-0
> >> >> >> 0/209714/1048576
> >> >> >> 09/08/12 09:27:46 INFO hbase.PerformanceEvaluation: client-1
> >> >> >> 1048576/1258284/2097152
> >> >> >> 09/08/12 09:27:46 INFO hbase.PerformanceEvaluation: client-2
> >> >> >> 2097152/2306854/3145728
> >> >> >> 09/08/12 09:32:32 INFO hbase.PerformanceEvaluation: client-1
> >> >> >> 1048576/1363141/2097152
> >> >> >> 09/08/12 09:32:33 INFO hbase.PerformanceEvaluation: client-0
> >> >> >> 0/314571/1048576
> >> >> >> 09/08/12 09:32:41 INFO hbase.PerformanceEvaluation: client-2
> >> >> >> 2097152/2411711/3145728
> >> >> >> 09/08/12 09:35:31 INFO hbase.PerformanceEvaluation: client-0
> >> >> >> 0/419428/1048576
> >> >> >> 09/08/12 09:35:34 INFO hbase.PerformanceEvaluation: client-1
> >> >> >> 1048576/1467998/2097152
> >> >> >> 09/08/12 09:35:53 INFO hbase.PerformanceEvaluation: client-2
> >> >> >> 2097152/2516568/3145728
> >> >> >> 09/08/12 09:39:02 INFO hbase.PerformanceEvaluation: client-0
> >> >> >> 0/524285/1048576
> >> >> >> 09/08/12 09:39:03 INFO hbase.PerformanceEvaluation: client-2
> >> >> >> 2097152/2621425/3145728
> >> >> >> 09/08/12 09:40:07 INFO hbase.PerformanceEvaluation: client-1
> >> >> >> 1048576/1572855/2097152
> >> >> >> 09/08/12 09:42:53 INFO hbase.PerformanceEvaluation: client-0
> >> >> >> 0/629142/1048576
> >> >> >> 09/08/12 09:44:25 INFO hbase.PerformanceEvaluation: client-2
> >> >> >> 2097152/2726282/3145728
> >> >> >> 09/08/12 09:44:44 INFO hbase.PerformanceEvaluation: client-1
> >> >> >> 1048576/1677712/2097152
> >> >> >> 09/08/12 09:46:43 INFO hbase.PerformanceEvaluation: client-0
> >> >> >> 0/733999/1048576
> >> >> >> 09/08/12 09:48:11 INFO hbase.PerformanceEvaluation: client-2
> >> >> >> 2097152/2831139/3145728
> >> >> >> 09/08/12 09:48:29 INFO hbase.PerformanceEvaluation: client-1
> >> >> >> 1048576/1782569/2097152
> >> >> >> 09/08/12 09:50:12 INFO hbase.PerformanceEvaluation: client-0
> >> >> >> 0/838856/1048576
> >> >> >> 09/08/12 09:52:47 INFO hbase.PerformanceEvaluation: client-2
> >> >> >> 2097152/2935996/3145728
> >> >> >> 09/08/12 09:53:51 INFO hbase.PerformanceEvaluation: client-1
> >> >> >> 1048576/1887426/2097152
> >> >> >> 09/08/12 09:56:32 INFO hbase.PerformanceEvaluation: client-0
> >> >> >> 0/943713/1048576
> >> >> >> 09/08/12 09:58:32 INFO hbase.PerformanceEvaluation: client-2
> >> >> >> 2097152/3040853/3145728
> >> >> >> 09/08/12 09:59:14 INFO hbase.PerformanceEvaluation: client-1
> >> >> >> 1048576/1992283/2097152
> >> >> >> 09/08/12 10:02:28 INFO hbase.PerformanceEvaluation: client-0
> >> >> >> 0/1048570/1048576
> >> >> >> 09/08/12 10:02:30 INFO hbase.PerformanceEvaluation: client-0 Finished
> >> >> >> randomWrite in 2376615ms at offset 0 for 1048576 rows
> >> >> >> 09/08/12 10:02:30 INFO hbase.PerformanceEvaluation: Finished 0 in
> >> >> >> 2376615ms writing 1048576 rows
> >> >> >> 09/08/12 10:06:35 INFO hbase.PerformanceEvaluation: client-2
> >> >> >> 2097152/3145710/3145728
> >> >> >> 09/08/12 10:06:38 INFO hbase.PerformanceEvaluation: client-2 Finished
> >> >> >> randomWrite in 2623395ms at offset 2097152 for 1048576 rows
> >> >> >> 09/08/12 10:06:38 INFO hbase.PerformanceEvaluation: Finished 2 in
> >> >> >> 2623395ms writing 1048576 rows
> >> >> >> 09/08/12 10:06:42 INFO hbase.PerformanceEvaluation: client-1
> >> >> >> 1048576/2097140/2097152
> >> >> >> 09/08/12 10:06:43 INFO hbase.PerformanceEvaluation: client-1 Finished
> >> >> >> randomWrite in 2630199ms at offset 1048576 for 1048576 rows
> >> >> >> 09/08/12 10:06:43 INFO hbase.PerformanceEvaluation: Finished 1 in
> >> >> >> 2630199ms writing 1048576 rows
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Seems kind of slow for ~3M records.  I have a 4-node cluster up at
> >> >> >> the moment.  HMaster & Namenode are running on the same box.
> >> >> >> --
> >> >> >> View this message in context:
> >> >> >> http://www.nabble.com/HBase-in-a-real-world-application-tp24920888p24940922.html
> >> >> >> Sent from the HBase User mailing list archive at Nabble.com.
> >> >> >>
> >> >> >>
> >> >> >
> >> >> >
> >> >>
> >> >> --
> >> >> View this message in context:
> >> >> http://www.nabble.com/HBase-in-a-real-world-application-tp24920888p24943406.html
> >> >> Sent from the HBase User mailing list archive at Nabble.com.
> >> >>
> >> >>
> >> >
> >> >
> >>
> >> --
> >> View this message in context:
> >> http://www.nabble.com/HBase-in-a-real-world-application-tp24920888p24955595.html
> >> Sent from the HBase User mailing list archive at Nabble.com.
> >>
> >>
> >
>
