From: ramkrishna vasudevan
To: user@hbase.apache.org
Cc: lars hofhansl
Date: Wed, 1 May 2013 12:59:42 +0530
Subject: Re: Poor HBase map-reduce scan performance

Sorry, I think someone hijacked this thread and I replied to it. Naidu, please post a new thread if you have questions rather than hijacking this one.

Regards
Ram

On Wed, May 1, 2013 at 12:57 PM, ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com> wrote:

This happens when your Java process is running in debug mode and the suspend='Y' option is selected.

Regards
Ram
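(For context, and as an assumption about the setup rather than something visible in this thread: this usually means the daemon's JVM options include a JDWP debug agent that waits for a debugger, along the lines of -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000 added to HADOOP_OPTS in hadoop-env.sh. With suspend=y the JVM stops and waits for a debugger to attach, which matches Ram's explanation of why jps cannot synchronize with the target process. Changing suspend=y to suspend=n, or removing the agent from the daemon's options, should make the warning go away.)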
On Wed, May 1, 2013 at 12:55 PM, Naidu MS wrote:

Hi, I have two questions regarding HDFS and the jps utility.

I am new to Hadoop and started learning it over the past week.

1. Whenever I run start-all.sh and then jps in the console, it shows the processes that were started:

naidu@naidu:~/work/hadoop-1.0.4/bin$ jps
22283 NameNode
23516 TaskTracker
26711 Jps
22541 DataNode
23255 JobTracker
22813 SecondaryNameNode
Could not synchronize with target

But along with the list of started processes, the jps output always shows "Could not synchronize with target". What does "Could not synchronize with target" mean? Can someone explain why this is happening?

2. Is it possible to format the namenode multiple times? When I enter the namenode -format command, it does not format the namenode and shows the following output:

naidu@naidu:~/work/hadoop-1.0.4/bin$ hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.

13/05/01 12:08:04 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = naidu/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.0.4
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012
************************************************************/
Re-format filesystem in /home/naidu/dfs/namenode ? (Y or N) y
Format aborted in /home/naidu/dfs/namenode
13/05/01 12:08:05 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at naidu/127.0.0.1
************************************************************/

Can someone help me understand this? Why is it not possible to format the namenode multiple times?

On Wed, May 1, 2013 at 12:22 PM, Matt Corgan wrote:

Not that it's a long-term solution, but try major-compacting before running the benchmark. If the LSM tree is CPU-bound in merging HFiles/KeyValues through the PriorityQueue, then reducing to a single file per region should help. The merging of HFiles during a scan is not heavily optimized yet.

On Tue, Apr 30, 2013 at 11:21 PM, lars hofhansl wrote:

If you can, try 0.94.4+; it should significantly reduce the amount of bytes copied around in RAM during scanning, especially if you have wide rows and/or large key portions. That in turn makes scans scale better across cores, since RAM is a shared resource between cores (much like disk).

It's not hard to build the latest HBase against Cloudera's version of Hadoop. I can send along a simple patch to pom.xml to do that.

-- Lars
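Regarding Matt's major-compaction suggestion above: a minimal sketch of triggering it from the Java client is below. The table name is a placeholder and this assumes an hbase-site.xml on the classpath; the same thing can be done from the HBase shell.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CompactBeforeBenchmark {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            // Ask the cluster to major-compact the benchmark table so that each
            // region serves its data from a single HFile during the scan test.
            admin.majorCompact("my_table");   // placeholder table name
        } finally {
            admin.close();
        }
    }
}

The compaction request is asynchronous, so wait until the store file counts settle before starting the benchmark run.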
________________________________
From: Bryan Keller
To: user@hbase.apache.org
Sent: Tuesday, April 30, 2013 11:02 PM
Subject: Re: Poor HBase map-reduce scan performance

The table has hashed keys so rows are evenly distributed amongst the regionservers, and load on each regionserver is pretty much the same. I also have per-table balancing turned on. I get mostly data-local mappers with only a few rack-local (maybe 10 of the 250 mappers).

Currently the table is a wide table schema, with lists of data structures stored as columns with column prefixes grouping the data structures (e.g. 1_name, 1_address, 1_city, 2_name, 2_address, 2_city). I was thinking of moving those data structures to protobuf, which would cut down on the number of columns. The downside is I can't filter on one value with that, but it is a tradeoff I would make for performance. I was also considering restructuring the table into a tall table.

Something interesting is that my old regionserver machines had five 15k SCSI drives instead of 2 SSDs, and performance was about the same. Also, my old network was 1gbit, now it is 10gbit. So neither network nor disk I/O appears to be the bottleneck. The CPU is rather high for the regionserver, so it seems like the best candidate to investigate. I will try profiling it tomorrow and will report back. I may revisit compression on vs. off since that is adding load to the CPU.

I'll also come up with a sample program that generates data similar to my table.

On Apr 30, 2013, at 10:01 PM, lars hofhansl wrote:

Your average row is 35k, so scanner caching would not make a huge difference, although I would have expected some improvement by setting it to 10 or 50 since you have a wide 10GbE pipe.

I assume your table is split sufficiently to touch all RegionServers... Do you see the same load/IO on all region servers?

A bunch of scan improvements went into HBase since 0.94.2. I blogged about some of these changes here: http://hadoop-hbase.blogspot.com/2012/12/hbase-profiling.html

In your case - since you have many columns, each of which carries the rowkey - you might benefit a lot from HBASE-7279.

In the end HBase *is* slower than straight HDFS for full scans. How could it not be? So I would start by looking at HDFS first. Make sure Nagle's is disabled in both HBase and HDFS.

And lastly, SSDs are somewhat new territory for HBase. Maybe Andy Purtell is listening; I think he did some tests with HBase on SSDs. With rotating media you typically see an improvement with compression. With SSDs the added CPU needed for decompression might outweigh the benefits.

At the risk of starting a larger discussion here, I would posit that HBase's LSM-based design, which trades random IO for sequential IO, might be a bit more questionable on SSDs.

If you can, it would be nice to run a profiler against one of the RegionServers (or maybe do it with the single-RS setup) and see where it is bottlenecked. (And if you send me a sample program to generate some data - not 700g, though :) - I'll try to do a bit of profiling during the next days as my day job permits, but I do not have any machines with SSDs.)

-- Lars
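On the sample data generator: a minimal sketch of that kind of program against the 0.94 client API is below. The table name, column family, column count, row count, and value size are all placeholders rather than Bryan's actual schema, and random values will not compress the way real data would.

import java.util.Random;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class GenerateTestRows {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "scan_test");   // placeholder table, create it first
        table.setAutoFlush(false);                      // buffer puts client-side
        Random rnd = new Random();
        byte[] family = Bytes.toBytes("d");             // placeholder column family
        byte[] value = new byte[512];                   // values well under 1k, as in the thread
        for (long i = 0; i < 1000000; i++) {            // scale the row count to taste
            // hash-prefixed key so rows spread evenly across regions
            String key = Integer.toHexString(Long.toString(i).hashCode()) + "-" + i;
            Put put = new Put(Bytes.toBytes(key));
            for (int c = 0; c < 200; c++) {             // hundreds of columns per row
                rnd.nextBytes(value);
                put.add(family, Bytes.toBytes(c + "_payload"), value);
            }
            table.put(put);
        }
        table.close();
    }
}

Pre-splitting the table (or letting it split while loading) and scaling the row count until the on-disk size is representative would bring it closer to the table being discussed.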
________________________________
From: Bryan Keller
To: user@hbase.apache.org
Sent: Tuesday, April 30, 2013 9:31 PM
Subject: Re: Poor HBase map-reduce scan performance

Yes, I have tried various settings for setCaching() and I have setCacheBlocks(false).

On Apr 30, 2013, at 9:17 PM, Ted Yu wrote:

From http://hbase.apache.org/book.html#mapreduce.example :

scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false);  // don't set to true for MR jobs

I guess you have used the above settings.

0.94.x releases are compatible. Have you considered upgrading to, say, 0.94.7, which was recently released?

Cheers

On Tue, Apr 30, 2013 at 9:01 PM, Bryan Keller wrote:

I have been attempting to speed up my HBase map-reduce scans for a while now. I have tried just about everything without much luck. I'm running out of ideas and was hoping for some suggestions. This is HBase 0.94.2 and Hadoop 2.0.0 (CDH 4.2.1).

The table I'm scanning:
20 million rows
Hundreds of columns/row
Column keys can be 30-40 bytes
Column values are generally not large, 1k would be on the large side
250 regions
Snappy compression
8gb region size
512mb memstore flush
128k block size
700gb of data on HDFS

My cluster has 8 datanodes which are also regionservers. Each has 8 cores (16 HT), 64gb RAM, and 2 SSDs. The network is 10gbit. I have a separate machine acting as namenode, HMaster, and zookeeper (single instance). I have disk local reads turned on.

I'm seeing around 5 gbit/sec on average network IO. Each disk is getting 400mb/sec read IO. Theoretically I could get 400mb/sec * 16 = 6.4gb/sec.

Using Hadoop's TestDFSIO tool, I'm seeing around 1.4gb/sec read speed. Not really that great compared to the theoretical I/O. However, this is far better than I am seeing with HBase map-reduce scans of my table.

I have a simple no-op map-only job (using TableInputFormat) that scans the table and does nothing with the data. This takes 45 minutes. That's about 260mb/sec read speed. This is over 5x slower than straight HDFS. Basically, with HBase the read performance of my 16-SSD cluster comes in nearly 35% slower than a single SSD.

Here are some things I have changed, to no avail:
Scan caching values
HDFS block sizes
HBase block sizes
Region file sizes
Memory settings
GC settings
Number of mappers/node
Compressed vs not compressed
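For reference, the no-op map-only scan job Bryan describes a few paragraphs up generally looks something like the sketch below. This is not his actual code; the class, job, and table names are placeholders, and the Scan settings follow Ted's snippet earlier in the thread.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class NoOpScanJob {

    // Touches every row the scan returns but emits nothing, so the job
    // measures raw scan throughput.
    static class NoOpMapper extends TableMapper<NullWritable, NullWritable> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            // intentionally empty
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "noop-scan");
        job.setJarByClass(NoOpScanJob.class);

        Scan scan = new Scan();
        scan.setCaching(500);        // more rows per RPC than the default of 1
        scan.setCacheBlocks(false);  // don't churn the block cache from an MR scan

        // Uses TableInputFormat under the hood; one mapper per region.
        TableMapReduceUtil.initTableMapperJob("my_table", scan, NoOpMapper.class,
                NullWritable.class, NullWritable.class, job);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

With 250 regions this works out to roughly 250 map tasks, which lines up with the mapper counts mentioned earlier in the thread.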
One thing I notice is that the regionserver is using quite a bit of CPU during the map-reduce job. When dumping the jstack of the process, it seems like it is usually in some type of memory allocation or decompression routine, which didn't seem abnormal.

I can't seem to pinpoint the bottleneck. CPU use by the regionserver is high but not maxed out. Disk I/O and network I/O are low, and IO wait is low. I'm on the verge of just writing the dataset out to sequence files once a day for scan purposes. Is that what others are doing?
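One note on the sequence-file idea: HBase ships a stock MapReduce export job, org.apache.hadoop.hbase.mapreduce.Export, which scans a table and writes the raw Result objects out as SequenceFiles (with org.apache.hadoop.hbase.mapreduce.Import to load them back). That covers the "dump to sequence files once a day" approach without custom code, though the export itself still has to scan the table, so on its own it does not avoid the scan-throughput problem discussed in this thread.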