From: ramkrishna vasudevan
To: user@hbase.apache.org
Cc: lars hofhansl
Date: Wed, 1 May 2013 12:59:42 +0530
Subject: Re: Poor HBase map-reduce scan performance

Sorry, I think someone hijacked this thread and I replied to it. Naidu, please post a new thread if you have questions rather than hijacking this one.

Regards
Ram

On Wed, May 1, 2013 at 12:57 PM, ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com> wrote:

This happens when your Java process is running in debug mode and the suspend='Y' option is selected.

Regards
Ram
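(For context, and as an assumption about the setup rather than something visible in this thread: this usually means the daemon's JVM options include a JDWP debug agent that waits for a debugger, along the lines of -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000 added to HADOOP_OPTS in hadoop-env.sh. With suspend=y the JVM stops and waits for a debugger to attach, which matches Ram's explanation of why jps cannot synchronize with the target process. Changing suspend=y to suspend=n, or removing the agent from the daemon's options, should make the warning go away.)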
On Wed, May 1, 2013 at 12:55 PM, Naidu MS wrote:

Hi, I have two questions regarding HDFS and the jps utility.

I am new to Hadoop and started learning it over the past week.

1. Whenever I run start-all.sh and then jps in the console, it shows the processes that were started:

naidu@naidu:~/work/hadoop-1.0.4/bin$ jps
22283 NameNode
23516 TaskTracker
26711 Jps
22541 DataNode
23255 JobTracker
22813 SecondaryNameNode
Could not synchronize with target

But along with the list of started processes, the jps output always shows "Could not synchronize with target". What does "Could not synchronize with target" mean? Can someone explain why this is happening?

2. Is it possible to format the namenode multiple times? When I enter the namenode -format command, it does not format the namenode and shows the following output:

naidu@naidu:~/work/hadoop-1.0.4/bin$ hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.

13/05/01 12:08:04 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = naidu/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.0.4
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012
************************************************************/
Re-format filesystem in /home/naidu/dfs/namenode ? (Y or N) y
Format aborted in /home/naidu/dfs/namenode
13/05/01 12:08:05 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at naidu/127.0.0.1
************************************************************/

Can someone help me understand this? Why is it not possible to format the namenode multiple times?

On Wed, May 1, 2013 at 12:22 PM, Matt Corgan wrote:

Not that it's a long-term solution, but try major-compacting before running the benchmark. If the LSM tree is CPU-bound in merging HFiles/KeyValues through the PriorityQueue, then reducing to a single file per region should help. The merging of HFiles during a scan is not heavily optimized yet.

On Tue, Apr 30, 2013 at 11:21 PM, lars hofhansl wrote:

If you can, try 0.94.4+; it should significantly reduce the amount of bytes copied around in RAM during scanning, especially if you have wide rows and/or large key portions. That in turn makes scans scale better across cores, since RAM is a shared resource between cores (much like disk).

It's not hard to build the latest HBase against Cloudera's version of Hadoop. I can send along a simple patch to pom.xml to do that.

-- Lars
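Regarding Matt's major-compaction suggestion above: a minimal sketch of triggering it from the Java client is below. The table name is a placeholder and this assumes an hbase-site.xml on the classpath; the same thing can be done from the HBase shell.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CompactBeforeBenchmark {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            // Ask the cluster to major-compact the benchmark table so that each
            // region serves its data from a single HFile during the scan test.
            admin.majorCompact("my_table");   // placeholder table name
        } finally {
            admin.close();
        }
    }
}

The compaction request is asynchronous, so wait until the store file counts settle before starting the benchmark run.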
________________________________
From: Bryan Keller
To: user@hbase.apache.org
Sent: Tuesday, April 30, 2013 11:02 PM
Subject: Re: Poor HBase map-reduce scan performance

The table has hashed keys so rows are evenly distributed amongst the regionservers, and load on each regionserver is pretty much the same. I also have per-table balancing turned on. I get mostly data-local mappers with only a few rack-local (maybe 10 of the 250 mappers).

Currently the table is a wide table schema, with lists of data structures stored as columns with column prefixes grouping the data structures (e.g. 1_name, 1_address, 1_city, 2_name, 2_address, 2_city). I was thinking of moving those data structures to protobuf, which would cut down on the number of columns. The downside is I can't filter on one value with that, but it is a tradeoff I would make for performance. I was also considering restructuring the table into a tall table.

Something interesting is that my old regionserver machines had five 15k SCSI drives instead of 2 SSDs, and performance was about the same. Also, my old network was 1gbit, now it is 10gbit. So neither network nor disk I/O appears to be the bottleneck. The CPU is rather high for the regionserver, so it seems like the best candidate to investigate. I will try profiling it tomorrow and will report back. I may revisit compression on vs. off since that is adding load to the CPU.

I'll also come up with a sample program that generates data similar to my table.

On Apr 30, 2013, at 10:01 PM, lars hofhansl wrote:

Your average row is 35k, so scanner caching would not make a huge difference, although I would have expected some improvement by setting it to 10 or 50 since you have a wide 10GbE pipe.

I assume your table is split sufficiently to touch all RegionServers... Do you see the same load/IO on all region servers?

A bunch of scan improvements went into HBase since 0.94.2. I blogged about some of these changes here: http://hadoop-hbase.blogspot.com/2012/12/hbase-profiling.html

In your case - since you have many columns, each of which carries the rowkey - you might benefit a lot from HBASE-7279.

In the end HBase *is* slower than straight HDFS for full scans. How could it not be? So I would start by looking at HDFS first. Make sure Nagle's is disabled in both HBase and HDFS.

And lastly, SSDs are somewhat new territory for HBase. Maybe Andy Purtell is listening; I think he did some tests with HBase on SSDs. With rotating media you typically see an improvement with compression. With SSDs the added CPU needed for decompression might outweigh the benefits.

At the risk of starting a larger discussion here, I would posit that HBase's LSM-based design, which trades random IO for sequential IO, might be a bit more questionable on SSDs.

If you can, it would be nice to run a profiler against one of the RegionServers (or maybe do it with the single-RS setup) and see where it is bottlenecked. (And if you send me a sample program to generate some data - not 700g, though :) - I'll try to do a bit of profiling during the next days as my day job permits, but I do not have any machines with SSDs.)

-- Lars
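On the sample data generator: a minimal sketch of that kind of program against the 0.94 client API is below. The table name, column family, column count, row count, and value size are all placeholders rather than Bryan's actual schema, and random values will not compress the way real data would.

import java.util.Random;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class GenerateTestRows {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "scan_test");   // placeholder table, create it first
        table.setAutoFlush(false);                      // buffer puts client-side
        Random rnd = new Random();
        byte[] family = Bytes.toBytes("d");             // placeholder column family
        byte[] value = new byte[512];                   // values well under 1k, as in the thread
        for (long i = 0; i < 1000000; i++) {            // scale the row count to taste
            // hash-prefixed key so rows spread evenly across regions
            String key = Integer.toHexString(Long.toString(i).hashCode()) + "-" + i;
            Put put = new Put(Bytes.toBytes(key));
            for (int c = 0; c < 200; c++) {             // hundreds of columns per row
                rnd.nextBytes(value);
                put.add(family, Bytes.toBytes(c + "_payload"), value);
            }
            table.put(put);
        }
        table.close();
    }
}

Pre-splitting the table (or letting it split while loading) and scaling the row count until the on-disk size is representative would bring it closer to the table being discussed.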
________________________________
From: Bryan Keller
To: user@hbase.apache.org
Sent: Tuesday, April 30, 2013 9:31 PM
Subject: Re: Poor HBase map-reduce scan performance

Yes, I have tried various settings for setCaching() and I have setCacheBlocks(false).

On Apr 30, 2013, at 9:17 PM, Ted Yu wrote:

From http://hbase.apache.org/book.html#mapreduce.example :

scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false);  // don't set to true for MR jobs

I guess you have used the above settings.

0.94.x releases are compatible. Have you considered upgrading to, say, 0.94.7, which was recently released?

Cheers

On Tue, Apr 30, 2013 at 9:01 PM, Bryan Keller wrote:

I have been attempting to speed up my HBase map-reduce scans for a while now. I have tried just about everything without much luck. I'm running out of ideas and was hoping for some suggestions. This is HBase 0.94.2 and Hadoop 2.0.0 (CDH 4.2.1).

The table I'm scanning:
20 million rows
Hundreds of columns/row
Column keys can be 30-40 bytes
Column values are generally not large, 1k would be on the large side
250 regions
Snappy compression
8gb region size
512mb memstore flush
128k block size
700gb of data on HDFS

My cluster has 8 datanodes which are also regionservers. Each has 8 cores (16 HT), 64gb RAM, and 2 SSDs. The network is 10gbit. I have a separate machine acting as namenode, HMaster, and zookeeper (single instance). I have disk local reads turned on.

I'm seeing around 5 gbit/sec on average network IO. Each disk is getting 400mb/sec read IO. Theoretically I could get 400mb/sec * 16 = 6.4gb/sec.

Using Hadoop's TestDFSIO tool, I'm seeing around 1.4gb/sec read speed. Not really that great compared to the theoretical I/O. However, this is far better than I am seeing with HBase map-reduce scans of my table.

I have a simple no-op map-only job (using TableInputFormat) that scans the table and does nothing with the data. This takes 45 minutes. That's about 260mb/sec read speed. This is over 5x slower than straight HDFS. Basically, with HBase the read performance of my 16-SSD cluster comes in nearly 35% slower than a single SSD.

Here are some things I have changed, to no avail:
Scan caching values
HDFS block sizes
HBase block sizes
Region file sizes
Memory settings
GC settings
Number of mappers/node
Compressed vs not compressed
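For reference, the no-op map-only scan job Bryan describes a few paragraphs up generally looks something like the sketch below. This is not his actual code; the class, job, and table names are placeholders, and the Scan settings follow Ted's snippet earlier in the thread.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class NoOpScanJob {

    // Touches every row the scan returns but emits nothing, so the job
    // measures raw scan throughput.
    static class NoOpMapper extends TableMapper<NullWritable, NullWritable> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            // intentionally empty
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "noop-scan");
        job.setJarByClass(NoOpScanJob.class);

        Scan scan = new Scan();
        scan.setCaching(500);        // more rows per RPC than the default of 1
        scan.setCacheBlocks(false);  // don't churn the block cache from an MR scan

        // Uses TableInputFormat under the hood; one mapper per region.
        TableMapReduceUtil.initTableMapperJob("my_table", scan, NoOpMapper.class,
                NullWritable.class, NullWritable.class, job);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

With 250 regions this works out to roughly 250 map tasks, which lines up with the mapper counts mentioned earlier in the thread.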
One thing I notice is that the regionserver is using quite a bit of CPU during the map-reduce job. When dumping the jstack of the process, it seems like it is usually in some type of memory allocation or decompression routine, which didn't seem abnormal.

I can't seem to pinpoint the bottleneck. CPU use by the regionserver is high but not maxed out. Disk I/O and network I/O are low, and IO wait is low. I'm on the verge of just writing the dataset out to sequence files once a day for scan purposes. Is that what others are doing?
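One note on the sequence-file idea: HBase ships a stock MapReduce export job, org.apache.hadoop.hbase.mapreduce.Export, which scans a table and writes the raw Result objects out as SequenceFiles (with org.apache.hadoop.hbase.mapreduce.Import to load them back). That covers the "dump to sequence files once a day" approach without custom code, though the export itself still has to scan the table, so on its own it does not avoid the scan-throughput problem discussed in this thread.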