From: Mingjian Deng
To: user@hbase.apache.org
Date: Fri, 15 Jul 2011 11:41:06 +0800
Subject: Re: performance problem during read

Hi Stack:
Server A and the other servers are the same kind of machine in the cluster. If I
set hfile.block.cache.size=0.1 on another server, the problem reappears, but when
I set hfile.block.cache.size=0.15 or more, it does not. So I think you can
reproduce it on your own cluster with the following btrace script:
--------------------------------------------------------------
import static com.sun.btrace.BTraceUtils.*;
import com.sun.btrace.annotations.*;

import java.nio.ByteBuffer;
import org.apache.hadoop.hbase.io.hfile.*;

@BTrace public class TestRegion1 {
  @OnMethod(
    clazz="org.apache.hadoop.hbase.io.hfile.HFile$Reader",
    method="decompress"
  )
  public static void traceCacheBlock(final long offset, final int compressedSize,
      final int decompressedSize, final boolean pread) {
    println(strcat("decompress: ", str(decompressedSize)));
  }
}
--------------------------------------------------------------

If I set hfile.block.cache.size=0.1, the result is:
-----------
.......
decompress: 6020488
decompress: 6022536
decompress: 5991304
decompress: 6283272
decompress: 5957896
decompress: 6246280
decompress: 6041096
decompress: 6541448
decompress: 6039560
.......
-----------

If I set hfile.block.cache.size=0.12, the result is:
-----------
......
decompress: 65775
decompress: 65556
decompress: 65552
decompress: 9914120
decompress: 6026888
decompress: 65615
decompress: 65627
decompress: 6247944
decompress: 5880840
decompress: 65646
......
-----------

If I set hfile.block.cache.size=0.15 or more, the result is:
-----------
......
decompress: 65646
decompress: 65615
decompress: 65627
decompress: 65775
decompress: 65556
decompress: 65552
decompress: 65646
decompress: 65615
decompress: 65627
decompress: 65775
decompress: 65556
decompress: 65552
......
-----------

All of the above tests ran for more than 10 minutes under a high read load, so it
is a very strange phenomenon.
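For scale, here is a rough back-of-the-envelope sketch of how much room the
LruBlockCache would have at each setting. The 8 GB heap and the 0.85 acceptable
factor are only assumed example values, not our real configuration:
--------------------------------------------------------------
// Back-of-the-envelope sketch only; heap size and acceptable factor are assumptions.
public class CacheRoomSketch {
  public static void main(String[] args) {
    long heapBytes = 8L * 1024 * 1024 * 1024;   // assume an 8 GB region server heap
    double acceptableFactor = 0.85;             // assumed eviction-threshold factor
    long bigBlock = 6 * 1024 * 1024;            // ~6 MB blocks seen on node A
    long smallBlock = 64 * 1024;                // the expected ~64 KB hfile blocks
    for (double ratio : new double[] {0.1, 0.12, 0.15, 0.4}) {
      long usable = (long) (heapBytes * ratio * acceptableFactor);
      System.out.printf("hfile.block.cache.size=%.2f -> ~%d MB usable, "
          + "~%d big (6 MB) blocks or ~%d small (64 KB) blocks%n",
          ratio, usable >> 20, usable / bigBlock, usable / smallBlock);
    }
  }
}
--------------------------------------------------------------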
2011/7/15 Stack

> This is interesting. Any chance that the cells on the regions hosted
> on server A are 5M in size?
>
> The hfile block sizes are by default configured to be 64k, but rarely
> would an hfile block be exactly 64k. We do not cut the hfile block
> content at 64k exactly. The hfile block boundary will be at a
> keyvalue boundary.
>
> If a cell were 5MB, it does not get split across multiple hfile
> blocks. It will occupy one hfile block.
>
> Could it be that the region hosted on A is not like the others and it
> has lots of these 5MB sizes?
>
> Let us know. If the above is not the case, then you have an interesting
> phenomenon going on and we need to dig in more.
>
> St.Ack
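If I understand the boundary point correctly, a simplified model of the cutting
behaviour would be something like the sketch below. This is only my own toy
model, not the real HFile writer code:
--------------------------------------------------------------
import java.util.ArrayList;
import java.util.List;

// Toy model: a block is only closed at a KeyValue boundary once the target
// block size has been reached, so a single 5 MB KeyValue ends up in one
// ~5 MB block even though the target is 64 KB.
public class BlockCutSketch {
  static List<Integer> cutBlocks(int[] kvSizes, int targetBlockSize) {
    List<Integer> blocks = new ArrayList<Integer>();
    int current = 0;
    for (int kvSize : kvSizes) {
      current += kvSize;                 // a KeyValue is never split across blocks
      if (current >= targetBlockSize) {  // close the block only after a whole KeyValue
        blocks.add(current);
        current = 0;
      }
    }
    if (current > 0) blocks.add(current);
    return blocks;
  }

  public static void main(String[] args) {
    // one 5 MB cell among small ones produces one large block
    System.out.println(cutBlocks(new int[] {1000, 1000, 5 * 1024 * 1024, 1000}, 64 * 1024));
  }
}
--------------------------------------------------------------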
> On Thu, Jul 14, 2011 at 5:27 AM, Mingjian Deng
> wrote:
> > Hi:
> > We found a strange problem in our read test.
> > It is a 5-node cluster. Four of our 5 regionservers set
> > hfile.block.cache.size=0.4; one of them is set to 0.1 (node A). When we
> > randomly read from a 2 TB data table, we found node A's network traffic
> > reached 100 MB while the others' was less than 10 MB. We know node A needs
> > to read data from disk and put it in the block cache. In the following code
> > in LruBlockCache:
> > --------------------------------------------------------------
> > public void cacheBlock(String blockName, ByteBuffer buf, boolean inMemory) {
> >   CachedBlock cb = map.get(blockName);
> >   if (cb != null) {
> >     throw new RuntimeException("Cached an already cached block");
> >   }
> >   cb = new CachedBlock(blockName, buf, count.incrementAndGet(), inMemory);
> >   long newSize = size.addAndGet(cb.heapSize());
> >   map.put(blockName, cb);
> >   elements.incrementAndGet();
> >   if (newSize > acceptableSize() && !evictionInProgress) {
> >     runEviction();
> >   }
> > }
> > --------------------------------------------------------------
> >
> > We debugged this code with btrace, using the following script:
> > --------------------------------------------------------------
> > import static com.sun.btrace.BTraceUtils.*;
> > import com.sun.btrace.annotations.*;
> >
> > import java.nio.ByteBuffer;
> > import org.apache.hadoop.hbase.io.hfile.*;
> >
> > @BTrace public class TestRegion {
> >   @OnMethod(
> >     clazz="org.apache.hadoop.hbase.io.hfile.LruBlockCache",
> >     method="cacheBlock"
> >   )
> >   public static void traceCacheBlock(@Self LruBlockCache instance,
> >       String blockName, ByteBuffer buf, boolean inMemory) {
> >     println(strcat("size: ",
> >         str(get(field("org.apache.hadoop.hbase.io.hfile.LruBlockCache", "size"), instance))));
> >     println(strcat("elements: ",
> >         str(get(field("org.apache.hadoop.hbase.io.hfile.LruBlockCache", "elements"), instance))));
> >   }
> > }
> > --------------------------------------------------------------
> >
> > We found that "size" increases by 5 MB each time on node A! Why not 64 KB
> > each time? "size" increases by 64 KB when we run this btrace script on the
> > other nodes at the same time.
> >
> > The following script also confirms the problem, because "decompressedSize"
> > is 5 MB each time on node A:
> > --------------------------------------------------------------
> > import static com.sun.btrace.BTraceUtils.*;
> > import com.sun.btrace.annotations.*;
> >
> > import java.nio.ByteBuffer;
> > import org.apache.hadoop.hbase.io.hfile.*;
> >
> > @BTrace public class TestRegion1 {
> >   @OnMethod(
> >     clazz="org.apache.hadoop.hbase.io.hfile.HFile$Reader",
> >     method="decompress"
> >   )
> >   public static void traceCacheBlock(final long offset, final int compressedSize,
> >       final int decompressedSize, final boolean pread) {
> >     println(strcat("decompressedSize: ", str(decompressedSize)));
> >   }
> > }
> > --------------------------------------------------------------
> >
> > Why not 64 KB?
> >
> > BTW: When we set hfile.block.cache.size=0.4 on node A, the "decompressedSize"
> > goes down to 64 KB, and the TPS goes up to a high level.
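To check the 5 MB cell theory, I could scan the table and report unusually large
cells with something like the sketch below (this uses the HTable client API; the
table name "testtable" and the 1 MB threshold are only placeholders):
--------------------------------------------------------------
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// Spot-check sketch: print rows that contain a KeyValue larger than ~1 MB.
// "testtable" and the 1 MB threshold are placeholders, not the real table here.
public class CellSizeCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "testtable");
    Scan scan = new Scan();
    scan.setCaching(100);              // fetch rows in batches to speed up the scan
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        for (KeyValue kv : r.raw()) {
          if (kv.getLength() > 1024 * 1024) {
            System.out.println(Bytes.toStringBinary(kv.getRow()) + " -> " + kv.getLength());
          }
        }
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}
--------------------------------------------------------------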