Date: Mon, 1 Apr 2013 15:46:51 +0530
Subject: Re: Read thruput
From: ramkrishna vasudevan
To: user@hbase.apache.org

Hi,

How big are your rows? Are they wide rows, and what is the size of each cell?
How many read threads are in use?
Were you able to take a thread dump while this was happening? Have you looked at the GC log?

We may need some more information before we can reason about the problem.

Regards
Ram

On Mon, Apr 1, 2013 at 3:39 PM, Vibhav Mundra wrote:

> Hi All,
>
> I am trying to use HBase for real-time data retrieval with a timeout of
> 50 ms.
>
> I am using 2 machines as datanodes and regionservers,
> and one machine as a master for Hadoop and HBase.
>
> But I am able to fire only 3000 queries per second, and 10% of them are
> timing out.
> The database has 60 million rows.
>
> Are these figures okay, or am I missing something?
> I have set the scanner caching to one, because each time we are fetching
> a single row only.
>
> Here are the various configurations:
>
> *Our schema*
> {NAME => 'mytable', FAMILIES => [{NAME => 'cf', DATA_BLOCK_ENCODING =>
> 'NONE', BLOOMFILTER => 'ROWCOL', REPLICATION_SCOPE => '0', COMPRESSION =>
> 'GZ', VERSIONS => '1', TTL => '2147483647', MIN_VERSIONS => '0',
> KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '8192', ENCODE_ON_DISK => 'true',
> IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
>
> *Configuration*
> 1 machine having both HBase and Hadoop master
> 2 machines having both region server node and datanode
> total 285 region servers
>
> *Machine Level Optimizations:*
> a) No of file descriptors is 1000000 (ulimit -n gives 1000000)
> b) Increase the read-ahead value to 4096
> c) Added noatime,nodiratime to the disks
>
> *Hadoop Optimizations:*
> dfs.datanode.max.xcievers = 4096
> dfs.block.size = 33554432
> dfs.datanode.handler.count = 256
> io.file.buffer.size = 65536
> hadoop data is split on 4 directories, so that different disks are being
> accessed
>
> *Hbase Optimizations:*
>
> hbase.client.scanner.caching=1  # We have specifically added this, as we
> always return one row.
> hbase.regionserver.handler.count=3200
> hfile.block.cache.size=0.35
> hbase.hregion.memstore.mslab.enabled=true
> hfile.min.blocksize.size=16384
> hfile.min.blocksize.size=4
> hbase.hstore.blockingStoreFiles=200
> hbase.regionserver.optionallogflushinterval=60000
> hbase.hregion.majorcompaction=0
> hbase.hstore.compaction.max=100
> hbase.hstore.compactionThreshold=100
>
> *Hbase-GC*
> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=20 -XX:ParallelGCThreads=16
>
> *Hadoop-GC*
> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>
> -Vibhav
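
To put the quoted figures side by side, here is a rough back-of-the-envelope sketch in Python. It uses only numbers from the message above (3000 queries per second, 10% timeouts, a 50 ms deadline, 2 region server machines, hbase.regionserver.handler.count=3200). Two things are assumptions made for illustration and are not stated in the thread: that mean latency is about equal to the 50 ms deadline, and that load is split evenly across the two machines. The function name is hypothetical.

```python
# Figures taken from the quoted message.
QUERIES_PER_SEC = 3000
TIMEOUT_PERCENT = 10
DEADLINE_MS = 50
REGION_SERVER_MACHINES = 2
HANDLERS_PER_SERVER = 3200  # hbase.regionserver.handler.count


def mean_in_flight(qps, mean_latency_ms):
    """Little's law: mean concurrency = arrival rate * mean time in system."""
    return qps * mean_latency_ms / 1000


# Requests per second that miss the 50 ms deadline.
timed_out_per_sec = QUERIES_PER_SEC * TIMEOUT_PERCENT / 100

# ASSUMPTION: mean latency is roughly the deadline itself. The real mean
# is not given in the thread, so this is only an order-of-magnitude guide.
in_flight = mean_in_flight(QUERIES_PER_SEC, DEADLINE_MS)

# ASSUMPTION: requests are spread evenly over the two machines.
in_flight_per_server = in_flight / REGION_SERVER_MACHINES

# How many configured handlers exist per concurrently served request.
handlers_per_request = HANDLERS_PER_SERVER / in_flight_per_server

print("timed out per second:", timed_out_per_sec)
print("mean requests in flight (cluster):", in_flight)
print("mean requests in flight (per server):", in_flight_per_server)
print("handlers per in-flight request:", round(handlers_per_request, 1))
```

Under those assumptions the configured handler count is far above the estimated number of requests in flight per server, which is one reason the row size, thread dump and GC log asked about above would say more about where the time is going than the handler setting does.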