Date: Mon, 1 Apr 2013 23:29:18 +0530
Subject: Re: Read throughput
From: Vibhav Mundra
To: user@hbase.apache.org

What is the general read throughput that one gets when using HBase? I am not
able to achieve more than 3000 reads/sec with a timeout of 50 ms, and even
then about 10% of the requests time out.

-Vibhav

On Mon, Apr 1, 2013 at 11:20 PM, Vibhav Mundra wrote:

> Yes, I have changed the BLOCK CACHE % to 0.35.
>
> -Vibhav
>
> On Mon, Apr 1, 2013 at 10:20 PM, Ted Yu wrote:
>
>> I was aware of that discussion, which was about MAX_FILESIZE and
>> BLOCKSIZE. My suggestion was about the block cache percentage.
>>
>> Cheers
>>
>> On Mon, Apr 1, 2013 at 4:57 AM, Vibhav Mundra wrote:
>>
>>> I have used the following site to lower the value of the block cache:
>>> http://grokbase.com/t/hbase/user/11bat80x7m/row-get-very-slow
>>>
>>> -Vibhav
>>>
>>> On Mon, Apr 1, 2013 at 4:23 PM, Ted wrote:
>>>
>>>> Can you increase the block cache size?
>>>>
>>>> What version of HBase are you using?
>>>>
>>>> Thanks
>>>>
>>>> On Apr 1, 2013, at 3:47 AM, Vibhav Mundra wrote:
>>>>
>>>>> The typical size of each of my rows is less than 1 KB.
>>>>>
>>>>> Regarding memory, I have given 8 GB to the HBase region servers and
>>>>> 4 GB to the datanodes, and I don't see either fully used, so I ruled
>>>>> out the GC aspect.
>>>>>
>>>>> In case you still believe that GC is an issue, I will upload the GC
>>>>> logs.
>>>>>
>>>>> -Vibhav
>>>>>
>>>>> On Mon, Apr 1, 2013 at 3:46 PM, ramkrishna vasudevan
>>>>> <ramkrishna.s.vasudevan@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> How big is your row? Are they wide rows, and what would be the size
>>>>>> of each cell? How many read threads are being used?
>>>>>>
>>>>>> Were you able to take a thread dump while this was happening? Have
>>>>>> you looked at the GC log? We may need some more information before
>>>>>> we can think about the problem.
>>>>>>
>>>>>> Regards
>>>>>> Ram
>>>>>>
>>>>>> On Mon, Apr 1, 2013 at 3:39 PM, Vibhav Mundra wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I am trying to use HBase for real-time data retrieval with a
>>>>>>> timeout of 50 ms.
>>>>>>>
>>>>>>> I am using 2 machines as datanodes and region servers, and one
>>>>>>> machine as a master for both Hadoop and HBase.
>>>>>>>
>>>>>>> But I am able to fire only 3000 queries per second, and 10% of
>>>>>>> them are timing out. The database has 60 million rows.
>>>>>>>
>>>>>>> Are these figures okay, or am I missing something?
>>>>>>> I have set scanner caching to one, because each query fetches a
>>>>>>> single row only.
>>>>>>>
>>>>>>> Here are the various configurations:
>>>>>>>
>>>>>>> *Our schema*
>>>>>>> {NAME => 'mytable', FAMILIES => [{NAME => 'cf', DATA_BLOCK_ENCODING =>
>>>>>>> 'NONE', BLOOMFILTER => 'ROWCOL', REPLICATION_SCOPE => '0', COMPRESSION =>
>>>>>>> 'GZ', VERSIONS => '1', TTL => '2147483647', MIN_VERSIONS => '0',
>>>>>>> KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '8192', ENCODE_ON_DISK =>
>>>>>>> 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
>>>>>>>
>>>>>>> *Configuration*
>>>>>>> 1 machine running both the HBase and Hadoop masters
>>>>>>> 2 machines each running both a region server and a datanode
>>>>>>> 285 regions in total
>>>>>>>
>>>>>>> *Machine-level optimizations:*
>>>>>>> a) Number of file descriptors is 1000000 (ulimit -n gives 1000000)
>>>>>>> b) Increased the read-ahead value to 4096
>>>>>>> c) Added noatime,nodiratime to the disk mounts
>>>>>>>
>>>>>>> *Hadoop optimizations:*
>>>>>>> dfs.datanode.max.xcievers = 4096
>>>>>>> dfs.block.size = 33554432
>>>>>>> dfs.datanode.handler.count = 256
>>>>>>> io.file.buffer.size = 65536
>>>>>>> Hadoop data is split across 4 directories, so that different disks
>>>>>>> are being accessed.
>>>>>>>
>>>>>>> *HBase optimizations:*
>>>>>>>
>>>>>>> hbase.client.scanner.caching=1  # We have specifically set this,
>>>>>>> as we always return one row.
>>>>>>> hbase.regionserver.handler.count=3200
>>>>>>> hfile.block.cache.size=0.35
>>>>>>> hbase.hregion.memstore.mslab.enabled=true
>>>>>>> hfile.min.blocksize.size=16384
>>>>>>> hfile.min.blocksize.size=4
>>>>>>> hbase.hstore.blockingStoreFiles=200
>>>>>>> hbase.regionserver.optionallogflushinterval=60000
>>>>>>> hbase.hregion.majorcompaction=0
>>>>>>> hbase.hstore.compaction.max=100
>>>>>>> hbase.hstore.compactionThreshold=100
>>>>>>>
>>>>>>> *HBase GC*
>>>>>>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled
>>>>>>> -XX:SurvivorRatio=20 -XX:ParallelGCThreads=16
>>>>>>>
>>>>>>> *Hadoop GC*
>>>>>>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>>>>>>>
>>>>>>> -Vibhav
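On the throughput question at the top of the thread, a quick back-of-the-envelope check is possible using only the figures quoted above (the assumption that each request takes roughly the full 50 ms budget is mine, for illustration):

```python
# Little's law (L = lambda * W): the average number of in-flight requests
# needed to sustain a given throughput at a given per-request latency.
# Figures are the ones quoted in this thread.

def concurrency_needed(qps: float, latency_s: float) -> float:
    """In-flight requests needed to sustain `qps` at `latency_s` seconds each."""
    return qps * latency_s

if __name__ == "__main__":
    # 3000 reads/sec at the 50 ms timeout budget discussed above:
    print(concurrency_needed(3000.0, 0.050))  # ~150 requests in flight
```

If the client cannot keep roughly that many requests in flight, throughput caps out on the client side regardless of region server tuning; 150 is far below the 3200 server-side handlers configured above.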
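The block-cache sizing discussed in the thread can also be sanity-checked numerically. This is a sketch using the poster's figures (~1 KB rows, 60 million rows, two region servers with 8 GB heaps, hfile.block.cache.size=0.35); the assumption of uniformly random single-row reads is mine:

```python
# Compare cluster-wide block cache capacity against the dataset size,
# using the numbers quoted in this thread. The block cache holds
# decompressed blocks, so the comparison is against uncompressed data.

GIB = 1 << 30

def block_cache_bytes(heap_bytes: int, cache_fraction: float) -> float:
    """Block cache available on one region server (hfile.block.cache.size)."""
    return heap_bytes * cache_fraction

region_servers = 2
cache_total = region_servers * block_cache_bytes(8 * GIB, 0.35)  # ~5.6 GiB total
data_total = 60_000_000 * 1024                                   # ~57 GiB of ~1 KB rows

print(f"cache: {cache_total / GIB:.1f} GiB, data: {data_total / GIB:.1f} GiB")
```

With the cache an order of magnitude smaller than the data, uniformly random single-row reads will mostly miss the block cache and go to disk, which is consistent with the observed timeouts; raising the cache fraction above 0.35 only helps if the hot working set is much smaller than the full 60 million rows.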