hbase-user mailing list archives

From Asaf Mesika <asaf.mes...@gmail.com>
Subject Re: Read thruput
Date Thu, 04 Apr 2013 04:21:41 GMT
Can you possibly batch some of the Get calls into a Scan with a Filter that
contains the list of row keys you need?
For example, if you have 100 Gets, you can create a start key and an end key
by taking the min and max of those 100 row keys. Next, you need
to write a filter which saves these 100 row keys in a private member and
uses the hint method in the Filter interface to jump to the closest row key
in the region it scans.

If you need help with that I can add a more detailed description of that
Filter.

This should remove most of the heavyweight per-request processing overhead of
each Get.
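
A minimal sketch of the lookup logic such a filter could use, in plain Java
with no HBase dependencies so it runs on its own. The class RowKeyHintSketch
and all of its methods are hypothetical names for illustration, not part of
HBase. The assumption is that a real filter would call include() from its
filterKeyValue method, return ReturnCode.SEEK_NEXT_USING_HINT when the row is
not wanted, and build the KeyValue returned by getNextKeyHint from nextHint();
check that wiring against the Filter API of your HBase version. Note that the
Scan stop row is exclusive, so the end key has to sort just after the max row
key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: models the seek-hint logic of a row-key-list filter. */
class RowKeyHintSketch {
    // Unsigned lexicographic byte order, which is how HBase sorts row keys.
    static final Comparator<byte[]> ROW_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    };

    // The "private member" holding the wanted row keys, kept sorted.
    private final TreeSet<byte[]> wanted = new TreeSet<>(ROW_ORDER);

    RowKeyHintSketch(List<byte[]> rowKeys) {
        wanted.addAll(rowKeys);
    }

    /** Scan start row: the smallest wanted key (throws if the list was empty). */
    byte[] startRow() {
        return wanted.first();
    }

    /** Scan stop row: the largest wanted key plus a trailing zero byte,
     *  because the stop row is exclusive. */
    byte[] stopRowExclusive() {
        byte[] max = wanted.last();
        return Arrays.copyOf(max, max.length + 1);
    }

    /** True if the row the scanner is positioned on is one we asked for. */
    boolean include(byte[] currentRow) {
        return wanted.contains(currentRow);
    }

    /** Smallest wanted key at or after currentRow, or null if none is left. */
    byte[] nextHint(byte[] currentRow) {
        return wanted.ceiling(currentRow);
    }

    static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        RowKeyHintSketch f = new RowKeyHintSketch(
                Arrays.asList(b("row-17"), b("row-03"), b("row-42")));
        System.out.println("start = " + new String(f.startRow(), StandardCharsets.UTF_8));
        byte[] hint = f.nextHint(b("row-10"));
        System.out.println("hint  = " + new String(hint, StandardCharsets.UTF_8));
        System.out.println("done  = " + (f.nextHint(b("row-50")) == null));
    }
}
```

When nextHint() returns null there are no wanted rows left in the range, so
the filter can stop the scan early instead of reading up to the stop row.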

On Tuesday, April 2, 2013, Vibhav Mundra wrote:

> What does your client call look like? Get? Scan? Filters?
> --My client keeps issuing Get requests, each fetching a single row.
> Essentially we are using HBase for key-value retrieval.
>
> Is 3000/sec client-side calls, or is it the number of rows per sec?
> --3000/sec is the number of client-side calls.
>
> If you measure in MB/sec how much read throughput do you get?
> --Each client response is at most 1 KB, so the throughput is at most about
> 3 MB/sec (3000 * 1 KB).
>
> Where is your client located? Same router as the cluster?
> --It is routed on the same cluster, on the same subnet.
>
> Have you activated dfs read short circuit? If not, try it.
> --I have not tried this. Let me try this also.
>
> Compression - try switching to Snappy - should be faster.
> What else is running on the cluster parallel to your reading client?
> --There is a small upload job running. I have never seen the CPU usage above
> 5%, so I didn't bother to look at this angle.
>
> -Vibhav
>
>
> On Tue, Apr 2, 2013 at 1:42 AM, Asaf Mesika <asaf.mesika@gmail.com> wrote:
>
> > What does your client call look like? Get? Scan? Filters?
> > Is 3000/sec client-side calls, or is it the number of rows per sec?
> > If you measure in MB/sec how much read throughput do you get?
> > Where is your client located? Same router as the cluster?
> > Have you activated dfs read short circuit? If not, try it.
> > Compression - try switching to Snappy - should be faster.
> > What else is running on the cluster parallel to your reading client?
> >
> > On Monday, April 1, 2013, Vibhav Mundra wrote:
> >
> > > What is the general read throughput that one gets when using HBase?
> > >
> > > I am not able to achieve more than 3000/sec with a timeout of 50
> > > milliseconds.
> > > Even at this rate, 10% of the requests are timing out.
> > >
> > > -Vibhav
> > >
> > >
> > > On Mon, Apr 1, 2013 at 11:20 PM, Vibhav Mundra <mundra@gmail.com> wrote:
> > >
> > > > Yes, I have changed the BLOCK CACHE % to 0.35.
> > > >
> > > > -Vibhav
> > > >
> > > >
> > > > On Mon, Apr 1, 2013 at 10:20 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > >> I was aware of that discussion which was about MAX_FILESIZE and BLOCKSIZE
> > > >>
> > > >> My suggestion was about block cache percentage.
> > > >>
> > > >> Cheers
> > > >>
> > > >>
> > > >> On Mon, Apr 1, 2013 at 4:57 AM, Vibhav Mundra <mundra@gmail.com> wrote:
> > > >>
> > > >> > I have used the following site:
> > > >> > http://grokbase.com/t/hbase/user/11bat80x7m/row-get-very-slow
> > > >> >
> > > >> > to lessen the value of block cache.
> > > >> >
> > > >> > -Vibhav
> > > >> >
> > > >> >
> > > >> > On Mon, Apr 1, 2013 at 4:23 PM, Ted <yuzhihong@gmail.com> wrote:
> > > >> >
> > > >> > > Can you increase block cache size ?
> > > >> > >
> > > >> > > What version of hbase are you using ?
> > > >> > >
> > > >> > > Thanks
> > > >> > >
> > > >> > > On Apr 1, 2013, at 3:47 AM, Vibhav Mundra <mundra@gmail.com> wrote:
> > > >> > >
> > > >> > > > The typical size of each of my rows is less than 1 KB.
> > > >> > > >
> > > >> > > > Regarding the memory, I have used 8 GB for HBase regionservers and
> > > >> > > > 4 GB for datanodes, and I don't see them completely used. So I
> > > >> > > > ruled out the GC aspect.
> > > >> > > >
> > > >> > > > In case you still believe that GC is an issue, I will upload the
> > > >> > > > GC logs.
> > > >> > > >
> > > >> > > > -Vibhav
> > > >> > > >
> > > >> > > >
> > > >> > > > On Mon, Apr 1, 2013 at 3:46 PM, ramkrishna vasudevan
> > > >> > > > <ramkrishna.s.vasudevan@gmail.com> wrote:
> > > >> > > >
> > > >> > > >> Hi
> > > >> > > >>
> > > >> > > >> How big is your row?  Are they wider rows and what would be the
> > > >> > > >> size of every cell?
> > > >> > > >> How many read threads are getting used?
> > > >> > > >>
> > > >> > > >>
> > > >
