hbase-user mailing list archives

From Kiru Pakkirisamy <kirupakkiris...@yahoo.com>
Subject Re: Client Get vs Coprocessor scan performance
Date Sun, 18 Aug 2013 05:34:13 GMT
Ted,
On a table with 600K rows, Get'ting 100 rows seems to be faster than a scan with the
FuzzyRowFilter (with the mask covering the whole length of the key). I thought the
FuzzyRowFilter's SEEK_NEXT_USING_HINT would help. All of this is on the client side. Based on
that client-side performance, I have not changed my coprocessor to use the FuzzyRowFilter
(it is still doing multiple Gets inside the coprocessor). Also, I am seeing very bad
concurrent query performance. Is there anything that would make coprocessors almost
single-threaded across multiple invocations?
Again, all this is after moving to 0.94.10 (for HBASE-6870's sake), which seems to be very
good at bringing the regions online fast and balanced. Thanks, much appreciated.
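
To make the comparison concrete, here is a minimal sketch of the mask semantics, in plain Java with no HBase dependency. It assumes the 0.94 FuzzyRowFilter convention that a mask byte of 0 marks a fixed position and 1 marks a position that may hold any byte. The class and method names are hypothetical, and this is not the filter's actual implementation, which also computes the next-row seek hint.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of HBase.
class FuzzyMatchSketch {

    /**
     * Returns true when every fixed position (mask byte 0) of the pattern
     * equals the byte at the same position in the row key.
     * Positions with mask byte 1 accept any byte.
     */
    static boolean matches(byte[] rowKey, byte[] pattern, byte[] mask) {
        if (pattern.length != mask.length) {
            throw new IllegalArgumentException("pattern and mask must have the same length");
        }
        if (rowKey.length < pattern.length) {
            return false; // simplification: shorter keys never match in this sketch
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && rowKey[i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] pattern = bytes("user0042");
        byte[] allFixed = new byte[pattern.length];          // every position fixed
        byte[] lastFourFuzzy = {0, 0, 0, 0, 1, 1, 1, 1};     // "user" fixed, rest free

        System.out.println(matches(bytes("user0042"), pattern, allFixed));
        System.out.println(matches(bytes("user0043"), pattern, allFixed));
        System.out.println(matches(bytes("user9999"), pattern, lastFourFuzzy));
    }
}
```

If the mask fixes every byte of the key, each pattern can match at most one row, so the scan is logically a set of point lookups that still goes through the scanner path. That may be part of why a batch of Gets compares well here, though the thread does not confirm the cause.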
 
Regards,
- kiru


Kiru Pakkirisamy | webcloudtech.wordpress.com


________________________________
 From: Ted Yu <yuzhihong@gmail.com>
To: "user@hbase.apache.org" <user@hbase.apache.org> 
Sent: Saturday, August 17, 2013 4:19 PM
Subject: Re: Client Get vs Coprocessor scan performance
 

HBASE-6870 targeted whole table scanning for each coprocessorService call
which exhibited itself through:

HTable#coprocessorService -> getStartKeysInRange -> getStartEndKeys ->
getRegionLocations -> MetaScanner.allTableRegions(getConfiguration(),
getTableName(), false)

With the fix, the cached region locations in HConnectionImplementation are used instead.
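
The effect of caching can be sketched in plain Java, without any HBase classes. This is an illustration under stated assumptions, not the HBase client code: the class name, the `metaScan` supplier standing in for the .META. lookup, and the use of String keys in place of byte[] row keys are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Supplier;

// Hypothetical illustration only; not the HBase client implementation.
class RegionStartKeyCacheSketch {
    private final Supplier<List<String>> metaScan; // stand-in for the expensive region listing
    private TreeSet<String> cachedStartKeys;       // filled on first use, then reused
    int metaScanCount = 0;                         // how often the expensive listing ran

    RegionStartKeyCacheSketch(Supplier<List<String>> metaScan) {
        this.metaScan = metaScan;
    }

    /** Start keys of the regions overlapping [start, end], served from the cache. */
    List<String> startKeysInRange(String start, String end) {
        if (cachedStartKeys == null) {
            cachedStartKeys = new TreeSet<>(metaScan.get());
            cachedStartKeys.add(""); // the first region of a table has an empty start key
            metaScanCount++;
        }
        if (start.compareTo(end) > 0) {
            return new ArrayList<>();
        }
        // The region holding `start` is the one with the greatest start key <= start.
        String first = cachedStartKeys.floor(start);
        return new ArrayList<>(cachedStartKeys.subSet(first, true, end, true));
    }
}
```

Calling `startKeysInRange` repeatedly on one instance runs the listing only once, which is the per-call cost the fix is described as removing.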

Cheers


On Sat, Aug 17, 2013 at 2:21 PM, Asaf Mesika <asaf.mesika@gmail.com> wrote:

> Ted, can you elaborate a little bit on why this issue boosts performance?
> I couldn't figure out from the issue comments whether execCoprocessor scans
> the entire .META. table or an entire table, to understand the actual
> improvement.
>
> Thanks!
>
>
>
>
> On Fri, Aug 9, 2013 at 8:44 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > I think you need HBASE-6870 which went into 0.94.8
> >
> > Upgrading should boost coprocessor performance.
> >
> > Cheers
> >
> > On Aug 8, 2013, at 10:21 PM, Kiru Pakkirisamy <kirupakkirisamy@yahoo.com> wrote:
> >
> > > Ted,
> > > Here is the method signature/protocol
> > > public Map<String, Double> getFoo(Map<String, Double> input,
> > > int topN) throws IOException;
> > >
> > > There are 31 regions on 4 nodes X 8 CPU.
> > > I am on 0.94.6 (from Hortonworks).
> > > I think it behaves like what linwukang says: it is almost a
> > > full table scan in the coprocessor.
> > > Actually, when I set more specific ColumnPrefixFilters, performance went
> > > down.
> > > I want to do things on the server side because I don't want to be
> > > sending 500K column/values to the client.
> > > I cannot believe a single-threaded client which does some calculations
> > > and a group-by beats the coprocessor running in 31 regions.
> > >
> > > Regards,
> > > - kiru
> > >
> > >
> > > Kiru Pakkirisamy | webcloudtech.wordpress.com
> > >
> > >
> > > ________________________________
> > > From: Ted Yu <yuzhihong@gmail.com>
> > > To: user@hbase.apache.org; Kiru Pakkirisamy <kirupakkirisamy@yahoo.com>
> > > Sent: Thursday, August 8, 2013 8:40 PM
> > > Subject: Re: Client Get vs Coprocessor scan performance
> > >
> > >
> > > Can you give us a bit more information ?
> > >
> > > How do you deliver the 55 rowkeys to your endpoint ?
> > > How many regions do you have for this table ?
> > >
> > > What HBase version are you using ?
> > >
> > > Thanks
> > >
> > > On Thu, Aug 8, 2013 at 6:43 PM, Kiru Pakkirisamy
> > > <kirupakkirisamy@yahoo.com> wrote:
> > >
> > >> Hi,
> > >> I am finding an odd behavior with the Coprocessor performance lagging a
> > >> client side Get.
> > >> I have a table with 500000 rows. Each has a variable number of columns in
> > >> one column family (in this case about 600000 columns in total are
> > >> processed).
> > >> When I try to get 55 specific rows, the client side completes in half the
> > >> time of the coprocessor endpoint.
> > >> I am using 55 RowFilters on the Coprocessor scan side. The rows are
> > >> processed in exactly the same way in both cases.
> > >> Any pointers on how to debug this scenario?
> > >>
> > >> Regards,
> > >> - kiru
> > >>
> > >>
> > >> Kiru Pakkirisamy | webcloudtech.wordpress.com
> >
>