hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: the scan will be executed parallel if not use coprocessor?
Date Mon, 15 Jul 2013 18:58:19 GMT
The HBase contract guarantees that rows are returned in row order.
That puts limits on what can be done in parallel. For example, one could farm out the requests
to the region servers in parallel, but the client would still have to wait for the rows that
sort first and deliver those to the caller first.
We could add a new scan option that allows rows to be returned out of order; in that
case the client could deliver the rows as they are retrieved.
In that case care must be taken that the parallel scanner behaves correctly when regions have
moved - currently the client scanner knows how far it got in the scan, and just resets from
there; that part would be a bit more tricky in the parallel case.
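
To make the ordering constraint above concrete, here is a minimal sketch in Python. It is a simulation only and does not use the HBase client API: REGIONS, scan_region, parallel_scan_ordered and parallel_scan_unordered are hypothetical names, and the "regions" are plain sorted lists standing in for contiguous key ranges. It assumes regions are listed in start-key order.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import random
import time

# Hypothetical stand-in for a table: each "region" holds a sorted,
# contiguous slice of the row-key space, listed in start-key order.
REGIONS = [
    ["row-01", "row-02", "row-03"],
    ["row-04", "row-05"],
    ["row-06", "row-07", "row-08"],
]


def scan_region(rows):
    """Pretend to scan one region; the sleep simulates uneven server latency."""
    time.sleep(random.uniform(0.0, 0.05))
    return list(rows)


def parallel_scan_ordered(regions):
    """Fetch all regions in parallel, but deliver rows in row order."""
    with ThreadPoolExecutor(max_workers=len(regions)) as pool:
        futures = [pool.submit(scan_region, r) for r in regions]
        # Iterate in region order, not completion order: result() blocks
        # on the earliest region even if later regions finished first.
        for future in futures:
            yield from future.result()


def parallel_scan_unordered(regions):
    """Fetch all regions in parallel and deliver each batch as it arrives."""
    with ThreadPoolExecutor(max_workers=len(regions)) as pool:
        futures = [pool.submit(scan_region, r) for r in regions]
        # Rows inside a batch stay sorted, but batches arrive in any order.
        for future in as_completed(futures):
            yield from future.result()


if __name__ == "__main__":
    print(list(parallel_scan_ordered(REGIONS)))
    print(list(parallel_scan_unordered(REGIONS)))
```

In the ordered variant a single "last row delivered" key is enough to describe how far the scan got. In the unordered variant that is no longer true, so progress would presumably have to be tracked per region, which is one way to read the "more tricky" remark above.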


-- Lars



----- Original Message -----
From: ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com>
To: "user@hbase.apache.org" <user@hbase.apache.org>
Cc: 
Sent: Sunday, July 14, 2013 9:15 PM
Subject: Re: the scan will be executed parallel if not use coprocessor?

HBase does not use a parallel scanning mechanism by default; scans are
sequential.  There are some JIRA issues that try to implement scanning in
parallel across the regions.  HBASE-1935 is one such idea.
Projects like Phoenix use coprocessors to scan the regions in parallel, and
the results are returned to the client.

Regards
Ram


On Mon, Jul 15, 2013 at 7:20 AM, ch huang <justlooks@gmail.com> wrote:

> Phoenix uses coprocessors internally
>
> On Sun, Jul 14, 2013 at 11:15 PM, Asaf Mesika <asaf.mesika@gmail.com>
> wrote:
>
> > To my knowledge, scan is not parallel, hence the speed of queries of
> > Impala, Phoenix, and other similar projects.
> >
> > On Saturday, July 13, 2013, ch huang wrote:
> >
> > > > Hi Ted, for example, I have a table with 10 regions. If my scan
> > > > condition hits data in 8 of those regions, is there a difference
> > > > between using the original scan and using a coprocessor? I know a
> > > > coprocessor can work on each region in parallel, but why would the
> > > > original scan be slower than a coprocessor?
> > >
> > >
> > >
> > > > On Sat, Jul 13, 2013 at 7:36 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >
> > > > Can you clarify your question a little bit ?
> > > >
> > > > > That is, are you expecting a parallel scan within a region boundary
> > > > > or across boundaries?
> > > >
> > > > Cheers
> > > >
> > > > > On Jul 13, 2013, at 1:43 AM, ch huang <justlooks@gmail.com> wrote:
> > > >
> > > > > ATT
> > > >
> > >
> >
> >
> > --
> > Sent from Gmail Mobile
> >
>

