hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Scan vs map-reduce
Date Mon, 14 Apr 2014 12:09:45 GMT
This might help you: http://phoenix.incubator.apache.org/

JM
On 2014-04-14 07:53, "Li Li" <fancyerii@gmail.com> wrote:

> I need to get about 20,000 rows from a table of about 1,000,000 rows.
> My first version used 20,000 Gets and I found it very slow, so I
> changed it to a scan and filter out the unrelated rows in the client.
> Maybe I should write a coprocessor. By the way, is there a filter
> available for this? Something like the SQL statement
> "where rowkey in ('abc', 'abd' ....)", with a very long IN list.
>
> On Mon, Apr 14, 2014 at 7:46 PM, Jean-Marc Spaggiari
> <jean-marc@spaggiari.org> wrote:
> > Hi Li Li,
> >
> > If you have more than one region, it might be useful. MR will scan all the
> > regions in parallel. If you do a full scan from a client API with no
> > parallelism, then the MR job might be faster. But it will take more
> > resources on the cluster and might impact the SLA of the other clients,
> > if any.
> >
> > JM
> >
> >
> > 2014-04-14 2:42 GMT-04:00 Mohammad Tariq <dontariq@gmail.com>:
> >
> >> Well, it depends. Could you please provide some more details? It will
> >> help us in giving a proper answer.
> >>
> >> Warm Regards,
> >> Tariq
> >> cloudfront.blogspot.com
> >>
> >>
> >> On Mon, Apr 14, 2014 at 11:38 AM, Li Li <fancyerii@gmail.com> wrote:
> >>
> >> > I have a full table scan that takes about 10 minutes. It seems to be a
> >> > bottleneck for our application. If I rewrite it with map-reduce, will
> >> > it be faster?
> >> >
> >>
>
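
For the "very long IN list" question quoted above, one option that needs no coprocessor is to sort the wanted row keys and fetch them in a handful of batched multi-gets rather than 20,000 single Gets. The following is only a minimal sketch of the batching step in plain Java. The names RowKeyBatcher and toBatches and the batch size of 1,000 are made up for illustration and are not part of HBase, and the sketch assumes ASCII row keys so that String order matches HBase's byte order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper (not an HBase class): groups row keys into sorted batches.
final class RowKeyBatcher {

    // Sorts and de-duplicates the keys, then cuts them into batches of at most
    // batchSize keys. Assumes ASCII keys, so String order equals byte order.
    static List<List<String>> toBatches(List<String> rowKeys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> sorted = new ArrayList<>(new TreeSet<>(rowKeys));
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < sorted.size(); start += batchSize) {
            int end = Math.min(start + batchSize, sorted.size());
            batches.add(new ArrayList<>(sorted.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Build 20,000 made-up row keys to mimic the case in this thread.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 20000; i++) {
            keys.add(String.format("row%07d", i * 50));
        }
        List<List<String>> batches = toBatches(keys, 1000);
        // Each batch would become one multi-get round trip with a real client.
        System.out.println(batches.size() + " batches, first key "
                + batches.get(0).get(0));
    }
}
```

With the HBase client, each batch could then be turned into a List<Get> and passed to the table's get(List<Get>) call, which returns one Result per Get. Sorting first keeps the keys of a batch close together in the key space. The right batch size depends on row size and cluster load and would have to be measured.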
