hbase-user mailing list archives

From Pavan Sudheendra <pavan0...@gmail.com>
Subject Re: Question about the time to execute joins in HBase!
Date Thu, 22 Aug 2013 15:41:16 GMT
scan.setCaching(500);

I don't really understand what this setting is for, though..
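
As far as I can tell, the value is the number of rows the scanner fetches per round trip to the region server, so it mainly changes how many RPCs a full scan needs. A rough back-of-the-envelope sketch in plain Java (no HBase dependency; the class and method names are made up for illustration, and the ceiling division is only an approximation of what the client actually does):

```java
public class ScanCachingEstimate {

    // Approximate scanner round trips for a full scan: ceil(rows / caching).
    // Assumption: one RPC returns at most `caching` rows; real clients may
    // issue an extra call or two (open/close, end-of-scan detection).
    public static long roundTrips(long rows, int caching) {
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 19000000L; // the largest table mentioned in this thread
        int[] cachingValues = {1, 100, 500, 5000};
        for (int caching : cachingValues) {
            System.out.println("caching=" + caching
                    + " -> ~" + roundTrips(rows, caching) + " round trips");
        }
    }
}
```

So with 500, scanning the 19m-row table should take on the order of 38,000 round trips rather than millions. It says nothing about the per-row lookups I do against the other two tables inside the mapper.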


On Thu, Aug 22, 2013 at 9:09 PM, Kevin O'dell <kevin.odell@cloudera.com> wrote:

> QQ what is your caching set to?
> On Aug 22, 2013 11:25 AM, "Pavan Sudheendra" <pavan0591@gmail.com> wrote:
>
> > Hi all,
> >
> > A serious question.. I know this isn't one of the best HBase practices,
> > but I really want to know..
> >
> > I am doing a join across 3 tables in HBase.. One table contains 19m
> > records, one contains 2m and another contains 1m records.
> >
> > I'm doing this inside the mapper function.. I know this can be done with
> > pig and hive etc. Leaving the specifics out, how long would experts think
> > it would take for the mapper to finish aggregating them across a 6 node
> > cluster.. One is the job tracker and 5 are task trackers.. By the time I
> > see the map reduce job status for input records reach 600,000 it's taking
> > an hour.. It can't be right..
> >
> > Any tips? Please help.
> >
> > Thanks.
> >
> > --
> > Regards-
> > Pavan
> >
>



-- 
Regards-
Pavan
