hbase-user mailing list archives

From Pavan Sudheendra <pavan0...@gmail.com>
Subject Question about the time to execute joins in HBase!
Date Thu, 22 Aug 2013 15:25:09 GMT
Hi all,

A serious question. I know this isn't one of the best HBase practices, but
I really want to know.

I am doing a join across 3 tables in HBase. One table contains 19m records,
one contains 2m, and another contains 1m records.

I'm doing this inside the mapper function. I know this can be done with
Pig, Hive, etc. Leaving the specifics out, how long would experts expect
the mapper to take to finish aggregating them on a 6-node cluster? One
node is the job tracker and 5 are task trackers. It takes about an hour
for the map input records count in the MapReduce job status to reach
600,000. That can't be right.
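
To put that rate in perspective, here is a rough back-of-envelope
extrapolation. It is only a sketch: it assumes the 19m-record table is the
map input and that throughput stays constant at the observed 600,000
records per hour, neither of which is confirmed above. The function names
are just for illustration.

```python
# Back-of-envelope extrapolation from the observed map progress.
# Assumptions (not confirmed in the post): the 19m-record table is the
# map input, and throughput stays constant for the whole job.

TOTAL_INPUT_RECORDS = 19_000_000   # largest of the three tables
RECORDS_PER_HOUR = 600_000         # observed map input progress
TASK_TRACKERS = 5                  # worker nodes in the 6-node cluster


def estimated_hours(total_records, records_per_hour):
    """Hours needed to read all input records at a constant rate."""
    return total_records / records_per_hour


def records_per_second(records_per_hour):
    """Convert an hourly record rate to a per-second rate."""
    return records_per_hour / 3600


if __name__ == "__main__":
    hours = estimated_hours(TOTAL_INPUT_RECORDS, RECORDS_PER_HOUR)
    cluster_rate = records_per_second(RECORDS_PER_HOUR)
    per_node_rate = cluster_rate / TASK_TRACKERS
    print(f"Estimated map phase duration: {hours:.1f} hours")
    print(f"Cluster-wide rate: {cluster_rate:.0f} records/s")
    print(f"Per task tracker: {per_node_rate:.0f} records/s")
```

Under those assumptions the map phase would need roughly 32 hours, which
works out to about 167 records per second across the cluster, or around 33
per second on each task tracker.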

Any tips? Please help.


