hbase-user mailing list archives

From Panayotis Antonopoulos <antonopoulos...@hotmail.com>
Subject HFiles created by MR Jobs and HBase Performance
Date Tue, 17 May 2011 12:16:39 GMT

I am writing an MR job in which each reducer outputs one HFile containing a subset of the
rows of the table to be created.
At first I thought of using the HashPartitioner to achieve load balancing, but this would mix
the rows, so the output of each reducer would not be a contiguous part of the HBase table that
will be created by combining all these files.
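
To make the concern concrete, here is a minimal Python sketch (not Hadoop code) that contrasts
the two strategies. The `(hash & MAX_INT) % numReduceTasks` formula is the one HashPartitioner
uses; the 31-based byte hash, the `rowNNNN` keys and the three split points are assumptions
chosen purely for illustration.

```python
from bisect import bisect_right

def hash_bytes(key):
    # Assumed stand-in for a Java-style 31-based byte hash, kept to 32 bits.
    h = 1
    for b in key.encode("ascii"):
        h = (31 * h + b) & 0xFFFFFFFF
    return h

def hash_partition(key, num_reducers):
    # HashPartitioner formula: (hash & Integer.MAX_VALUE) % numReduceTasks
    return (hash_bytes(key) & 0x7FFFFFFF) % num_reducers

def range_partition(key, split_points):
    # Total-order idea: keys >= split_points[i] go to reducer i + 1.
    return bisect_right(split_points, key)

def key_ranges(keys, assign, num_reducers):
    # (first_key, last_key) of what each reducer would write to its file.
    buckets = [[] for _ in range(num_reducers)]
    for k in keys:
        buckets[assign(k)].append(k)
    return [(min(b), max(b)) for b in buckets if b]

def ranges_overlap(ranges):
    # True if any two key ranges intersect.
    ordered = sorted(ranges)
    return any(prev[1] >= cur[0] for prev, cur in zip(ordered, ordered[1:]))

keys = ["row%04d" % i for i in range(1000)]   # hypothetical row keys
split_points = ["row0250", "row0500", "row0750"]  # hypothetical splits

hash_ranges = key_ranges(keys, lambda k: hash_partition(k, 4), 4)
range_ranges = key_ranges(keys, lambda k: range_partition(k, split_points), 4)

print("hash ranges overlap: ", ranges_overlap(hash_ranges))
print("range ranges overlap:", ranges_overlap(range_ranges))
```

With hash partitioning every reducer's file spans almost the whole key space, while with
range partitioning each file covers one disjoint slice.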

So I would like to ask whether it is important to use a Partitioner (TotalOrderPartitioner,
for example) that gives each reducer a contiguous part of the table.

If I do not do that, will it hurt HBase performance when executing queries or running
compactions, since rows that are supposed to be adjacent will end up in different HFiles
and the number of disk seeks will increase?
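
As a rough way to picture the seek cost, the following Python sketch counts how many files a
short scan would have to consult, given only each file's first and last row key. It is a
simplified model under stated assumptions: the key ranges are made up, and it ignores Bloom
filters, caching and the merging that compactions perform later.

```python
def files_touched(file_ranges, start_row, stop_row):
    # Count files whose [first_key, last_key] range intersects [start_row, stop_row).
    return sum(
        1
        for first, last in file_ranges
        if first < stop_row and last >= start_row
    )

# Hypothetical layouts for four files of a 1000-row table.
contiguous = [
    ("row0000", "row0249"),
    ("row0250", "row0499"),
    ("row0500", "row0749"),
    ("row0750", "row0999"),
]
interleaved = [
    ("row0000", "row0996"),
    ("row0001", "row0997"),
    ("row0002", "row0998"),
    ("row0003", "row0999"),
]

print(files_touched(contiguous, "row0100", "row0110"))   # 1
print(files_touched(interleaved, "row0100", "row0110"))  # 4
```

In this model a ten-row scan reads one file when the files are contiguous, but has to look
into all four when their key ranges interleave.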

Thank you for your help!