hbase-dev mailing list archives

From "Billy Pearson (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
Date Tue, 14 Apr 2009 01:26:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698606#action_12698606 ]

Billy Pearson edited comment on HBASE-1287 at 4/13/09 6:25 PM:
---------------------------------------------------------------

Yeah, that was my mistake. I should have used the partitioner argument in the function, but
I hard-coded HRegionPartitioner for no reason.

The idea behind setting the number of reducers is to lower it when it is set higher than the
number of regions we have. HRegionPartitioner handles the case where the number of reducers
is set lower than the number of regions: it just falls back to the default HashPartitioner
behavior. If it is greater than the number of regions, it will still work, but it wastes
effort launching reducers that will have no work to do.
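
To make the reducer-count reasoning above concrete, here is a minimal sketch of a region-based partition function. It is a simplified model of the behavior described in this comment, not the HRegionPartitioner source: the class name, the String row keys, and the sorted start-key array are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical model of region-based partitioning; not HBase code.
final class RegionPartitionSketch {

    // startKeys is assumed sorted, with "" as the first region's start key.
    // Returns the index of the region whose key range contains rowKey.
    static int regionIndex(String[] startKeys, String rowKey) {
        int idx = Arrays.binarySearch(startKeys, rowKey);
        // Not found: binarySearch returns -(insertionPoint) - 1,
        // and the owning region is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }

    static int partition(String[] startKeys, String rowKey, int numReducers) {
        if (numReducers < startKeys.length) {
            // Fewer reducers than regions: fall back to hash partitioning.
            return (rowKey.hashCode() & Integer.MAX_VALUE) % numReducers;
        }
        // One reducer per region. Any reducer with an index at or above
        // startKeys.length never receives a key, so it is launched for nothing.
        return regionIndex(startKeys, rowKey);
    }

    public static void main(String[] args) {
        String[] startKeys = {"", "g", "p"}; // three hypothetical regions
        for (String key : new String[] {"apple", "grape", "zebra"}) {
            System.out.println(key + " -> reducer " + partition(startKeys, key, 5));
        }
    }
}
```

With three regions and five reducers, only reducer indexes 0 through 2 can ever be returned, which is why capping the reducer count at the region count avoids idle reducers.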

Current code:

{code}
if (partitioner != null) {
  job.setPartitionerClass(HRegionPartitioner.class);
  HTable outputTable = new HTable(new HBaseConfiguration(job), table);
  int regions = outputTable.getRegionsInfo().size();
  if (job.getNumReduceTasks() > regions) {
    job.setNumReduceTasks(outputTable.getRegionsInfo().size());
  }
}
{code}

It should be something like this:

{code}
if (partitioner == HRegionPartitioner.class) {
  job.setPartitionerClass(HRegionPartitioner.class);
  HTable outputTable = new HTable(new HBaseConfiguration(job), table);
  int regions = outputTable.getRegionsInfo().size();
  // Cap the reducer count at the number of regions in the output table.
  if (job.getNumReduceTasks() > regions) {
    job.setNumReduceTasks(regions);
  }
} else if (partitioner != null) {
  // Keep the null check so a missing partitioner leaves the job default in place.
  job.setPartitionerClass(partitioner);
}
{code}
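
The branching in the proposal can be exercised without a cluster. The following is a rough sketch under stated assumptions: JobSettings, RegionPartitioner, and CustomPartitioner are hypothetical stand-ins for JobConf, HRegionPartitioner, and a user-supplied partitioner, and the region count is passed in rather than read from an HTable.

```java
// Hypothetical stand-ins only; none of these types are Hadoop or HBase classes.
final class PartitionerSelectionSketch {

    interface Partitioner {}
    static final class RegionPartitioner implements Partitioner {} // stands in for HRegionPartitioner
    static final class CustomPartitioner implements Partitioner {} // stands in for a user class

    // Minimal substitute for the two JobConf settings the snippet touches.
    static final class JobSettings {
        Class<? extends Partitioner> partitionerClass; // null means "job default"
        int numReduceTasks;

        JobSettings(int numReduceTasks) {
            this.numReduceTasks = numReduceTasks;
        }
    }

    static void configure(JobSettings job,
                          Class<? extends Partitioner> partitioner,
                          int regionCount) {
        if (partitioner == null) {
            return; // nothing requested, leave the job untouched
        }
        // Always honor the class the caller passed in.
        job.partitionerClass = partitioner;
        // Only the region-based partitioner ties reducers to regions,
        // so only then is the reducer count capped.
        if (partitioner == RegionPartitioner.class && job.numReduceTasks > regionCount) {
            job.numReduceTasks = regionCount;
        }
    }

    public static void main(String[] args) {
        JobSettings job = new JobSettings(10);
        configure(job, RegionPartitioner.class, 4);
        System.out.println(job.partitionerClass.getSimpleName() + ", reducers=" + job.numReduceTasks);
    }
}
```

Passing a custom partitioner class leaves the requested reducer count alone, which matches the intent of the else branch in the snippet.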



  
> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
>                 Key: HBASE-1287
>                 URL: https://issues.apache.org/jira/browse/HBASE-1287
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Lars George
>            Assignee: Lars George
>         Attachments: 1287-2.patch, 1287-3-patch.txt, 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code:
> {code}
>   public static void initTableReduceJob(String table,
>     Class<? extends TableReduce> reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>         job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems as though it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> and the provided HRegionPartitioner can be handed in to that call, or a custom one can
> be provided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

