Mailing-List: contact hbase-dev-help@hadoop.apache.org; run by ezmlm
Reply-To: hbase-dev@hadoop.apache.org
Message-ID: <1952290151.1237989652269.JavaMail.jira@brutus>
Date: Wed, 25 Mar 2009 07:00:52 -0700 (PDT)
From: "Lars George (JIRA)"
To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-1287) Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
In-Reply-To: <1703939054.1237983172521.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

[ 
https://issues.apache.org/jira/browse/HBASE-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689115#action_12689115 ]

Lars George commented on HBASE-1287:
------------------------------------

I was wondering about this too, Michael, and was about to ask. I am not sure who added it or why (I have yet to figure out how to search the file history for a specific change, as opposed to browsing the revisions one by one).

Anyhow, I do a similar thing in my main MR program, where I compute the number of mappers. Why this code is here, and why it is executed only when a custom partitioner is given, eludes me. Either you do it always, or never, or you have an extra flag that triggers it. It also sets the upper limit only when the given number of reducers is higher than the actual number of regions. Why, though? I am using this only to compute the number of mappers so that I do not have to specify it on the command line.

All in all, I would drop the whole thing if there is no reason to keep it and instead ask the caller to set the boundaries. Or add helper methods that compute these on demand, for example

{code}
public static void setNumMapTasks(String table, JobConf job) {...}
public static void setNumReduceTasks(String table, JobConf job) {...}
{code}

where both simply do the same thing: get the number of regions and assign that value to the respective field in the job configuration.
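A minimal sketch of the two helpers proposed above, with the current cap-only behaviour next to them for comparison. It is an illustration under stated assumptions, not HBase code: `RegionTaskCounts`, `JobSettings`, and `limitNumReduceTasks` are hypothetical names, `JobSettings` stands in for the task-count fields of `JobConf`, and an `IntSupplier` replaces the region lookup on the table so the example runs without a cluster.

```java
import java.util.function.IntSupplier;

/** Illustrative sketch only; none of these types exist in HBase or Hadoop. */
final class RegionTaskCounts {

    /** Hypothetical stand-in for JobConf: holds only the two task counts. */
    static final class JobSettings {
        private int numMapTasks = 1;
        private int numReduceTasks = 1;

        int getNumMapTasks() { return numMapTasks; }
        void setNumMapTasks(int n) { numMapTasks = n; }
        int getNumReduceTasks() { return numReduceTasks; }
        void setNumReduceTasks(int n) { numReduceTasks = n; }
    }

    private RegionTaskCounts() {}

    // In each method, regionCount stands in for the lookup used in the
    // quoted issue code: outputTable.getRegionsInfo().size()

    /** Proposed helper: always set the map task count to the region count. */
    static void setNumMapTasks(IntSupplier regionCount, JobSettings job) {
        job.setNumMapTasks(regionCount.getAsInt());
    }

    /** Proposed helper: always set the reduce task count to the region count. */
    static void setNumReduceTasks(IntSupplier regionCount, JobSettings job) {
        job.setNumReduceTasks(regionCount.getAsInt());
    }

    /** Current behaviour: lower the reduce task count only if it exceeds the region count. */
    static void limitNumReduceTasks(IntSupplier regionCount, JobSettings job) {
        int regions = regionCount.getAsInt();
        if (job.getNumReduceTasks() > regions) {
            job.setNumReduceTasks(regions);
        }
    }

    public static void main(String[] args) {
        JobSettings job = new JobSettings();
        IntSupplier regions = () -> 4; // pretend the table has 4 regions

        setNumMapTasks(regions, job);      // mappers follow the region count
        job.setNumReduceTasks(10);         // caller asks for too many reducers
        limitNumReduceTasks(regions, job); // capped at the region count

        System.out.println(job.getNumMapTasks() + " mappers, "
            + job.getNumReduceTasks() + " reducers");
    }
}
```

Keeping the unconditional setters separate from the cap makes the choice explicit at the call site, so it no longer depends on whether a partitioner was passed in.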
> Partitioner class not used in TableMapReduceUtil.initTableReduceJob()
> ---------------------------------------------------------------------
>
> Key: HBASE-1287
> URL: https://issues.apache.org/jira/browse/HBASE-1287
> Project: Hadoop HBase
> Issue Type: Bug
> Components: mapred
> Reporter: Lars George
> Assignee: Lars George
> Attachments: 1287.patch
>
>
> Upon checking the available utility methods in TableMapReduceUtil I came across this code
> {code}
>   public static void initTableReduceJob(String table,
>     Class reducer, JobConf job, Class partitioner)
>   throws IOException {
>     job.setOutputFormat(TableOutputFormat.class);
>     job.setReducerClass(reducer);
>     job.set(TableOutputFormat.OUTPUT_TABLE, table);
>     job.setOutputKeyClass(ImmutableBytesWritable.class);
>     job.setOutputValueClass(BatchUpdate.class);
>     if (partitioner != null) {
>       job.setPartitionerClass(HRegionPartitioner.class);
>       HTable outputTable = new HTable(new HBaseConfiguration(job), table);
>       int regions = outputTable.getRegionsInfo().size();
>       if (job.getNumReduceTasks() > regions){
>         job.setNumReduceTasks(outputTable.getRegionsInfo().size());
>       }
>     }
>   }
> {code}
> It seems as though it should be
> {code}
>     if (partitioner != null) {
>       job.setPartitionerClass(partitioner);
> {code}
> so that either the provided HRegionPartitioner or a custom partitioner can be passed in to that call.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.