Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id EDB71406A
    for ; Fri, 17 Jun 2011 18:11:11 +0000 (UTC)
Received: (qmail 27256 invoked by uid 500); 17 Jun 2011 18:11:11 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 27221 invoked by uid 500); 17 Jun 2011 18:11:11 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 27052 invoked by uid 99); 17 Jun 2011 18:11:11 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230)
    by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 17 Jun 2011 18:11:11 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=5.0
    tests=ALL_TRUSTED,T_RP_MATCHES_RCVD
X-Spam-Check-By: apache.org
Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116)
    by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 17 Jun 2011 18:11:09 +0000
Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116])
    by hel.zones.apache.org (Postfix) with ESMTP id 25C2E41EC25
    for ; Fri, 17 Jun 2011 18:10:48 +0000 (UTC)
Date: Fri, 17 Jun 2011 18:10:48 +0000 (UTC)
From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Message-ID: <1447259408.15860.1308334248151.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (CASSANDRA-1473) Implement a Cassandra aware Hadoop mapreduce.Partitioner
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV on apache.org

[
https://issues.apache.org/jira/browse/CASSANDRA-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051231#comment-13051231 ]

Jonathan Ellis commented on CASSANDRA-1473:
-------------------------------------------

This is a Hadoop partitioner, not a Cassandra partitioner.
http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/mapreduce/Partitioner.html

> Implement a Cassandra aware Hadoop mapreduce.Partitioner
> --------------------------------------------------------
>
>                 Key: CASSANDRA-1473
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1473
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop
>            Reporter: Stu Hood
>            Assignee: Patricio Echague
>             Fix For: 1.0
>
>
> When using an IPartitioner that does not sort data in byte order (RandomPartitioner for example) with Cassandra's Hadoop integration, Hadoop is unaware of the output order of the data.
> We can make Hadoop aware of the proper order of the output data by implementing Hadoop's mapreduce.Partitioner interface: then Hadoop will handle sorting all of the data according to Cassandra's IPartitioner, and the writing clients will be able to connect to smaller numbers of Cassandra nodes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
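
For illustration only, here is a minimal sketch of the routing idea described in the issue above. It is not the patch attached to CASSANDRA-1473. It assumes a RandomPartitioner-style token (the absolute value of the key's MD5 digest, in a token space bounded by 2^127) and splits that space evenly across reducers. The class and method names are hypothetical, and the class is kept standalone so it runs without Hadoop on the classpath. In a real job the same logic would presumably live in a subclass of org.apache.hadoop.mapreduce.Partitioner, whose getPartition(key, value, numPartitions) method has the same shape. The sketch covers only the assignment of keys to reducers, not the sort order within a reducer.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class TokenRangePartitionerSketch {
    // Assumed upper bound of the token space: 2^127.
    private static final BigInteger TOKEN_SPACE = BigInteger.ONE.shiftLeft(127);

    /** Assumed token function: absolute value of the MD5 digest read as a signed integer. */
    static BigInteger token(byte[] key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(md5.digest(key)).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Maps a token to a partition so that partition order follows token order. */
    static int partitionForToken(BigInteger token, int numPartitions) {
        int p = token.multiply(BigInteger.valueOf(numPartitions))
                     .divide(TOKEN_SPACE)
                     .intValue();
        // A token equal to 2^127 would land one past the end, so clamp it.
        return Math.min(p, numPartitions - 1);
    }

    /** Same shape as Hadoop's getPartition, minus the value argument. */
    static int getPartition(byte[] key, int numPartitions) {
        return partitionForToken(token(key), numPartitions);
    }

    public static void main(String[] args) {
        for (String k : new String[] {"alpha", "beta", "gamma"}) {
            byte[] key = k.getBytes(StandardCharsets.UTF_8);
            System.out.println(k + " -> reducer " + getPartition(key, 4));
        }
    }
}
```

Because each reducer receives one contiguous token range, its output would map to a contiguous slice of the ring, which is what would let a writing client talk to fewer Cassandra nodes. An even split is a simplification: a real implementation would more likely derive the range boundaries from the cluster's actual token assignments.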