Message-ID: <21490731.25651287630039586.JavaMail.jira@thor>
Date: Wed, 20 Oct 2010 23:00:39 -0400 (EDT)
From: "Stu Hood (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] Issue Comment Edited: (CASSANDRA-1497) Add input support for Hadoop Streaming
In-Reply-To: <11157989.155401284393572710.JavaMail.jira@thor>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/CASSANDRA-1497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923292#action_12923292 ]

Stu Hood edited
comment on CASSANDRA-1497 at 10/20/10 11:00 PM:
----------------------------------------------------------------

contrib/hadoop_streaming_input/bin/mapper.py
* Mentions the original source multiple times, and claims to be both a mapper and reducer
* I suspect that extract_text can be turned into a one-liner somehow

contrib/hadoop_streaming_input/bin/reducer.py
* Needs an Apache header

contrib/hadoop_streaming_input/[input/]README.txt
* Mentions "-input": {{bin/streaming}} should fake the input, and explain why
* There is an extra copy of README.txt in an unused 'input' subdirectory

.../hadoop/ColumnFamilyRecordReader.java
* Indentation

.../hadoop/streaming/AvroResolver.java
* Updated javadoc

I looked a little bit into the immediate runtime failure, but didn't come to any conclusions. One suspicious aspect is that Streaming appears to use the result of Resolver.getInputWriterClass to write to both the mapper and reducer scripts: see http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/contrib/streaming/src/java/org/apache/hadoop/streaming/StreamJob.java?view=markup#l783

> Add input support for Hadoop Streaming
> --------------------------------------
>
>                 Key: CASSANDRA-1497
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1497
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop
>            Reporter: Jeremy Hanna
>            Assignee: Jeremy Hanna
>             Fix For: 0.7.1
>
>         Attachments: 0001-An-updated-avro-based-input-streaming-solution.patch
>
>
> related to CASSANDRA-1368 - create similar functionality for input streaming.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.