From: "Koji Noguchi (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Date: Fri, 6 Jun 2008 10:21:45 -0700 (PDT)
Subject: [jira] Updated: (HADOOP-3460) SequenceFileAsBinaryOutputFormat
Message-ID: <1513209336.1212772905301.JavaMail.jira@brutus>
In-Reply-To: <1512520752.1212005986397.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

     [ https://issues.apache.org/jira/browse/HADOOP-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated HADOOP-3460:
---------------------------------
    Attachment: HADOOP-3460-part3.patch

bq. 1.
The testcase: doesn't need a main method, you might want to break up the check for forbidding record compression into a separate test,

Separated the test into three: testbinary, testSequenceOutputClassDefaultsToMapRedOutputClass, and testcheckOutputSpecsForbidRecordCompression.
Also, my test had a bug: checkOutputSpecs was throwing an exception because the output path was not set, not because RECORD compression was set. Fixed it.

bq. and the call to JobConf::setInputPath is generating a warning (replace with FileInputFormat::addInputPath)

Ah. I should have compiled with -Djavac.args="-Xlint -Xmaxwarns 1000". Done.

bq. 2. WritableValueBytes::writeCompressedBytes no longer throws IllegalArgumentException, so that can be removed from its signature

I left it in since the original SequenceFile.ValueBytes has the signature
{noformat}
public void writeCompressedBytes(DataOutputStream outStream)
  throws IllegalArgumentException, IOException;
{noformat}
Should I still take it out?

bq. 3. SeqFABOF::checkOutputSpecs doesn't need to list InvalidJobConfException

Done.

> SequenceFileAsBinaryOutputFormat
> --------------------------------
>
>                 Key: HADOOP-3460
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3460
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Minor
>         Attachments: HADOOP-3460-part1.patch, HADOOP-3460-part2.patch, HADOOP-3460-part3.patch
>
>
> Add an OutputFormat to write raw bytes as keys and values to a SequenceFile.
> In C++-Pipes, we're using SequenceFileAsBinaryInputFormat to read SequenceFiles.
> However, we currently don't have a way to *write* a SequenceFile efficiently without going through extra (de)serializations.
> I'd like to store the correct class names for keys/values but use BytesWritable to write
> (in order for the next Java or Pig code to be able to read this SequenceFile).

-- 
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
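
On point 2 (whether IllegalArgumentException can be dropped from the signature), here is a minimal, self-contained Java sketch. It is not Hadoop code. IllegalArgumentException is an unchecked exception, so an implementing method may omit it from its throws clause even when the interface method lists it, and callers are unaffected. The names ThrowsDemo, ValueBytesLike, RawValue, and roundTrip are hypothetical stand-ins; only the writeCompressedBytes signature is taken from the message above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class ThrowsDemo {
    // Hypothetical stand-in for SequenceFile.ValueBytes; only this
    // method signature is copied from the message.
    interface ValueBytesLike {
        void writeCompressedBytes(DataOutputStream outStream)
            throws IllegalArgumentException, IOException;
    }

    // Hypothetical implementation with a narrower throws clause:
    // the unchecked IllegalArgumentException is simply not listed.
    static class RawValue implements ValueBytesLike {
        private final byte[] data;

        RawValue(byte[] data) {
            this.data = data;
        }

        @Override
        public void writeCompressedBytes(DataOutputStream outStream) throws IOException {
            outStream.write(data);
        }
    }

    // Writes the bytes through the interface type and returns what was written.
    public static byte[] roundTrip(byte[] data) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        ValueBytesLike value = new RawValue(data);
        value.writeCompressedBytes(new DataOutputStream(buf));
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(new byte[] {1, 2, 3}).length);
    }
}
```

Under these assumptions the sketch compiles either way, so keeping or removing the exception from the signature is a readability choice rather than a compatibility one.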