Message-ID: <30768597.1175878892870.JavaMail.jira@brutus>
Date: Fri, 6 Apr 2007 10:01:32 -0700 (PDT)
From: "Owen O'Malley (JIRA)"
To: hadoop-dev@lucene.apache.org
Reply-To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-819) LineRecordWriter should not always insert tab char between key and value
In-Reply-To: <9893078.1165962805706.JavaMail.jira@brutus>
Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm

    [ https://issues.apache.org/jira/browse/HADOOP-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12487278 ]

Owen O'Malley commented on HADOOP-819:
--------------------------------------

You can easily read the file into a String with TestMiniMRWithDFS.readOutput(Path,JobConf) and use assertEquals to compare the output with the expected value.

> LineRecordWriter should not always insert tab char between key and value
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-819
>                 URL: https://issues.apache.org/jira/browse/HADOOP-819
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Runping Qi
>         Assigned To: Runping Qi
>         Attachments: patch-819.txt
>
>
> With the current implementation of LineRecordWriter in TextOutputFormat, the client cannot pass a null key or value to the write function, and a tab char is always inserted between the key and value. This works fine most of the time. However, in some
> cases, one just does not want to have the extra tab char. A common example: if I need to implement a utility similar
> to the unix sort with some fields in the lines as the sort key, I can have my map extract the sort key from each line and pass the whole line as the value. The reducer just outputs the values and ignores the keys. However, if I use TextOutputFormat, my output will have an extra tab char in each of the lines, which is annoying.
> A simple solution is to let the write function of LineRecordWriter accept a null key argument, and write out only the value if the key is null.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
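
For illustration, the behavior proposed in the issue can be sketched in plain Java without any Hadoop classes. This is only a sketch under stated assumptions, not the attached patch: the class name LineWriterSketch and the method formatRecord are hypothetical, and treating a null value symmetrically (key only, no tab) and emitting nothing when both are null are assumptions that go beyond what the issue text asks for. The main method mirrors the testing suggestion in the comment by building the whole output as one String and comparing it with the expected text.

```java
// Hypothetical stand-in for the record-formatting logic; not Hadoop's LineRecordWriter.
final class LineWriterSketch {

    private LineWriterSketch() {}

    // Formats one output line. The tab is written only when both key and value are present.
    static String formatRecord(Object key, Object value) {
        boolean hasKey = key != null;
        boolean hasValue = value != null;
        if (!hasKey && !hasValue) {
            return ""; // assumption: nothing is written for an empty record
        }
        StringBuilder line = new StringBuilder();
        if (hasKey) {
            line.append(key);
        }
        if (hasKey && hasValue) {
            line.append('\t');
        }
        if (hasValue) {
            line.append(value);
        }
        return line.append('\n').toString();
    }

    public static void main(String[] args) {
        // Collect the whole output as one String, then compare it with the expected text.
        StringBuilder output = new StringBuilder();
        output.append(formatRecord("k1", "first line"));
        output.append(formatRecord(null, "second line")); // null key: value only
        String expected = "k1\tfirst line\nsecond line\n";
        if (!expected.equals(output.toString())) {
            throw new AssertionError("unexpected output: " + output);
        }
        System.out.println("output matches");
    }
}
```

With a null key, the sort-like use case from the issue would produce each input line unchanged, with no leading tab.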