Message-ID: <19004047.1166810903091.JavaMail.jira@brutus>
Date: Fri, 22 Dec 2006 10:08:23 -0800 (PST)
From: "Doug Cutting (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Updated: (HADOOP-628) hadoop hdfs -cat replaces some characters with question marks.
In-Reply-To: <16871.1161664156551.JavaMail.root@brutus>

     [ http://issues.apache.org/jira/browse/HADOOP-628?page=all ]

Doug Cutting updated HADOOP-628:
--------------------------------

    Status: Open  (was: Patch Available)

I like that this patch uses a common loop for copying, but it loses the early-exit feature that 'cat' had. I think copyBytes should check for the error status if the output is a PrintStream. Perhaps something like:

  PrintStream ps = out instanceof PrintStream ? (PrintStream)out : null;
  ...
  while (...) {
    out.write(...);
    if (ps != null && ps.checkError()) {
      throw new IOException(...);
      ...
    }
  }

This means that, when output is piped into another program and that program exits, the 'cat' or 'get' will exit too. So things like 'bin/hadoop fs -cat foo | head' will execute quickly.

> hadoop hdfs -cat replaces some characters with question marks.
> ----------------------------------------------------------------
>
>                 Key: HADOOP-628
>                 URL: http://issues.apache.org/jira/browse/HADOOP-628
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: arkady borkovsky
>         Assigned To: Wendy Chien
>         Attachments: hadoop-628.patch
>
>
> Should not the effect of
>     hadoop hdfs -get path local-file
> and
>     hadoop hdfs -cat path >local-file
> be the same?
> Try to do this with a (hdfs) file that contains non-ascii characters and do a diff.
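
A minimal, self-contained sketch of the copy loop suggested in the comment above may make the idea concrete. It is not the code from hadoop-628.patch: the class name CopyBytesSketch, the copyBytes signature, the buffer size, and the exception message are all assumptions chosen for illustration. The sketch relies on the fact that java.io.PrintStream does not throw IOException from write(); it records the failure internally, so the loop has to poll checkError() to notice that the consumer (for example 'head') has gone away. Copying raw bytes, rather than decoding to characters and printing them, is also what keeps non-ASCII content intact.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

public class CopyBytesSketch {

    /**
     * Copies all bytes from in to out and returns the number of bytes copied.
     * Hypothetical helper: name and signature are assumptions, not Hadoop's API.
     * buffSize is assumed to be greater than zero.
     */
    public static long copyBytes(InputStream in, OutputStream out, int buffSize)
            throws IOException {
        // PrintStream swallows IOExceptions, so keep a typed reference for polling.
        PrintStream ps = out instanceof PrintStream ? (PrintStream) out : null;
        byte[] buf = new byte[buffSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) >= 0) {
            // Raw byte copy: no charset decoding, so no '?' substitution.
            out.write(buf, 0, n);
            total += n;
            // checkError() flushes and reports whether an earlier write failed.
            if (ps != null && ps.checkError()) {
                throw new IOException("Unable to write to output stream.");
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an HDFS input stream; contains a non-ASCII character.
        byte[] data = "caf\u00e9\n".getBytes(StandardCharsets.UTF_8);
        copyBytes(new ByteArrayInputStream(data), System.out, 4096);
    }
}
```

With a check like this, a pipeline whose reader exits early causes the copy to stop at the next write after the failure instead of reading the rest of the file. One trade-off to be aware of: checkError() flushes the stream on every call, so a reasonably large buffer keeps the number of flushes down.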