From: "eric baldeschwieler (JIRA)"
To: hadoop-dev@lucene.apache.org
Date: Tue, 21 Nov 2006 15:51:03 -0800 (PST)
Subject: [jira] Commented: (HADOOP-738) dfs get or copyToLocal should not copy crc file

    [ http://issues.apache.org/jira/browse/HADOOP-738?page=comments#action_12451824 ]

eric baldeschwieler commented on HADOOP-738:
--------------------------------------------

The issues you raise are real when you are moving multi-terabyte search indexes and such. Having end-to-end CRC support is crucial, and I'm all for keeping it in.

But yes, our average user is not "sophisticated": the typical operation to local disk in our environment is just pulling some experiment results down locally. In general, modern machines can read and write megabyte- to gigabyte-sized files reliably enough that CRC errors are not a dominant concern.

Where they are a concern, I suggest we make the CRC files visible. Rather than using "dot" files, just append .crc to the file name. Then at least folks can see them and ask the right questions. Things will work better that way.

> dfs get or copyToLocal should not copy crc file
> -----------------------------------------------
>
>                 Key: HADOOP-738
>                 URL: http://issues.apache.org/jira/browse/HADOOP-738
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.8.0
>        Environment: all
>           Reporter: Milind Bhandarkar
>        Assigned To: Milind Bhandarkar
>            Fix For: 0.9.0
>        Attachments: hadoop-crc.patch
>
>
> Currently, when we -get or -copyToLocal a directory from DFS, all of the files, including the crc files, are copied. When we then -put or -copyFromLocal that directory again, the put fails because the crc files already exist on DFS. The solution is not to copy checksum files when copying to local. Patch is forthcoming.
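For context, here is a minimal sketch of the filtering the patch describes: a recursive copy to local disk that skips the hidden checksum side files rather than copying them. This uses plain java.io; CopyWithoutCrc, copyRecursive, and isChecksumFile are hypothetical names for illustration, not the actual Hadoop FileSystem code.

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    // Sketch only, not Hadoop code: illustrates the rule in HADOOP-738.
    public class CopyWithoutCrc {

        // Assumption per the discussion above: checksums live in hidden
        // ".name.crc" side files next to the data file.
        static boolean isChecksumFile(String name) {
            return name.startsWith(".") && name.endsWith(".crc");
        }

        // Copy src into dst, never materializing checksum side files.
        static void copyRecursive(File src, File dst) throws IOException {
            if (isChecksumFile(src.getName())) {
                return; // skip checksum files when copying to local
            }
            if (src.isDirectory()) {
                if (!dst.exists() && !dst.mkdirs()) {
                    throw new IOException("cannot create " + dst);
                }
                File[] children = src.listFiles();
                if (children == null) {
                    throw new IOException("cannot list " + src);
                }
                for (File child : children) {
                    copyRecursive(child, new File(dst, child.getName()));
                }
            } else {
                try (InputStream in = new FileInputStream(src);
                     OutputStream out = new FileOutputStream(dst)) {
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        out.write(buf, 0, n);
                    }
                }
            }
        }
    }

Under the visible-suffix alternative suggested in the comment above, only the isChecksumFile test would change, e.g. to name.endsWith(".crc") without requiring the leading dot.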
--
This message is automatically generated by JIRA.
- If you think it was sent incorrectly contact one of the administrators:
  http://issues.apache.org/jira/secure/Administrators.jspa
- For more information on JIRA, see:
  http://www.atlassian.com/software/jira