Date: Tue, 28 Jan 2014 11:53:05 -0700
Subject: HDFS copyToLocal and get crc option
From: Tom Brown
To: "user@hadoop.apache.org"

I am archiving a large amount of data out of my HDFS file system to a separate shared storage solution (There is not much HDFS space left in my cluster, and upgrading it is not an option right now).

I understand that HDFS internally manages checksums and won't succeed if the data doesn't match the CRC, so I'm not worried about corruption when reading from HDFS.

However, I want to store the HDFS crc calculations alongside the data files after exporting them. I thought the "hadoop dfs -copyToLocal -crc <hdfs-source> <local-dest>" command would work, but it always gives me the error "-crc option is not valid when source file system does not have crc files"

Can someone explain what exactly that option does, and when (if ever) it should be used?

Thanks in advance!

--Tom