From: Serge Blazhiyevskyy
To: "user@hadoop.apache.org", Bharath Mundlapudi
Date: Tue, 6 Nov 2012 00:40:30 -0700
Subject: Re: backup of hdfs data

I second this proposed solution. DistCp works very well for backing up data to a separate cluster.

From: Bharath Mundlapudi
Reply-To: "user@hadoop.apache.org", Bharath Mundlapudi
Date: Tuesday, November 6, 2012 7:10 AM
To: "user@hadoop.apache.org"
Subject: Re: backup of hdfs data

If your cluster holds little data (say, less than a few GB), then the answer is yes. But it is an expensive route. For large data sets, traditional means are not feasible and are expensive.

If you want a cost-optimal solution, you could set up another local or remote cluster and try distcp, or simply copy the HDFS files to JBODs. Disk is cheap :).

-Bharath

________________________________
From: uday chopra
To: user@hadoop.apache.org
Sent: Monday, November 5, 2012 4:19 PM
Subject: backup of hdfs data

What do folks do to back up HDFS data?

Does anyone have experience using enterprise solutions such as NetBackup with a Data Domain D-2-D appliance to back up data in HDFS? If so, what is the average dedup ratio? (I understand mileage can vary based on the type of data.)

Thanks,
Uday
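
For anyone who wants to try the distcp route suggested in this thread, here is a rough sketch of how a cluster-to-cluster copy might be scripted. It is only an illustration, not something the posters described: the NameNode hostnames, port, and paths are made-up placeholders, the helper names are hypothetical, and the use of -update is an assumption about what a repeated backup would want. It defaults to a dry run that just prints the command.

```python
import shlex
import subprocess


def build_distcp_command(src_uri, dst_uri, update=True):
    """Return the argv list for a `hadoop distcp` copy between two clusters."""
    cmd = ["hadoop", "distcp"]
    if update:
        # -update copies only files that are missing or differ at the destination.
        cmd.append("-update")
    cmd += [src_uri, dst_uri]
    return cmd


def run_backup(src_uri, dst_uri, dry_run=True):
    """Print the distcp command (dry run) or execute it on a Hadoop client node."""
    cmd = build_distcp_command(src_uri, dst_uri)
    if dry_run:
        print(shlex.join(cmd))
        return 0
    # Requires the `hadoop` CLI on PATH; raises CalledProcessError on failure.
    return subprocess.run(cmd, check=True).returncode


if __name__ == "__main__":
    # Placeholder URIs: substitute the real source and backup NameNodes.
    run_backup("hdfs://prod-nn:8020/data", "hdfs://backup-nn:8020/backups/data")
```

Passing dry_run=False on a machine with the Hadoop client installed would launch the actual copy job; scheduling it (for example from cron) is left to the reader.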