From: Harsh J
Date: Tue, 7 Aug 2012 14:01:19 +0530
Subject: Re: Encrypting files in Hadoop - Using the io.compression.codecs
To: user@hadoop.apache.org

Farrokh,

I do not yet know of a way to plug in a codec that applies transparently
to all files on HDFS. Check out
https://issues.apache.org/jira/browse/HDFS-2542 and related issues for
work that may arrive in the future.

For HBase, by default, your choices are limited: you get only what HBase
has tested and offers (None, LZO, GZ, Snappy), and adding support for a
new codec requires modifying the sources. This is because HBase uses an
enum of codec identifiers (to save space in its HFiles). It can be done,
though, and there are hackier ways of doing it too (renaming your
CryptoCodec to SnappyCodec, for instance, so that HBase uses it
unknowingly; ugly, ugly, ugly). So yes, this need is indeed best
discussed with the HBase community rather than the Hadoop one here.
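To illustrate why the choices are fixed, here is a minimal sketch of how
a column family's codec is normally selected through the HBase client
API of this era; the table and family names are hypothetical, and the
0.92/0.94-style HBaseAdmin API is assumed:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.io.hfile.Compression;

    public class CreateSnappyTable {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);

            // "mytable" and "cf" are hypothetical names.
            HTableDescriptor table = new HTableDescriptor("mytable");
            HColumnDescriptor family = new HColumnDescriptor("cf");

            // Only values of HBase's own Compression.Algorithm enum are
            // accepted here (NONE, GZ, LZO, SNAPPY); there is no slot
            // for an arbitrary Hadoop CompressionCodec class name.
            family.setCompressionType(Compression.Algorithm.SNAPPY);
            table.addFamily(family);

            admin.createTable(table);
            admin.close();
        }
    }

Since setCompressionType() accepts only the enum and not a codec class
name, a new codec becomes selectable only once it has an enum entry in
the HBase sources, which is the modification mentioned above.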
On Tue, Aug 7, 2012 at 1:43 PM, Farrokh Shahriari wrote:
> Thanks,
> What if I want to use this encryption in a cluster with HBase running
> on top of Hadoop? Can't Hadoop be configured to automatically encrypt
> each file that is written to it?
> If not, I should probably be asking how to enable encryption in HBase,
> and ask that question on the HBase mailing list, right?
>
> On Tue, Aug 7, 2012 at 12:32 PM, Harsh J wrote:
>>
>> Farrokh,
>>
>> The codec org.apache.hadoop.io.compress.crypto.CyptoCodec needs to be
>> used explicitly. What you have done so far merely makes Hadoop load it
>> at runtime; you will need to use it in your programs if you want it to
>> be applied.
>>
>> For example, for MapReduce outputs to be compressed, you may run an MR
>> job with the following option set on its configuration:
>>
>> "-Dmapred.output.compression.codec=org.apache.hadoop.io.compress.crypto.CyptoCodec"
>>
>> You will then notice that your output files are all properly encrypted
>> with the above codec.
>>
>> Likewise, if you are doing direct HDFS writes, you will need to wrap
>> your output stream with this codec. See the CompressionCodec API for
>> how to do this:
>> http://hadoop.apache.org/common/docs/stable/api/org/apache/hadoop/io/compress/CompressionCodec.html#createOutputStream(java.io.OutputStream)
>> (where your CompressionCodec must be an
>> org.apache.hadoop.io.compress.crypto.CyptoCodec instance).
>>
>> On Tue, Aug 7, 2012 at 1:11 PM, Farrokh Shahriari wrote:
>> >
>> > Hello,
>> > I am using the "Hadoop Crypto Compressor" from
>> > https://github.com/geisbruch/HadoopCryptoCompressor to encrypt HDFS
>> > files.
>> > I've downloaded the complete code, created the jar file, and changed
>> > the properties in core-site.xml as the site says.
>> > But when I add a new file, nothing happens and encryption isn't
>> > working.
>> > What can I do to encrypt HDFS files? Does anyone know how I should
>> > use this class?
>> >
>> > Tnx
>>
>> --
>> Harsh J

--
Harsh J
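To make the stream-wrapping advice above concrete, here is a minimal
sketch of a direct HDFS write through the codec. The output path and
payload are hypothetical, the codec is loaded reflectively by the class
name given above, and the HadoopCryptoCompressor jar (plus its key
settings in core-site.xml) is assumed to be on the classpath:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionOutputStream;
    import org.apache.hadoop.util.ReflectionUtils;

    public class EncryptedHdfsWrite {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // Load the codec by name so this compiles without the
            // HadoopCryptoCompressor jar; at run time the jar must be
            // on the classpath for Class.forName to succeed.
            Class<? extends CompressionCodec> codecClass = Class
                .forName("org.apache.hadoop.io.compress.crypto.CyptoCodec")
                .asSubclass(CompressionCodec.class);
            CompressionCodec codec =
                ReflectionUtils.newInstance(codecClass, conf);

            FileSystem fs = FileSystem.get(conf);
            Path out = new Path("/tmp/encrypted-example"); // hypothetical
            FSDataOutputStream raw = fs.create(out);

            // The wrapping step Harsh refers to: every byte passes
            // through the codec before it reaches HDFS.
            CompressionOutputStream wrapped = codec.createOutputStream(raw);
            wrapped.write("some sensitive bytes".getBytes("UTF-8"));
            wrapped.close(); // finishes the codec and closes the file
        }
    }

Reading the file back would mirror this, wrapping fs.open(out) with
codec.createInputStream() instead.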