From: Harsh J
Date: Tue, 9 Apr 2013 08:42:12 +0530
Subject: Re: Best format to use
Reply-To: user@hadoop.apache.org

Hey Mark,

The Gzip codec creates the extension .gz, not .deflate (which is DeflateCodec). You may want to re-check your settings.

Impala questions are best resolved at its current user and developer community at https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user.

However, Impala does currently support LZO (and also Indexed LZO) compressed text files, so you may want to try that, as it is splittable (unlike Gzip files).

On Tue, Apr 9, 2013 at 5:18 AM, Mark wrote:
> Trying to determine the best format to use for storing daily logs. We recently switched from snappy (.snappy) to gzip (.deflate) but I'm wondering if there is something better? Our main clients for these daily logs are pig and hive using an external table. We were thinking about testing out impala but we see that it doesn't work with compressed text files. Any suggestions?
>
> Thanks

-- 
Harsh J
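
To make the extension point above concrete, here is a rough sketch in Python (not Hadoop code) of which codec an output file extension usually implies, and whether a plain text file compressed that way can be split. The table and the helper name `describe` are only illustrative assumptions, not anything from Hadoop's API; LzopCodec comes from the separate hadoop-lzo library, and "indexed" means the file is splittable only once an index has been built for it.

```python
import os

# Illustrative lookup table (an assumption for this sketch, not a Hadoop API):
# default output extension -> (codec class name, splittable as plain text?)
CODECS = {
    ".deflate": ("DefaultCodec / DeflateCodec", "no"),
    ".gz": ("GzipCodec", "no"),
    ".bz2": ("BZip2Codec", "yes"),
    ".snappy": ("SnappyCodec", "no"),
    ".lzo": ("LzopCodec", "indexed"),  # hadoop-lzo; needs an index to split
}


def describe(path):
    """Return (codec, splittable) for a file name, or None if unrecognised."""
    ext = os.path.splitext(path)[1]
    return CODECS.get(ext)


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the lookup.
    for name in ("part-00000.deflate", "part-00000.gz", "part-00000.lzo"):
        print(name, describe(name))
```

So output files ending in .deflate point at DefaultCodec or DeflateCodec being configured rather than GzipCodec, which is why the settings are worth re-checking.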