Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 63E73200AF5 for ; Thu, 2 Jun 2016 16:55:26 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 62376160A51; Thu, 2 Jun 2016 14:55:26 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 40A59160A3E for ; Thu, 2 Jun 2016 16:55:24 +0200 (CEST) Received: (qmail 39508 invoked by uid 500); 2 Jun 2016 14:55:22 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 39498 invoked by uid 99); 2 Jun 2016 14:55:22 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 02 Jun 2016 14:55:22 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 52D59180218 for ; Thu, 2 Jun 2016 14:55:22 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.879 X-Spam-Level: * X-Spam-Status: No, score=1.879 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd3-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx2-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id 9RkIhXdFkWxb for ; Thu, 2 Jun 2016 14:55:17 +0000 (UTC) Received: from mail-yw0-f177.google.com (mail-yw0-f177.google.com [209.85.161.177]) by mx2-lw-eu.apache.org (ASF Mail Server at mx2-lw-eu.apache.org) with ESMTPS id 29ADB5F33F for ; Thu, 2 Jun 2016 14:55:16 +0000 (UTC) Received: by mail-yw0-f177.google.com with SMTP id h19so52277503ywc.0 for ; Thu, 02 Jun 2016 07:55:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to; bh=CUg7Yt3mtoABbunwqj0agNSYIrc1yTwgpt61noe2R98=; b=SKgVTcSk+UmC4OSZ9OINf2nqH4sgb+TwyKUIsBnYwBCUByKq+MS5qeSWdGeoS2w2Bv i7Sj1bnNY5OuBLwYHAV+Htql5M/8NcjJXwG8o+0cLS0BBCv2Xn6FHG8PEaL2fBVmsIjg HEfTW3M+FwfzF486JVWlbpRLec2zsfuprK25aDq4rdQTT+6E+BzHs5SNOZzQpk+8+MlE 1t+Jg6z0J9ucW9v940810xbhAQCnSCXpbbUnr6RSGNokmcIGzS8VW/Fl0UdxYIhaoP23 KMUJuJ100z6GCYCSgiARcc/FQfCAV9yIW3I/3kih6tQrThWKg9BPSmQYIJLL5aG23YEN ny/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to; bh=CUg7Yt3mtoABbunwqj0agNSYIrc1yTwgpt61noe2R98=; b=EF9GI8WMa1kpjUtq5uKP6oe6fA4u+8+HP/qZP+WaTazw4rJGzCZZaFkJtjdxAb2sFH CVloA0a4q1NjXUyyUnfc9K5xpVczeq+8m6oQmXSC3xKOBBvx2zSmO7cM/oeXx1pZMcve WhHQ5p48WWE2KoBUOyfZsbbwcpRZpojiyj4jHjavwEyuCNztyOqk7JlV7LTTfH++JP4V TPMGcE2ioUF8EG6R+zvOgVeakSNrIzf37Vu2IVQFPW/MEkiXU24dRWzgN1w5mFrUQ7dP Z0FgEOj9FiWCafczf8w5ZEyHXhQn83eklSZCjPl/A0GHbHjG5aJD1MJoZajyd8ZdR8N5 7WqA== X-Gm-Message-State: ALyK8tIaCPvutLd/U4WCKrvQrDHh3r4LncfKIJePOH3RRpmmKAdKF/fVdfje6pGW+L00gFs2IjX62rFR6faEUQ== MIME-Version: 1.0 X-Received: by 10.37.49.215 with SMTP id x206mr6191196ybx.0.1464879308816; Thu, 02 Jun 2016 07:55:08 -0700 (PDT) Received: by 10.129.124.213 with HTTP; Thu, 2 Jun 2016 07:55:08 -0700 (PDT) In-Reply-To: References: Date: Thu, 2 Jun 2016 10:55:08 -0400 Message-ID: Subject: Re: External partitoned table based on yyyy/mm/dd HDFS structure From: Igor Kravzov To: user@hive.apache.org Content-Type: multipart/alternative; boundary=001a11488ef2d7517605344cca34 archived-at: Thu, 02 Jun 2016 14:55:26 -0000 --001a11488ef2d7517605344cca34 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Thanks Dudu for the great explanation. I was doing some reading and thinking instead of complicated hierarchical structure to have flat one. Like /user/igor/data/date=3D2016-06-02 create external table t (i int) partitioned by (yyyymmdd date) location '/user/igor/data/'; or /user/igor/date=3D20160602 create external table t (i int) partitioned by (yyyymmdd int) location '/user/igor/data/'; Will it work? Also I will need to schedule msck repair table t; if I want partitions automatically picked up. Hive does not have this feature. Correct? What is the optimal directory size for a partition? Is about 2GB OK? On Wed, Jun 1, 2016 at 4:38 PM, Markovitz, Dudu wrote: > The short answer: > > In this naming convention it will require to specifically define each > partition. > > If the naming convention was yyyy=3D2016/mm=3D11/dd=3D28 instead of 2016/= 11/28 it > would have been straight forward. > > > > Dudu > > > > + The long answer: > > > > > -------------------------------------------------------------------------= ----------------------------------------------- > > -- bash > > > -------------------------------------------------------------------------= ----------------------------------------------- > > > > mkdir t > > mkdir t/2015 > > mkdir t/2015/01 > > mkdir t/2015/01/22 > > mkdir t/2015/01/23 > > mkdir t/2015/02 > > mkdir t/2015/02/17 > > mkdir t/2015/03 > > mkdir t/2015/03/04 > > mkdir t/2015/03/05 > > mkdir t/2015/03/06 > > mkdir t/2016 > > mkdir t/2016/10 > > mkdir t/2016/10/01 > > mkdir t/2016/10/02 > > mkdir t/2016/10/03 > > mkdir t/2016/11 > > mkdir t/2016/11/27 > > mkdir t/2016/11/28 > > > > > > echo -e "1\n2\n3" > t/2015/01/22/data.txt > > echo -e "4" > t/2015/01/23/data.txt > > echo -e "5\n6" > t/2015/02/17/data.txt > > echo -e "7\n8\n9" > t/2015/03/04/data.txt > > echo -e "10" > t/2015/03/05/data.txt > > echo -e "11\n12" > t/2015/03/06/data.txt > > echo -e "13\n14" > t/2016/10/01/data.txt > > echo -e "15\n16\n17\n18\n19" > t/2016/10/02/data.txt > > echo -e "20\n21" > t/2016/10/03/data.txt > > echo -e "22" > t/2016/11/27/data.txt > > echo -e "23\n24\n25" > t/2016/11/28/data.txt > > > > hdfs dfs -put t /user/dmarkovitz/t > > > > > > t > > =E2=94=9C=E2=94=80=E2=94=80 2015 > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 01 > > =E2=94=82 =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 22 > > =E2=94=82 =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 23 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 02 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 17 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 03 > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 04 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 05 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 06 > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=94=E2=94=80=E2=94=80 2016 > > =E2=94=9C=E2=94=80=E2=94=80 10 > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 01 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 02 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 03 > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=94=E2=94=80=E2=94=80 11 > > =E2=94=9C=E2=94=80=E2=94=80 27 > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=94=E2=94=80=E2=94=80 28 > > =E2=94=94=E2=94=80=E2=94=80 data.txt > > > > > > > -------------------------------------------------------------------------= ----------------------------------------------- > > -- hive > > > -------------------------------------------------------------------------= ----------------------------------------------- > > > > set hive.mapred.supports.subdirectories=3Dtrue; > > set mapred.input.dir.recursive=3Dtrue; > > > > > -------------------------------------------------------------------------= ----------------------------------------------- > > -- t1: no partitions > > > -------------------------------------------------------------------------= ----------------------------------------------- > > > > create external table t1 (i int) location '/user/dmarkovitz/t'; > > > > select i,input__file__name from t1; > > > > i input__file__name > > 1 hdfs://horton/user/dmarkovitz/t/2015/01/22/data.txt > > 2 hdfs://horton/user/dmarkovitz/t/2015/01/22/data.txt > > 3 hdfs://horton/user/dmarkovitz/t/2015/01/22/data.txt > > 4 hdfs://horton/user/dmarkovitz/t/2015/01/23/data.txt > > 5 hdfs://horton/user/dmarkovitz/t/2015/02/17/data.txt > > 6 hdfs://horton/user/dmarkovitz/t/2015/02/17/data.txt > > 7 hdfs://horton/user/dmarkovitz/t/2015/03/04/data.txt > > 8 hdfs://horton/user/dmarkovitz/t/2015/03/04/data.txt > > 9 hdfs://horton/user/dmarkovitz/t/2015/03/04/data.txt > > 10 hdfs://horton/user/dmarkovitz/t/2015/03/05/data.txt > > 11 hdfs://horton/user/dmarkovitz/t/2015/03/06/data.txt > > 12 hdfs://horton/user/dmarkovitz/t/2015/03/06/data.txt > > 13 hdfs://horton/user/dmarkovitz/t/2016/10/01/data.txt > > 14 hdfs://horton/user/dmarkovitz/t/2016/10/01/data.txt > > 15 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt > > 16 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt > > 17 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt > > 18 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt > > 19 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt > > 20 hdfs://horton/user/dmarkovitz/t/2016/10/03/data.txt > > 21 hdfs://horton/user/dmarkovitz/t/2016/10/03/data.txt > > 22 hdfs://horton/user/dmarkovitz/t/2016/11/27/data.txt > > 23 hdfs://horton/user/dmarkovitz/t/2016/11/28/data.txt > > 24 hdfs://horton/user/dmarkovitz/t/2016/11/28/data.txt > > 25 hdfs://horton/user/dmarkovitz/t/2016/11/28/data.txt > > > > > -------------------------------------------------------------------------= ----------------------------------------------- > > -- t2: 3 partition columns (yyyy smallint,mm tinyint,dd tinyint) > > > -------------------------------------------------------------------------= ----------------------------------------------- > > > > create external table t2 (i int) partitioned by (yyyy smallint,mm > tinyint,dd tinyint) location '/user/dmarkovitz/t'; > > > > alter table t2 add if not exists partition (yyyy=3D2015,mm=3D01,dd=3D22) > location '/user/dmarkovitz/t/2015/01/22'; > > alter table t2 add if not exists partition (yyyy=3D2015,mm=3D01,dd=3D23) > location '/user/dmarkovitz/t/2015/01/23'; > > alter table t2 add if not exists partition (yyyy=3D2015,mm=3D02,dd=3D17) > location '/user/dmarkovitz/t/2015/02/17'; > > alter table t2 add if not exists partition (yyyy=3D2015,mm=3D03,dd=3D04) > location '/user/dmarkovitz/t/2015/03/04'; > > alter table t2 add if not exists partition (yyyy=3D2015,mm=3D03,dd=3D05) > location '/user/dmarkovitz/t/2015/03/05'; > > alter table t2 add if not exists partition (yyyy=3D2015,mm=3D03,dd=3D06) > location '/user/dmarkovitz/t/2015/03/06'; > > alter table t2 add if not exists partition (yyyy=3D2016,mm=3D10,dd=3D01) > location '/user/dmarkovitz/t/2016/10/01'; > > alter table t2 add if not exists partition (yyyy=3D2016,mm=3D10,dd=3D02) > location '/user/dmarkovitz/t/2016/10/02'; > > alter table t2 add if not exists partition (yyyy=3D2016,mm=3D10,dd=3D03) > location '/user/dmarkovitz/t/2016/10/03'; > > alter table t2 add if not exists partition (yyyy=3D2016,mm=3D11,dd=3D27) > location '/user/dmarkovitz/t/2016/11/27'; > > alter table t2 add if not exists partition (yyyy=3D2016,mm=3D11,dd=3D28) > location '/user/dmarkovitz/t/2016/11/28'; > > > > > > select i,input__file__name,yyyy,mm,dd from t2 where yyyy=3D2016 and mm=3D= 10; > > > > i input__file__name yyyy mm dd > > 13 hdfs://horton/user/dmarkovitz/t/2016/10/01/data.txt 2016 10 1 > > 14 hdfs://horton/user/dmarkovitz/t/2016/10/01/data.txt 2016 10 1 > > 15 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt 2016 10 2 > > 16 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt 2016 10 2 > > 17 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt 2016 10 2 > > 18 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt 2016 10 2 > > 19 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt 2016 10 2 > > 20 hdfs://horton/user/dmarkovitz/t/2016/10/03/data.txt 2016 10 3 > > 21 hdfs://horton/user/dmarkovitz/t/2016/10/03/data.txt 2016 10 3 > > > > > -------------------------------------------------------------------------= ----------------------------------------------- > > -- t3: 1 partition column (yyyymmdd date) > > > -------------------------------------------------------------------------= ----------------------------------------------- > > > > create external table t3 (i int) partitioned by (yyyymmdd date) location > '/user/dmarkovitz/t'; > > > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2015-01-22') > location '/user/dmarkovitz/t/2015/01/22'; > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2015-01-23') > location '/user/dmarkovitz/t/2015/01/23'; > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2015-02-17') > location '/user/dmarkovitz/t/2015/02/17'; > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2015-03-04') > location '/user/dmarkovitz/t/2015/03/04'; > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2015-03-05') > location '/user/dmarkovitz/t/2015/03/05'; > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2015-03-06') > location '/user/dmarkovitz/t/2015/03/06'; > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2016-10-01') > location '/user/dmarkovitz/t/2016/10/01'; > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2016-10-02') > location '/user/dmarkovitz/t/2016/10/02'; > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2016-10-03') > location '/user/dmarkovitz/t/2016/10/03'; > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2016-11-27') > location '/user/dmarkovitz/t/2016/11/27'; > > alter table t3 add if not exists partition (yyyymmdd=3Ddate '2016-11-28') > location '/user/dmarkovitz/t/2016/11/28'; > > > > > > select i,input__file__name,yyyymmdd from t3 where yyyymmdd between date > '2016-10-01' and date '2016-10-30'; > > > > i input__file__name yyyymmdd > > 13 hdfs://horton/user/dmarkovitz/t/2016/10/01/data.txt 2016-10-01 > > 14 hdfs://horton/user/dmarkovitz/t/2016/10/01/data.txt 2016-10-01 > > 15 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt 2016-10-02 > > 16 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt 2016-10-02 > > 17 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt 2016-10-02 > > 18 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt 2016-10-02 > > 19 hdfs://horton/user/dmarkovitz/t/2016/10/02/data.txt 2016-10-02 > > 20 hdfs://horton/user/dmarkovitz/t/2016/10/03/data.txt 2016-10-03 > > 21 hdfs://horton/user/dmarkovitz/t/2016/10/03/data.txt 2016-10-03 > > > > > -------------------------------------------------------------------------= ----------------------------------------------- > > -- Automatic partitions handling > > > -------------------------------------------------------------------------= ----------------------------------------------- > > > > There is a specific directories naming convention that is supported by > Hive partitioning which is {partition column}=3D{partition value} > > > > hdfs dfs -mv /user/dmarkovitz/t/2015/01/22 > /user/dmarkovitz/t/2015/01/dd=3D22 > > hdfs dfs -mv /user/dmarkovitz/t/2015/01/23 > /user/dmarkovitz/t/2015/01/dd=3D23 > > hdfs dfs -mv /user/dmarkovitz/t/2015/02/17 > /user/dmarkovitz/t/2015/02/dd=3D17 > > hdfs dfs -mv /user/dmarkovitz/t/2015/03/04 > /user/dmarkovitz/t/2015/03/dd=3D04 > > hdfs dfs -mv /user/dmarkovitz/t/2015/03/05 > /user/dmarkovitz/t/2015/03/dd=3D05 > > hdfs dfs -mv /user/dmarkovitz/t/2015/03/06 > /user/dmarkovitz/t/2015/03/dd=3D06 > > hdfs dfs -mv /user/dmarkovitz/t/2016/10/01 > /user/dmarkovitz/t/2016/10/dd=3D01 > > hdfs dfs -mv /user/dmarkovitz/t/2016/10/02 > /user/dmarkovitz/t/2016/10/dd=3D02 > > hdfs dfs -mv /user/dmarkovitz/t/2016/10/03 > /user/dmarkovitz/t/2016/10/dd=3D03 > > hdfs dfs -mv /user/dmarkovitz/t/2016/11/27 > /user/dmarkovitz/t/2016/11/dd=3D27 > > hdfs dfs -mv /user/dmarkovitz/t/2016/11/28 > /user/dmarkovitz/t/2016/11/dd=3D28 > > hdfs dfs -mv /user/dmarkovitz/t/2015/01 /user/dmarkovitz/t/2015/mm=3D= 01 > > hdfs dfs -mv /user/dmarkovitz/t/2015/02 /user/dmarkovitz/t/2015/mm=3D= 02 > > hdfs dfs -mv /user/dmarkovitz/t/2015/03 /user/dmarkovitz/t/2015/mm=3D= 03 > > hdfs dfs -mv /user/dmarkovitz/t/2016/10 /user/dmarkovitz/t/2016/mm=3D= 10 > > hdfs dfs -mv /user/dmarkovitz/t/2016/11 /user/dmarkovitz/t/2016/mm=3D= 11 > > hdfs dfs -mv /user/dmarkovitz/t/2015 /user/dmarkovitz/t/yyyy=3D201= 5 > > hdfs dfs -mv /user/dmarkovitz/t/2016 /user/dmarkovitz/t/yyyy=3D201= 6 > > > > t > > =E2=94=9C=E2=94=80=E2=94=80 yyyy=3D2015 > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 mm=3D01 > > =E2=94=82 =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 dd=3D22 > > =E2=94=82 =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 dd=3D23 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 mm=3D02 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 dd=3D17 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 mm=3D03 > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 dd=3D04 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 dd=3D05 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 dd=3D06 > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=94=E2=94=80=E2=94=80 yyyy=3D2016 > > =E2=94=9C=E2=94=80=E2=94=80 mm=3D10 > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 dd=3D01 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80 dd=3D02 > > =E2=94=82 =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 dd=3D03 > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=94=E2=94=80=E2=94=80 mm=3D11 > > =E2=94=9C=E2=94=80=E2=94=80 dd=3D27 > > =E2=94=82 =E2=94=94=E2=94=80=E2=94=80 data.txt > > =E2=94=94=E2=94=80=E2=94=80 dd=3D28 > > =E2=94=94=E2=94=80=E2=94=80 data.txt > > > > > > create external table t (i int) partitioned by (yyyy smallint,mm > tinyint,dd tinyint) location '/user/dmarkovitz/t'; > > msck repair table t; > > > > select i,input__file__name,yyyy,mm,dd from t where yyyy >=3D 2016 and mm = >=3D > 10; > > > > i input__file__name yyyy mm dd > > 13 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D10/dd=3D01/data.txt= 2016 > 10 1 > > 14 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D10/dd=3D01/data.txt= 2016 > 10 1 > > 15 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D10/dd=3D02/data.txt= 2016 > 10 2 > > 16 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D10/dd=3D02/data.txt= 2016 > 10 2 > > 17 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D10/dd=3D02/data.txt= 2016 > 10 2 > > 18 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D10/dd=3D02/data.txt= 2016 > 10 2 > > 19 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D10/dd=3D02/data.txt= 2016 > 10 2 > > 20 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D10/dd=3D03/data.txt= 2016 > 10 3 > > 21 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D10/dd=3D03/data.txt= 2016 > 10 3 > > 22 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D11/dd=3D27/data.txt= 2016 > 11 27 > > 23 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D11/dd=3D28/data.txt= 2016 > 11 28 > > 24 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D11/dd=3D28/data.txt= 2016 > 11 28 > > 25 hdfs://horton/user/dmarkovitz/t/yyyy=3D2016/mm=3D11/dd=3D28/data.txt= 2016 > 11 28 > > > > > > *From:* Igor Kravzov [mailto:igork.inexso@gmail.com] > *Sent:* Wednesday, June 01, 2016 6:03 PM > *To:* user@hive.apache.org > *Subject:* External partitoned table based on yyyy/mm/dd HDFS structure > > > > Is it possible to create? > > What would be the table definition in this case? > --001a11488ef2d7517605344cca34 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Thanks Dudu for the great explanation.=C2=A0
I was doi= ng some reading and thinking instead of complicated hierarchical structure = to have flat one.
Like=C2=A0

/user/igor/data/date=3D2016-06-02
create external table t (i int) = partitioned by (yyyymmdd date) location '/user/igor/data/';<= /div>
=C2=A0or
=C2=A0/user/igor/date=3D201606= 02
create extern= al table t (i int) partitioned by (yyyymmdd int) location '/user/igor/d= ata/';
Will it work?

Also I w= ill need to schedule=C2=A0msck repair table t;=C2=A0= if I want partitions automatically picked up. Hive does not have this featu= re. Correct?

What is the optimal directory s= ize for a partition? Is about 2GB OK?


On Wed, Jun 1, 2016 at 4:38= PM, Markovitz, Dudu <dmarkovitz@paypal.com> wrote:
<= blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px= #ccc solid;padding-left:1ex">

The short answer:

In this naming convention it will req= uire to specifically define each partition.

If the naming convention was yyyy=3D2016/mm=3D11/dd=3D28 instead of 2016/11/28 it would have been straight forward.

=C2=A0

Dudu

=C2=A0

+ The long answer:

=C2=A0

--------------------------------------------= ---------------------------------------------------------------------------= -

-- bash

--------------------------------------------= ---------------------------------------------------------------------------= -

=C2=A0

mkdir t

mkdir t/2015

mkdir t/2015/01

mkdir t/2015/01/22

mkdir t/2015/01/23

mkdir t/2015/02

mkdir t/2015/02/17

mkdir t/2015/03

mkdir t/2015/03/04

mkdir t/2015/03/05

mkdir t/2015/03/06

mkdir t/2016

mkdir t/2016/10

mkdir t/2016/10/01

mkdir t/2016/10/02

mkdir t/2016/10/03

mkdir t/2016/11

mkdir t/2016/11/27

mkdir t/2016/11/28

=C2=A0

=C2=A0

echo -e "1\n2\n3"=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 > = t/2015/01/22/data.txt

echo -e "4"=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 > t/2015/01/23/data.txt

echo -e "5\n6"=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0 > t/2015/02/17/data.txt

echo -e "7\n8\n9"=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 > = t/2015/03/04/data.txt

echo -e "10"=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 > t/2015/03/05/data.txt

echo -e "11\n12"=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 &g= t; t/2015/03/06/data.txt

echo -e "13\n14"=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 &g= t; t/2016/10/01/data.txt

echo -e "15\n16\n17\n18\n19"=C2=A0= =C2=A0=C2=A0 > t/2016/10/02/data.txt

echo -e "20\n21"=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 &g= t; t/2016/10/03/data.txt

echo -e "22"=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0 > t/2016/11/27/data.txt

echo -e "23\n24\n25"=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 > t/2016/11/28/da= ta.txt

=C2=A0

hdfs dfs -put t /user/dmarkovitz/t=

=C2=A0

=C2=A0

t

=E2=94=9C=E2=94=80=E2=94=80 2015

=E2=94=82=C2=A0=C2=A0 =E2=94=9C=E2=94=80=E2= =94=80 01

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 = =E2=94=9C=E2=94=80=E2=94=80 22

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 = =E2=94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 = =E2=94=94=E2=94=80=E2=94=80 23

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt=

=E2=94=82=C2=A0=C2=A0 =E2=94=9C=E2=94=80=E2= =94=80 02

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 = =E2=94=94=E2=94=80=E2=94=80 17

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0= =C2=A0=C2=A0 =C2=A0=C2=A0=E2=94=94=E2=94=80=E2=94=80 data.txt=

=E2=94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2= =94=80 03

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 =E2=94=9C=E2=94=80=E2=94=80 04

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 =E2=94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 =E2=94=9C=E2=94=80=E2=94=80 05

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 =E2=94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 =E2=94=94=E2=94=80=E2=94=80 06

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt<= /u>

=E2=94=94=E2=94=80=E2=94=80 2016

=C2=A0=C2=A0=C2=A0 =E2=94=9C=E2=94=80=E2=94= =80 10

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 =E2= =94=9C=E2=94=80=E2=94=80 01

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 =E2= =94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 =E2= =94=9C=E2=94=80=E2=94=80 02

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 =E2= =94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 =E2= =94=94=E2=94=80=E2=94=80 03

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94= =80 11

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 = =E2=94=9C=E2=94=80=E2=94=80 27

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 = =E2=94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 = =E2=94=94=E2=94=80=E2=94=80 28

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt=

=C2=A0

=C2=A0

--------------------------------------------= ---------------------------------------------------------------------------= -

-- hive

--------------------------------------------= ---------------------------------------------------------------------------= -

=C2=A0

set hive.mapred.supports.subdirectories=3Dtr= ue;

set mapred.input.dir.recursive=3Dtrue;

=C2=A0

--------------------------------------------= ---------------------------------------------------------------------------= -

-- t1:=C2=A0=C2=A0=C2=A0=C2=A0 no partitions=

--------------------------------------------= ---------------------------------------------------------------------------= -

=C2=A0

create external table t1 (i int) location &#= 39;/user/dmarkovitz/t';

=C2=A0

select i,input__file__name from t1;

=C2=A0

i=C2=A0=C2=A0=C2=A0 input__file__name=

1=C2=A0=C2=A0=C2=A0 hdfs://horton/user/dmark= ovitz/t/2015/01/22/data.txt

2=C2=A0=C2=A0=C2=A0 hdfs://horton/user/dmark= ovitz/t/2015/01/22/data.txt

3=C2=A0=C2=A0=C2=A0 hdfs://horton/user/dmark= ovitz/t/2015/01/22/data.txt

4=C2=A0=C2=A0=C2=A0 hdfs://horton/user/dmark= ovitz/t/2015/01/23/data.txt

5=C2=A0=C2=A0=C2=A0 hdfs://horton/user/dmark= ovitz/t/2015/02/17/data.txt

6=C2=A0=C2=A0=C2=A0 hdfs://horton/user/dmark= ovitz/t/2015/02/17/data.txt

7=C2=A0=C2=A0=C2=A0 hdfs://horton/user/dmark= ovitz/t/2015/03/04/data.txt

8=C2=A0=C2=A0=C2=A0 hdfs://horton/user/dmark= ovitz/t/2015/03/04/data.txt

9=C2=A0=C2=A0=C2=A0 hdfs://horton/user/dmark= ovitz/t/2015/03/04/data.txt

10=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2015/03/05/data.txt

11=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2015/03/06/data.txt

12=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2015/03/06/data.txt

13=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/01/data.txt

14=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/01/data.txt

15=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt

16=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt

17=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt

18=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt

19=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt

20=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/03/data.txt

21=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/03/data.txt

22=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/11/27/data.txt

23=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/11/28/data.txt

24=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/11/28/data.txt

25=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/11/28/data.txt

=C2=A0

--------------------------------------------= ---------------------------------------------------------------------------= -

-- t2:=C2=A0=C2=A0=C2=A0=C2=A0 3 partition c= olumns (yyyy smallint,mm tinyint,dd tinyint)

--------------------------------------------= ---------------------------------------------------------------------------= -

=C2=A0

create external table t2 (i int) partitioned= by (yyyy smallint,mm tinyint,dd tinyint) location '/user/dmarkovitz/t&= #39;;

=C2=A0

alter table t2 add if not exists partition (= yyyy=3D2015,mm=3D01,dd=3D22) location '/user/dmarkovitz/t/2015/01/22= 9;;

alter table t2 add if not exists partition (= yyyy=3D2015,mm=3D01,dd=3D23) location '/user/dmarkovitz/t/2015/01/23= 9;;

alter table t2 add if not exists partition (= yyyy=3D2015,mm=3D02,dd=3D17) location '/user/dmarkovitz/t/2015/02/17= 9;;

alter table t2 add if not exists partition (= yyyy=3D2015,mm=3D03,dd=3D04) location '/user/dmarkovitz/t/2015/03/04= 9;;

alter table t2 add if not exists partition (= yyyy=3D2015,mm=3D03,dd=3D05) location '/user/dmarkovitz/t/2015/03/05= 9;;

alter table t2 add if not exists partition (= yyyy=3D2015,mm=3D03,dd=3D06) location '/user/dmarkovitz/t/2015/03/06= 9;;

alter table t2 add if not exists partition (= yyyy=3D2016,mm=3D10,dd=3D01) location '/user/dmarkovitz/t/2016/10/01= 9;;

alter table t2 add if not exists partition (= yyyy=3D2016,mm=3D10,dd=3D02) location '/user/dmarkovitz/t/2016/10/02= 9;;

alter table t2 add if not exists partition (= yyyy=3D2016,mm=3D10,dd=3D03) location '/user/dmarkovitz/t/2016/10/03= 9;;

alter table t2 add if not exists partition (= yyyy=3D2016,mm=3D11,dd=3D27) location '/user/dmarkovitz/t/2016/11/27= 9;;

alter table t2 add if not exists partition (= yyyy=3D2016,mm=3D11,dd=3D28) location '/user/dmarkovitz/t/2016/11/28= 9;;

=C2=A0

=C2=A0

select i,input__file__name,yyyy,mm,dd from t= 2 where yyyy=3D2016 and mm=3D10;

=C2=A0

i=C2=A0=C2=A0=C2=A0 input__file__name=C2=A0= =C2=A0=C2=A0 yyyy mm=C2=A0=C2=A0 dd

13=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/01/data.txt=C2=A0 2016 10=C2=A0=C2=A0 1

14=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/01/data.txt=C2=A0 2016 10=C2=A0=C2=A0 1

15=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt=C2=A0 2016 10=C2=A0=C2=A0 2

16=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt=C2=A0 2016 10=C2=A0=C2=A0 2

17=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt=C2=A0 2016 10=C2=A0=C2=A0 2

18=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt=C2=A0 2016 10=C2=A0=C2=A0 2

19=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt=C2=A0 2016 10=C2=A0=C2=A0 2

20=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/03/data.txt=C2=A0 2016 10=C2=A0=C2=A0 3

21=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/03/data.txt=C2=A0 2016 10=C2=A0=C2=A0 3

=C2=A0

--------------------------------------------= ---------------------------------------------------------------------------= -

-- t3:=C2=A0=C2=A0=C2=A0=C2=A0 1 partition c= olumn (yyyymmdd date)

--------------------------------------------= ---------------------------------------------------------------------------= -

=C2=A0

create external table t3 (i int) partitioned= by (yyyymmdd date) location '/user/dmarkovitz/t';

=C2=A0

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2015-01-22') location '/user/dmarkovitz/t/2015= /01/22';

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2015-01-23') location '/user/dmarkovitz/t/2015= /01/23';

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2015-02-17') location '/user/dmarkovitz/t/2015= /02/17';

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2015-03-04') location '/user/dmarkovitz/t/2015= /03/04';

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2015-03-05') location '/user/dmarkovitz/t/2015= /03/05';

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2015-03-06') location '/user/dmarkovitz/t/2015= /03/06';

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2016-10-01') location '/user/dmarkovitz/t/2016= /10/01';

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2016-10-02') location '/user/dmarkovitz/t/2016= /10/02';

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2016-10-03') location '/user/dmarkovitz/t/2016= /10/03';

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2016-11-27') location '/user/dmarkovitz/t/2016= /11/27';

alter table t3 add if not exists partition (= yyyymmdd=3Ddate '2016-11-28') location '/user/dmarkovitz/t/2016= /11/28';

=C2=A0

=C2=A0

select i,input__file__name,yyyymmdd from t3 = where yyyymmdd between date '2016-10-01' and date '2016-10-30&#= 39;;

=C2=A0

i=C2=A0=C2=A0=C2=A0 input__file__name=C2=A0= =C2=A0=C2=A0 yyyymmdd

13=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/01/data.txt=C2=A0 2016-10-01

14=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/01/data.txt=C2=A0 2016-10-01

15=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt=C2=A0 2016-10-02

16=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt=C2=A0 2016-10-02

17=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt=C2=A0 2016-10-02

18=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt=C2=A0 2016-10-02

19=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/02/data.txt=C2=A0 2016-10-02

20=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/03/data.txt=C2=A0 2016-10-03

21=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/2016/10/03/data.txt=C2=A0 2016-10-03

=C2=A0

--------------------------------------------= ---------------------------------------------------------------------------= -

-- Automatic partitions handling

--------------------------------------------= ---------------------------------------------------------------------------= -

=C2=A0

There is a specific directories naming conve= ntion that is supported by Hive partitioning which is {partition column}=3D= {partition value}

=C2=A0

hdfs dfs -mv /user/dmarkovitz/t/2015/01/22= =C2=A0 /user/dmarkovitz/t/2015/01/dd=3D22

hdfs dfs -mv /user/dmarkovitz/t/2015/01/23= =C2=A0 /user/dmarkovitz/t/2015/01/dd=3D23

hdfs dfs -mv /user/dmarkovitz/t/2015/02/17= =C2=A0 /user/dmarkovitz/t/2015/02/dd=3D17

hdfs dfs -mv /user/dmarkovitz/t/2015/03/04= =C2=A0 /user/dmarkovitz/t/2015/03/dd=3D04

hdfs dfs -mv /user/dmarkovitz/t/2015/03/05= =C2=A0 /user/dmarkovitz/t/2015/03/dd=3D05

hdfs dfs -mv /user/dmarkovitz/t/2015/03/06= =C2=A0 /user/dmarkovitz/t/2015/03/dd=3D06

hdfs dfs -mv /user/dmarkovitz/t/2016/10/01= =C2=A0 /user/dmarkovitz/t/2016/10/dd=3D01

hdfs dfs -mv /user/dmarkovitz/t/2016/10/02= =C2=A0 /user/dmarkovitz/t/2016/10/dd=3D02

hdfs dfs -mv /user/dmarkovitz/t/2016/10/03= =C2=A0 /user/dmarkovitz/t/2016/10/dd=3D03

hdfs dfs -mv /user/dmarkovitz/t/2016/11/27= =C2=A0 /user/dmarkovitz/t/2016/11/dd=3D27

hdfs dfs -mv /user/dmarkovitz/t/2016/11/28= =C2=A0 /user/dmarkovitz/t/2016/11/dd=3D28

hdfs dfs -mv /user/dmarkovitz/t/2015/01=C2= =A0=C2=A0=C2=A0=C2=A0 /user/dmarkovitz/t/2015/mm=3D01<= /p>

hdfs dfs -mv /user/dmarkovitz/t/2015/02=C2= =A0=C2=A0=C2=A0=C2=A0 /user/dmarkovitz/t/2015/mm=3D02<= /p>

hdfs dfs -mv /user/dmarkovitz/t/2015/03=C2= =A0=C2=A0=C2=A0=C2=A0 /user/dmarkovitz/t/2015/mm=3D03<= /p>

hdfs dfs -mv /user/dmarkovitz/t/2016/10=C2= =A0=C2=A0=C2=A0=C2=A0 /user/dmarkovitz/t/2016/mm=3D10<= /p>

hdfs dfs -mv /user/dmarkovitz/t/2016/11=C2= =A0=C2=A0=C2=A0=C2=A0 /user/dmarkovitz/t/2016/mm=3D11<= /p>

hdfs dfs -mv /user/dmarkovitz/t/2015=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /user/dmarkovitz/t/yyyy=3D2015<= u>

hdfs dfs -mv /user/dmarkovitz/t/2016=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /user/dmarkovitz/t/yyyy=3D2016<= u>

=C2=A0

t

=E2=94=9C=E2=94=80=E2=94=80 yyyy=3D2015

=E2=94=82=C2=A0=C2=A0 =E2=94=9C=E2=94=80=E2= =94=80 mm=3D01

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 = =E2=94=9C=E2=94=80=E2=94=80 dd=3D22

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 = =E2=94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 = =E2=94=94=E2=94=80=E2=94=80 dd=3D23

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt=

=E2=94=82=C2=A0=C2=A0 =E2=94=9C=E2=94=80=E2= =94=80 mm=3D02

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 = =E2=94=94=E2=94=80=E2=94=80 dd=3D17

=E2=94=82=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt=

=E2=94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2= =94=80 mm=3D03

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 =E2=94=9C=E2=94=80=E2=94=80 dd=3D04

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 =E2=94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 =E2=94=9C=E2=94=80=E2=94=80 dd=3D05

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 =E2=94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0 =E2=94=94=E2=94=80=E2=94=80 dd=3D06

=E2=94=82=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt<= /u>

=E2=94=94=E2=94=80=E2=94=80 yyyy=3D2016

=C2=A0=C2=A0=C2=A0 =E2=94=9C=E2=94=80=E2=94= =80 mm=3D10

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 =E2= =94=9C=E2=94=80=E2=94=80 dd=3D01

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 =E2= =94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 =E2= =94=9C=E2=94=80=E2=94=80 dd=3D02

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 =E2= =94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0 =E2= =94=94=E2=94=80=E2=94=80 dd=3D03

=C2=A0=C2=A0=C2=A0 =E2=94=82=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94= =80 mm=3D11

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 = =E2=94=9C=E2=94=80=E2=94=80 dd=3D27

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 = =E2=94=82=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 = =E2=94=94=E2=94=80=E2=94=80 dd=3D28

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0 =E2=94=94=E2=94=80=E2=94=80 data.txt=

=C2=A0

=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0

create external table t (i int) partitioned = by (yyyy smallint,mm tinyint,dd tinyint) location '/user/dmarkovitz/t&#= 39;;

msck repair table t;

=C2=A0

select i,input__file__name,yyyy,mm,dd from t= where yyyy >=3D 2016 and mm >=3D 10;

=C2=A0

i=C2=A0=C2=A0=C2=A0 input__file__name=C2=A0= =C2=A0=C2=A0 yyyy mm=C2=A0=C2=A0 dd

13=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D10/dd=3D01/data.txt=C2=A0 2016 10=C2=A0=C2=A0=C2=A0=C2= =A0 1

14=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D10/dd=3D01/data.txt=C2=A0 2016 10=C2=A0=C2=A0=C2=A0=C2= =A0 1

15=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D10/dd=3D02/data.txt=C2=A0 2016 10=C2=A0=C2=A0=C2=A0=C2= =A0 2

16=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D10/dd=3D02/data.txt=C2=A0 2016 10=C2=A0=C2=A0=C2=A0=C2= =A0 2

17=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D10/dd=3D02/data.txt=C2=A0 2016 10=C2=A0=C2=A0=C2=A0=C2= =A0 2

18=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D10/dd=3D02/data.txt=C2=A0 2016 10=C2=A0=C2=A0=C2=A0=C2= =A0 2

19=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D10/dd=3D02/data.txt=C2=A0 2016 10=C2=A0=C2=A0=C2=A0=C2= =A0 2

20=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D10/dd=3D03/data.txt=C2=A0 2016 10=C2=A0=C2=A0=C2=A0=C2= =A0 3

21=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D10/dd=3D03/data.txt=C2=A0 2016 10=C2=A0=C2=A0=C2=A0=C2= =A0 3

22=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D11/dd=3D27/data.txt=C2=A0 2016 11=C2=A0=C2=A0=C2=A0=C2= =A0 27

23=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D11/dd=3D28/data.txt=C2=A0 2016 11=C2=A0=C2=A0=C2=A0=C2= =A0 28

24=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D11/dd=3D28/data.txt=C2=A0 2016 11=C2=A0=C2=A0=C2=A0=C2= =A0 28

25=C2=A0=C2=A0 hdfs://horton/user/dmarkovitz= /t/yyyy=3D2016/mm=3D11/dd=3D28/data.txt=C2=A0 2016 11=C2=A0=C2=A0=C2=A0=C2= =A0 28

=C2=A0

=C2=A0

From: Igor Kravzov [mailto:igork.inexso@gmail.com= ]
Sent: Wednesday, June 01, 2016 6:03 PM
To: user@h= ive.apache.org
Subject: External partitoned table based on yyyy/mm/dd HDFS structur= e

=C2=A0

Is it possible to create?

What would be the table definition in this case?=C2= =A0


--001a11488ef2d7517605344cca34--