hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Furcy Pin <pin.fu...@gmail.com>
Subject Re: ALTER TABLE DROP PARTITION not working on S3
Date Thu, 05 Apr 2018 13:57:39 GMT
Indeed.

If I remember correctly, s3 does not really have the concept of "folder"
like HDFS has, and Hive sort of makes up for it by creating a descriptor
file
where the partition "folder" is supposed to be. Maybe this is what is
missing here.

Perhaps you could try doing a "MSCK REPAIR TABLE tablename" to make sure
that the partitions are correctly loaded and then try again dropping that
particular partition?

Or look at your s3 folder if you see any such "partition folder file" and
check if it is missing for this particular partition?


On 5 April 2018 at 15:40, Richard A. Bross <rb@oaktreepeak.com> wrote:

> Leaving the column list out, here you go:
>
> # Detailed Table Information
> Database:               default
> Owner:                  hadoop
> CreateTime:             Thu Apr 05 13:24:33 UTC 2018
> LastAccessTime:         UNKNOWN
> Retention:              0
> Location:               s3://zoomi-proto-warehouse-measurements/
> Table Type:             MANAGED_TABLE
> Table Parameters:
>         COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>         numFiles                10
>         numPartitions           7
>         numRows                 153
>         orc.compress            ZLIB
>         orc.create.index        true
>         rawDataSize             113563
>         totalSize               37801
>         transient_lastDdlTime   1522934673
>
> # Storage Information
> SerDe Library:          org.apache.hadoop.hive.ql.io.orc.OrcSerde
> InputFormat:            org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> OutputFormat:           org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
> Compressed:             No
> Num Buckets:            61
> Bucket Columns:         [crs_id]
> Sort Columns:           []
> Storage Desc Params:
>         serialization.format    1
> Time taken: 0.467 seconds, Fetched: 98 row(s)
>
>
> ----- Original Message -----
> From: "Furcy Pin" <pin.furcy@gmail.com>
> To: user@hive.apache.org
> Sent: Thursday, April 5, 2018 9:21:06 AM
> Subject: Re: ALTER TABLE DROP PARTITION not working on S3
>
>
> Hi Richard,
>
>
> could you please check if your table is EXTERNAL?
> You can see it with a "DESCRIBE FORMATTED table_name ;"
>
>
> That's what external tables are for, they don't delete underlying data
> when you drop them.
>
>
>
>
> On 5 April 2018 at 15:18, Richard A. Bross < rb@oaktreepeak.com > wrote:
>
>
> I think that someone put a file in there manually. Would that prevent Hive
> from dropping the partition. I also did a "drop table" and the s3 object
> keys persisted.
>
>
>
> ----- Original Message -----
> From: "Richard A. Bross" < rb@oaktreepeak.com >
> To: user@hive.apache.org
> Sent: Thursday, April 5, 2018 9:14:52 AM
> Subject: ALTER TABLE DROP PARTITION not working on S3
>
> Hi,
>
> I have a Hive managed table on S3, "api_measurements". I've tried dropping
> a partition like so:
>
> hive> alter table api_measurements drop if exists
> partition(daydate='2018-04-04', epoch=1522876500);
> Dropped the partition daydate=2018-04-04/epoch=1522876500
> OK
> Time taken: 2.109 seconds
>
> Yet the data is still on S3. Because object keys on S3 are always strings,
> I also tried this:
>
> hive> alter table api_measurements drop partition(daydate='2018-04-04',
> epoch='1522876500');
> OK
> Time taken: 0.135 seconds
>
> Yet the object keys and data are still there. I assume that Im missing
> something really simple. Can anyone shed some light on this?
>
> Thanks
>
>
>

Mime
View raw message