hive-issues mailing list archives

From "Wei Zheng (JIRA)" <>
Subject [jira] [Commented] (HIVE-17361) Support LOAD DATA for transactional tables
Date Wed, 23 Aug 2017 21:52:00 GMT


Wei Zheng commented on HIVE-17361:

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1,
partcol2=val2 ...)]
Unlike a non-ACID table, if an ACID table is bucketed and there is more than one bucket file
to load, LOAD DATA requires 'filepath' to refer to a directory, not a single file. Otherwise,
one bucket file may end up in one load_delta directory and another bucket file in
a different load_delta directory.

The reason behind this is:
a) For a non-ACID table, say tbl1, one can keep loading files into the same table via
consecutive LOAD commands, which simply results in more and more files under the tbl1/ directory.
b) However, for an ACID table, a new load_delta directory is created every time
LOAD DATA is run, so consecutive LOAD commands create a separate subdirectory for every
single file, which may not be desirable. E.g. if one wants to load a file for one bucket,
and then a file for another bucket, those two files will reside in two different load_delta
directories.
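
The difference between the two cases can be sketched with a small Python model. This is
only an illustration of the behavior described above, not Hive code: the function
simulate_loads, the directory names load_delta_1, load_delta_2 and the bucket file names
are placeholders chosen for the example and do not reflect Hive's actual naming scheme.

```python
from collections import defaultdict


def simulate_loads(loads, acid):
    """Return {directory: [files]} after running each LOAD in order.

    `loads` is a list of LOAD commands, each given as the list of bucket
    files that one command moves into the table.
    """
    layout = defaultdict(list)
    for load_id, files in enumerate(loads, start=1):
        # Placeholder names: the real load_delta naming scheme is not modeled.
        # ACID: one new subdirectory per LOAD. Non-ACID: everything under tbl1/.
        target = f"tbl1/load_delta_{load_id}/" if acid else "tbl1/"
        layout[target].extend(files)
    return dict(layout)


# Two consecutive LOADs, one bucket file each (illustrative file names).
per_file = [["000000_0"], ["000001_0"]]
# One LOAD whose 'filepath' is a directory holding both bucket files.
per_dir = [["000000_0", "000001_0"]]

print(simulate_loads(per_file, acid=False))
print(simulate_loads(per_file, acid=True))
print(simulate_loads(per_dir, acid=True))
```

In this model, the per-file loads on the ACID table end up in two separate directories,
while a single directory-level load keeps both bucket files together.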

> Support LOAD DATA for transactional tables
> ------------------------------------------
>                 Key: HIVE-17361
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>            Reporter: Wei Zheng
>            Assignee: Wei Zheng
>         Attachments: HIVE-17361.1.patch
> LOAD DATA was not supported since ACID was introduced. Need to fill this gap between
> ACID table and regular hive table.

This message was sent by Atlassian JIRA
