hive-issues mailing list archives

From "Hive QA (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-23114) Insert overwrite with dynamic partitioning is not working correctly with direct insert
Date Wed, 08 Apr 2020 01:19:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-23114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17077726#comment-17077726 ]

Hive QA commented on HIVE-23114:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 38s{color} | {color:blue} ql in master has 1528 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 43s{color} | {color:red} ql: The patch generated 1 new + 313 unchanged - 1 fixed = 314 total (was 314) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 11 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 53s{color} | {color:red} ql generated 1 new + 1528 unchanged - 0 fixed = 1529 total (was 1528) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 30s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  The field org.apache.hadoop.hive.ql.exec.FileSinkOperator.dynamicPartitionSpecs is transient but isn't set by deserialization. In FileSinkOperator.java |
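
The FindBugs warning above matches the general pattern of a transient field that is not restored on deserialization (FindBugs reports this as SE_TRANSIENT_FIELD_NOT_RESTORED). The following is a minimal sketch of that pattern using plain Java serialization. The classes `Sink` and `RestoringSink` and the helper `roundTrip` are hypothetical and are not Hive code; how Hive itself serializes operators is not covered by this report, so this illustrates the warning rather than the actual fix in the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class TransientFieldDemo {

    // Hypothetical stand-in for a class with a transient field (not Hive's FileSinkOperator).
    static class Sink implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "sink";
        // Skipped by Java serialization; the initializer does not run on deserialization.
        transient List<String> partitionSpecs = new ArrayList<>();
    }

    // Same shape, but the transient field is re-created in readObject.
    static class RestoringSink implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "sink";
        transient List<String> partitionSpecs = new ArrayList<>();

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();            // restores the non-transient fields
            partitionSpecs = new ArrayList<>(); // restores the transient field by hand
        }
    }

    // Serializes a value to a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Sink field is null after round trip: "
            + (roundTrip(new Sink()).partitionSpecs == null));
        System.out.println("RestoringSink field is null after round trip: "
            + (roundTrip(new RestoringSink()).partitionSpecs == null));
    }
}
```

Whether a warning like this needs a `readObject` method, a non-transient field, or a documented exclusion depends on how the class is actually serialized and used.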
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-21500/dev-support/hive-personality.sh |
| git revision | master / 189580e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-21500/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-21500/yetus/whitespace-eol.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-21500/yetus/new-findbugs-ql.html |
| modules | C: ql itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-21500/yetus.txt |
| Powered by | Apache Yetus    http://yetus.apache.org |


This message was automatically generated.



> Insert overwrite with dynamic partitioning is not working correctly with direct insert
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-23114
>                 URL: https://issues.apache.org/jira/browse/HIVE-23114
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Marta Kuczora
>            Assignee: Marta Kuczora
>            Priority: Major
>         Attachments: HIVE-23114.1.patch, HIVE-23114.2.patch
>
>
> This is a follow-up Jira for the [conversation|https://issues.apache.org/jira/browse/HIVE-21164?focusedCommentId=17059280&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17059280] in HIVE-21164
>  Doing an insert overwrite from a multi-insert statement with dynamic partitioning will give wrong results for ACID tables when 'hive.acid.direct.insert.enabled' is true or for insert-only tables.
> Reproduction:
> {noformat}
> set hive.acid.direct.insert.enabled=true;
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.vectorized.execution.enabled=false;
> set hive.stats.autogather=false;
> create external table multiinsert_test_text (a int, b int, c int) stored as textfile;
> insert into multiinsert_test_text values (1111, 11, 1111), (2222, 22, 1111), (3333, 33, 2222), (4444, 44, NULL), (5555, 55, NULL);
> create table multiinsert_test_acid (a int, b int) partitioned by (c int) stored as orc tblproperties('transactional'='true');
> create table multiinsert_test_mm (a int, b int) partitioned by (c int) stored as orc tblproperties('transactional'='true', 'transactional_properties'='insert_only');
> from multiinsert_test_text a
> insert overwrite table multiinsert_test_acid partition (c)
> select
>  a.a,
>  a.b,
>  a.c
>  where a.c is not null
> insert overwrite table multiinsert_test_acid partition (c)
> select
>  a.a,
>  a.b,
>  a.c
> where a.c is null;
> select * from multiinsert_test_acid;
> from multiinsert_test_text a
> insert overwrite table multiinsert_test_mm partition (c)
> select
>  a.a,
>  a.b,
>  a.c
>  where a.c is not null
> insert overwrite table multiinsert_test_mm partition (c)
> select
>  a.a,
>  a.b,
>  a.c
> where a.c is null;
> select * from multiinsert_test_mm;
> {noformat}
> The result of these steps varies with the execution order of the FileSinkOperators of the insert overwrite statements. Either an error occurs due to a manifest file collision, or no error occurs but the result is incorrect.
>  Running the same insert query with an external table or with an ACID table with 'hive.acid.direct.insert.enabled=false' gives the following result:
> {noformat}
> 1111    11      1111
> 2222    22      1111
> 3333    33      2222
> 4444    44      NULL
> 5555    55      NULL
> {noformat}
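
The expected result quoted above can be reproduced with a small model of the reproduction query. The sketch below is not Hive code: `Row`, `sourceRows`, and `insertOverwrite` are hypothetical names, and the model only assumes that each branch of the multi-insert filters the same source rows and that rows are grouped by the dynamic partition column `c`. It says nothing about how the direct insert path fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

public class MultiInsertExpectation {

    // One source row (a, b, c); c is the dynamic partition column and may be null.
    static final class Row {
        final int a;
        final int b;
        final Integer c;

        Row(int a, int b, Integer c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        @Override
        public String toString() {
            return a + "\t" + b + "\t" + (c == null ? "NULL" : c.toString());
        }
    }

    // The five rows inserted into multiinsert_test_text in the reproduction.
    static List<Row> sourceRows() {
        return Arrays.asList(
            new Row(1111, 11, 1111),
            new Row(2222, 22, 1111),
            new Row(3333, 33, 2222),
            new Row(4444, 44, null),
            new Row(5555, 55, null));
    }

    // Model only: each branch filters the source, and matching rows are
    // grouped under a key derived from the partition column.
    @SafeVarargs
    static Map<String, List<Row>> insertOverwrite(List<Row> source, Predicate<Row>... branches) {
        Map<String, List<Row>> partitions = new TreeMap<>();
        for (Predicate<Row> branch : branches) {
            for (Row row : source) {
                if (branch.test(row)) {
                    String key = "c=" + (row.c == null ? "NULL" : row.c.toString());
                    partitions.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
                }
            }
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Branch 1: where a.c is not null. Branch 2: where a.c is null.
        Map<String, List<Row>> result =
            insertOverwrite(sourceRows(), r -> r.c != null, r -> r.c == null);
        for (List<Row> rows : result.values()) {
            for (Row row : rows) {
                System.out.println(row);
            }
        }
    }
}
```

In this model the two branches write to disjoint partitions, so the table ends up with all five source rows regardless of branch order, which is what the description reports for the external table and for the ACID table with direct insert disabled.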



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
