hive-dev mailing list archives

From "Alan Gates (JIRA)" <>
Subject [jira] [Updated] (HIVE-8367) delete writes records in wrong order in some cases
Date Tue, 07 Oct 2014 16:13:34 GMT


Alan Gates updated HIVE-8367:
    Attachment: HIVE-8367.patch

The issue appears when the input is large enough to require more than one map task.
This patch fixes it by turning on reduce deduplication in the optimizer (it was previously
being turned off) and lowering the minimum number of reducers from 4 to 1. As a side
effect, this halves the time it takes to do an update or delete.
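
For readers who want to see which knobs are involved, here is a rough sketch of the equivalent session-level settings. The message above does not name the properties; this assumes they correspond to Hive's standard reduce deduplication options, and the patch itself may well set them inside the query planner rather than through user configuration.

```sql
-- Assumption: these are the optimizer options the comment refers to.
-- Enable reduce deduplication (merging of adjacent reduce stages
-- that shuffle on compatible keys).
SET hive.optimize.reducededuplication=true;

-- Allow the optimization even when it leaves a single reducer
-- (the stock default for this threshold is 4).
SET hive.optimize.reducededuplication.min.reducer=1;
```

Setting these by hand is not presented as a substitute for the patch; it only illustrates the two values being changed.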

> delete writes records in wrong order in some cases
> --------------------------------------------------
>                 Key: HIVE-8367
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.14.0
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>            Priority: Blocker
>             Fix For: 0.14.0
>         Attachments: HIVE-8367.patch
> I have found one case, with 10k records, where the following sequence triggers the problem:
> create table
> insert into table -- 10k records
> delete from table -- just some records
> The records in the delete delta are not ordered properly by rowid.
> I assume this applies to updates as well, but I haven't tested it yet.
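
The steps in the description might look roughly like the following in HiveQL. This is only a sketch: the table names (acid_repro, staging_10k), the columns, the bucket count, and the delete predicate are all hypothetical, since the report does not give the actual query. The session settings are the usual ones needed for ACID tables in 0.14.

```sql
-- Session settings generally required for transactional tables in Hive 0.14.
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Hypothetical transactional table (ACID tables must be bucketed ORC).
CREATE TABLE acid_repro (id INT, val STRING)
CLUSTERED BY (id) INTO 2 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- Load roughly 10k rows from a hypothetical staging table.
INSERT INTO TABLE acid_repro
SELECT id, val FROM staging_10k;

-- Delete only some of the rows; the predicate is illustrative.
DELETE FROM acid_repro WHERE id % 10 = 0;
```

After the delete, the delete delta under the table directory is where the rowid ordering described above would be checked.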

This message was sent by Atlassian JIRA
