hive-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-8368) compactor is improperly writing delete records in base file
Date Tue, 07 Oct 2014 16:12:34 GMT

     [ https://issues.apache.org/jira/browse/HIVE-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated HIVE-8368:
-----------------------------
    Attachment: HIVE-8367.patch

The issue shows up when the input is large enough that it requires more than one map task.

This patch fixes it by turning on reduce deduplication in the optimizer (which was being turned
off before) and dropping the minimum number of reducers to 1 (instead of 4).  This has the
side effect of halving the time it takes to do an update or delete.

> compactor is improperly writing delete records in base file
> -----------------------------------------------------------
>
>                 Key: HIVE-8368
>                 URL: https://issues.apache.org/jira/browse/HIVE-8368
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 0.14.0
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>            Priority: Critical
>             Fix For: 0.14.0
>
>         Attachments: HIVE-8367.patch
>
>
> When the compactor reads records from the base and deltas, it is not properly dropping
> delete records.  This leads to oversized base files, and possibly to wrong query results.
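
For readers unfamiliar with the expected behaviour described in the summary above, here is a toy sketch of what a major compaction should produce. It is not Hive's code and does not model the ORC ACID file layout; the `Event` type, the `major_compact` function, and the single integer `row_id` are all hypothetical simplifications made up for illustration. The point it shows is only that the newest event per row wins and that rows whose newest event is a delete must not be written to the new base.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class Event(NamedTuple):
    """Hypothetical, simplified stand-in for a record in a base or delta file."""
    row_id: int            # simplified row identity (assumption, not Hive's real key)
    txn_id: int            # transaction that wrote this event
    op: str                # "insert", "update", or "delete"
    value: Optional[str]   # payload; None for deletes


def major_compact(base: Iterable[Event], deltas: Iterable[Event]) -> List[Event]:
    """Merge base and delta events into a new base with delete records dropped."""
    latest: Dict[int, Event] = {}
    # Apply events in transaction order so later transactions overwrite earlier ones.
    for ev in sorted(list(base) + list(deltas), key=lambda e: e.txn_id):
        latest[ev.row_id] = ev
    # A row whose most recent event is a delete must not appear in the new base.
    return [ev for _, ev in sorted(latest.items()) if ev.op != "delete"]


base = [Event(1, 1, "insert", "a"), Event(2, 1, "insert", "b")]
deltas = [
    Event(2, 2, "delete", None),
    Event(1, 3, "update", "a2"),
    Event(3, 3, "insert", "c"),
]
new_base = major_compact(base, deltas)
```

In this sketch the bug described in the summary would correspond to leaving out the final `ev.op != "delete"` filter: the delete event for row 2 would then be carried into the new base, making it larger than necessary.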



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
