hive-issues mailing list archives

From "Eugene Koifman (JIRA)" <>
Subject [jira] [Commented] (HIVE-20699) Query based compactor for full CRUD Acid tables
Date Mon, 04 Feb 2019 22:08:00 GMT


Eugene Koifman commented on HIVE-20699:

There are a few unused imports in SplitGrouper.
HiveSplitGenerator also has unused imports, and it contains
    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEZ_TEST)) {
        taskResource = Math.max(taskResource, 1);
What does this do?

> Query based compactor for full CRUD Acid tables
> -----------------------------------------------
>                 Key: HIVE-20699
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Transactions
>    Affects Versions: 3.1.0
>            Reporter: Eugene Koifman
>            Assignee: Vaibhav Gumashta
>            Priority: Major
>         Attachments: HIVE-20699.1.patch, HIVE-20699.1.patch, HIVE-20699.10.patch, HIVE-20699.2.patch,
>                      HIVE-20699.3.patch, HIVE-20699.4.patch, HIVE-20699.5.patch, HIVE-20699.6.patch,
>                      HIVE-20699.7.patch, HIVE-20699.8.patch, HIVE-20699.9.patch
> Currently the Acid compactor is implemented as a generated MR job ({{}}).
> It could also be expressed as a Hive query that reads from a given partition and writes the data back to the same partition. This would merge the deltas and 'apply' the delete events.
> The simplest approach would be to use Insert Overwrite, but that would change all ROW__IDs, which we don't want.
> This needs to be implemented in a way that preserves ROW__IDs and creates a new {{base_x}} directory to handle Major compaction.
> Minor compaction will be investigated separately.
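
For readers unfamiliar with the goal, the following is a rough in-memory sketch of what "merge the deltas and apply the delete events while preserving ROW__IDs" could look like. It is not the patch's implementation and uses no Hive APIs: the class CompactionSketch, the Row type, the majorCompact method, and the string-valued row ids are all hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CompactionSketch {

    /** Hypothetical stand-in for one stored row: an opaque ROW__ID plus its payload. */
    static final class Row {
        final String rowId;
        final String value;

        Row(String rowId, String value) {
            this.rowId = rowId;
            this.value = value;
        }
    }

    /**
     * Merges the rows from the base and delta files, drops every row named by a
     * delete event, and returns the survivors with their ROW__IDs untouched.
     * The returned list plays the role of the new base_x contents.
     */
    static List<Row> majorCompact(List<List<Row>> baseAndDeltas, List<String> deleteEvents) {
        Set<String> deleted = new HashSet<>(deleteEvents);
        List<Row> newBase = new ArrayList<>();
        for (List<Row> file : baseAndDeltas) {
            for (Row row : file) {
                if (!deleted.contains(row.rowId)) {
                    newBase.add(row); // same Row object, so the ROW__ID is preserved
                }
            }
        }
        return newBase;
    }

    public static void main(String[] args) {
        // Made-up ids and values, purely for illustration.
        List<Row> base = List.of(new Row("w1-r0", "a"), new Row("w1-r1", "b"));
        List<Row> delta = List.of(new Row("w2-r0", "c"));
        List<Row> result = majorCompact(List.of(base, delta), List.of("w1-r1"));
        for (Row row : result) {
            System.out.println(row.rowId + " " + row.value);
        }
    }
}
```

In this toy model, an Insert Overwrite would correspond to building new Row objects with freshly assigned ids, which is the behavior the description wants to avoid.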

This message was sent by Atlassian JIRA
