Sushanth Sowmyan (JIRA)
Subject [jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
Date Thu, 16 Apr 2015 22:31:59 GMT


Sushanth Sowmyan commented on HIVE-10228:

Sorry, yeah, this is a big patch. :)

It's really a cumulative patch of a bunch of work, but a lot of that work kept overwriting
itself, so splitting it out into a series of patches would have been difficult. Forking
hive to develop this on a separate branch and merging in one go might have been easier.

I'd created as a doc jira, and I've attached
a presentation-like document there outlining various points of why we're doing a bunch of
what we're doing, but that still needs some wiki-fication that I am working on. I've also
attached the replay-protocol document on that jira after updating it slightly with your question
on DROP TABLE here.

I'll reply to code-level comments on review board, and reply to your higher-level comments

There are a couple of cases this can happen in:

a) To make replication more resilient when events are processed in parallel (for example, a
worker that times out and does not respond, but might still be running slowly in the
background), one of the goals for all Commands generated by Replication is that they
should be idempotent, and reprocessing events older than the state of an object should
not cause any error. For instance, one drone processing events (41,42,43) might perform 41
and then not respond for a significant amount of time, causing Falcon to queue another
HiveDR job that starts performing (41,42,43). That second job might complete 43 successfully
before the first job gets around to performing 42, which would then fail. So, one of the
early design goals was that all commands should be resilient to repeats. This is a way of
achieving that goal.
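
To illustrate the scenario above, here is a minimal Python sketch of idempotent event replay. It
is not Hive code: ReplicaTable, apply_event and the "A"/"B" worker labels are hypothetical names
made up for the illustration, and treating an event whose id equals the object's state as a
repeat is a simplifying assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaTable:
    """Hypothetical destination object tagged with the id of the last event applied."""
    state_id: int = 0
    applied: list = field(default_factory=list)


def apply_event(table: ReplicaTable, event_id: int) -> bool:
    """Apply an event only if it is newer than the object's state; otherwise do nothing."""
    if event_id <= table.state_id:
        return False  # stale or repeated event: skipped without raising an error
    table.applied.append(event_id)
    table.state_id = event_id
    return True


table = ReplicaTable()
# Worker A performs 41 and stalls; worker B replays 41, 42, 43; A then resumes with 42, 43.
order = [("A", 41), ("B", 41), ("B", 42), ("B", 43), ("A", 42), ("A", 43)]
results = [apply_event(table, event_id) for _, event_id in order]

print(results)         # [True, False, True, True, False, False]
print(table.state_id)  # 43
```

The late events from the stalled worker become no-ops instead of failures, which is the property
the commands are meant to have.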

since the REPL(CREATE1) occurs after CREATE2, it picks up a newer state of the table, and
the destination is at a newer state than the table which was dropped. Thus, by making the
DROP ignore the destination table if it's already newer than the event that spawned the DROP,
we can optimize away a bit of re-importing that REPL(CREATE2) would have needed to do. In
the future, we'll add in event-nullification, and can do it at a higher level if we batch
events, but this helps out even when processing at an individual level.

c) In addition to being a DROP-IF-OLDER, it also acts as a recursive DROP-TABLE-IF-OLDER: in
cases where it doesn't result in the dropping of the table, it will still result in dropping
older partitions in a newer table. For example, if T(state=50) has partitions P1(state=45) and
P2(state=53), then DROP_TABLE_IF_OLDER_THAN(47) will drop P1 but not P2. This is because a
Drop-table event does not result in a series of DropPtn events associated with the appropriate
table. So, given that our replication works on a per-object basis, if DropTable should not drop
the destination table because the destination table is newer than the origin table at the
time of the drop, it might still contain older partitions which should be nuked. (This mode
is tested in one of the tests in TestCommands in HIVE-10227, if you want to have a look at
an example of what's expected.)
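
The T/P1/P2 example can be sketched in a few lines of Python. This is only an illustration
under assumptions, not the actual implementation: the Table class, the in-memory catalog dict and
the drop_table_if_older_than function are hypothetical, and the exact handling of an object whose
state equals the event id is a guess.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical destination table: its own state id plus per-partition state ids."""
    state: int
    partitions: dict = field(default_factory=dict)  # partition name -> state id


def drop_table_if_older_than(catalog: dict, name: str, event_id: int) -> None:
    table = catalog.get(name)
    if table is None:
        return  # table already gone: a repeated drop is a no-op
    if table.state < event_id:
        del catalog[name]  # table is older than the drop event: drop it entirely
        return
    # Table is newer than the drop event: keep it, but remove partitions older than the event.
    table.partitions = {p: s for p, s in table.partitions.items() if s >= event_id}


catalog = {"T": Table(state=50, partitions={"P1": 45, "P2": 53})}
drop_table_if_older_than(catalog, "T", 47)
remaining = sorted(catalog["T"].partitions)
print(remaining)  # ['P2']
```

Running the same call a second time leaves the catalog unchanged, so the command stays safe to
repeat.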


Regarding the keyword addition, thanks for the feedback; it was not my intent to make them
"reserved keywords". I talked to [~pxiong] and [~ashutoshc] about it, and the latter is the
way that makes sense. As long as I add them to the nonReserved entry in IdentifiersParser.g,
it should be good. So, I'll add that in and have another update here.

> Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
> --------------------------------------------------------------------------------------
>                 Key: HIVE-10228
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Import/Export
>    Affects Versions: 1.2.0
>            Reporter: Sushanth Sowmyan
>            Assignee: Sushanth Sowmyan
>         Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch
> We need to update a couple of hive commands to support replication semantics. To wit,
we need the following:
> Export will now support an extra optional clause to tell it that this export is being
prepared for the purpose of replication. There is also an additional optional clause here,
that allows for the export to be a metadata-only export, to handle cases of capturing the
diff for alter statements, for example.
> Also, if done for replication, the non-presence of a table, or a table being a view/offline
table/non-native table is not considered an error, and instead, will result in a successful
> IMPORT ... (as normal) – but handles new semantics 
> No syntax changes for import, but import will have to change to be able to handle all
the permutations of export dumps possible. Also, import will have to ensure that it should
update the object only if the update being imported is not older than the state of the object.
Also, import currently does not work with dbname.tablename kind of specification, this should
be fixed to work.
> Drop Table now has an additional clause, to specify that this drop table is being done
for replication purposes, and that the drop should not actually drop the table if the table
is newer than the event id specified.
> Similarly, Drop Partition also has an equivalent change to Drop Table.
> In addition, we introduce a new property "", which when tagged on to table
properties or partition properties on a replication-destination, holds the effective "state
identifier" of the object.

