hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-788) Triggers when a new partition is created for a table
Date Tue, 25 Aug 2009 21:31:59 GMT

    [ https://issues.apache.org/jira/browse/HIVE-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747660#action_12747660 ]

Zheng Shao commented on HIVE-788:

Edward, yes, by A I mean the user will manage and run a script himself, and the script will
call a Hive command which waits for a new partition to appear.
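
To make option A concrete, here is a minimal sketch of such a probe-and-wait loop. It is illustrative only: `wait_for_new_partition` and the `list_partitions` callable are hypothetical names, and how the partition list is actually obtained from Hive is left to the caller.

```python
import time
from typing import Callable, Iterable, Optional, Set


def wait_for_new_partition(
    list_partitions: Callable[[], Iterable[str]],
    poll_interval: float = 60.0,
    timeout: Optional[float] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Set[str]:
    """Poll until partitions appear that were not present at the first call.

    list_partitions is a hypothetical callable supplied by the user; it
    should return the current partition names of the table being watched.
    Returns the new partition names, or an empty set if the timeout expires.
    """
    known = set(list_partitions())  # snapshot taken before waiting
    waited = 0.0
    while timeout is None or waited < timeout:
        sleep(poll_interval)  # this interval is the extra delay of option A
        waited += poll_interval
        new = set(list_partitions()) - known
        if new:
            return new
    return set()
```

The `poll_interval` argument is exactly the delay listed as the drawback of option A: a new partition is noticed, on average, half an interval after it appears.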

I talked offline with Ashish about the three questions under option B.
1. We will support shell commands for now. The reason is that users can hook the trigger up
with some other existing job/process management tool to monitor the status of the triggered
2. The MoveTask (which calls db.loadTable and db.loadPartition) will run the shell
command on the same machine that loads the table/partition. (Since there may not be a HiveServer
3. If the shell command fails, the move task will return failure, even though the new
table/partition has already been created (or the data already updated, in the case of an
overwrite). This is also the simple choice, because we don't have the concept of
transactions/rollback yet.
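
The behaviour in points 2 and 3 can be sketched roughly as follows. This is not the MoveTask implementation: `move_then_trigger` and the `load` callable are hypothetical stand-ins, and the only point illustrated is that the command runs locally after the load and that a non-zero exit status is reported as failure without undoing the load.

```python
import subprocess
from typing import Callable


def move_then_trigger(load: Callable[[], None], trigger_command: str) -> bool:
    """Run the load, then the registered shell command on the same machine.

    load is a hypothetical stand-in for the table/partition load step.
    Returns False if the shell command exits non-zero. The load is not
    rolled back in that case, since there is no transaction concept.
    """
    load()  # the data is in place from this point on
    completed = subprocess.run(trigger_command, shell=True)
    return completed.returncode == 0
```

A caller that sees `False` therefore has to assume the table/partition is already visible and only the follow-up command needs to be retried.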

So it seems B will be a better way to go.

The next question is which types of trigger we want to support now:
1. On new partition creation in a specified table
2. On data change (overwrite/append) in a specified table (or any partitions of a specified

There might be more, but these two seem to be the most wanted.
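
One possible way to model these two trigger types is a small registry keyed by table and event. This is only a sketch under assumptions: `TriggerEvent`, `TriggerRegistry`, and the example table name and command are all hypothetical, and nothing here reflects how the metastore would actually store registrations.

```python
from collections import defaultdict
from enum import Enum
from typing import DefaultDict, List, Tuple


class TriggerEvent(Enum):
    """The two trigger types discussed above (names are hypothetical)."""
    PARTITION_CREATED = "partition_created"
    DATA_CHANGED = "data_changed"  # overwrite or append


class TriggerRegistry:
    """In-memory map from (table, event) to registered shell commands."""

    def __init__(self) -> None:
        self._commands: DefaultDict[Tuple[str, TriggerEvent], List[str]] = defaultdict(list)

    def register(self, table: str, event: TriggerEvent, command: str) -> None:
        self._commands[(table, event)].append(command)

    def commands_for(self, table: str, event: TriggerEvent) -> List[str]:
        # Return a copy so callers cannot mutate the registry by accident.
        return list(self._commands.get((table, event), []))


registry = TriggerRegistry()
registry.register("page_views", TriggerEvent.PARTITION_CREATED, "notify.sh")
```

Keeping the event type in the key means further trigger types can be added later without changing how existing registrations are looked up.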

> Triggers when a new partition is created for a table
> ----------------------------------------------------
>                 Key: HIVE-788
>                 URL: https://issues.apache.org/jira/browse/HIVE-788
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Zheng Shao
> One requirement for HIVE-787 is that users would like to run a command whenever a new
> partition of a Hive table gets created.
> There are several ways to achieve this functionality:
> A. Probe and wait: We can have the scripts running in a loop checking if a new partition
> is created.
>   Pros: easy to write, easy to control
>   Cons: will introduce another delay based on the probing interval.
> B. Triggered: The command is registered inside the hive metastore. Whenever a partition
> gets created, we run the registered command.
> Several questions around option B are:
> 1. whether to support registration of HiveQL or shell command;
> 2. which machine/environment to run the command;
> 3. what to do if the registered command failed.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
