hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1131) Add column lineage information to the pre execution hooks
Date Sat, 20 Feb 2010 02:15:28 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836103#action_12836103 ]

Zheng Shao commented on HIVE-1131:

S1. Can we make lineage partition-level instead of table-level?
S2. We might want to formally define the concepts of these levels, especially how they compose (what is the level for a UDAF of a UDF, or a UDF of a UDAF, such as round(sum(col)) or sum(round(col))?).
+  /**
+   * Enum to track dependency. This enum has two values:
+   * 1. SCALAR - Indicates that the column is derived from a scalar expression.
+   * 2. AGGREGATION - Indicates that the column is derived from an aggregation.
+   */
+  public static enum DependencyType {
+  }
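
To make the S2 question concrete, here is a minimal sketch of one possible composition rule. It is an assumption for discussion, not what the patch does: the result is AGGREGATION as soon as any step in the expression chain is an aggregation, and SCALAR otherwise. The class name and the compose method are hypothetical; only the two enum values come from the Javadoc above.

```java
public class DependencyComposition {

    /** The two levels named in the patch's Javadoc. */
    public enum DependencyType { SCALAR, AGGREGATION }

    /**
     * Hypothetical rule: an expression is AGGREGATION if either the outer
     * function or its input is an aggregation; otherwise it is SCALAR.
     */
    public static DependencyType compose(DependencyType outer, DependencyType inner) {
        if (outer == DependencyType.AGGREGATION || inner == DependencyType.AGGREGATION) {
            return DependencyType.AGGREGATION;
        }
        return DependencyType.SCALAR;
    }

    public static void main(String[] args) {
        // round(sum(col)): a scalar UDF applied to a UDAF result.
        System.out.println(compose(DependencyType.SCALAR, DependencyType.AGGREGATION));
        // sum(round(col)): a UDAF applied to a scalar UDF result.
        System.out.println(compose(DependencyType.AGGREGATION, DependencyType.SCALAR));
    }
}
```

Under this assumed rule both round(sum(col)) and sum(round(col)) come out as AGGREGATION. If the two cases need to be told apart, the enum would need more values or the lineage would need to record the full chain.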

S3. Use "{}" even for a single statement in "if", "for", etc.
S4. Use "ArrayList" instead of "Vector" when it's accessed by a single thread.
S5. Remove "private HashMap<FileSinkOperator, Table> fopToTable;" since it's not used.
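
As an illustration of S3 and S4, a small sketch of the requested style. The class and method names are hypothetical and not taken from the patch; the point is only the braces around single statements and the use of ArrayList for a list that a single thread touches.

```java
import java.util.ArrayList;
import java.util.List;

public class StyleExample {

    /**
     * Returns the non-empty entries of the input list.
     * The result list is built and returned by one thread, so the
     * unsynchronized ArrayList is enough; Vector would only add locking.
     */
    public static List<String> nonEmpty(List<String> names) {
        List<String> result = new ArrayList<String>();
        for (String name : names) {
            // Braces are kept even though each body is a single statement (S3).
            if (name != null && !name.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }
}
```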

> Add column lineage information to the pre execution hooks
> ---------------------------------------------------------
>                 Key: HIVE-1131
>                 URL: https://issues.apache.org/jira/browse/HIVE-1131
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: HIVE-1131.patch
> We need a mechanism to pass the lineage information of the various columns of a table to a pre execution hook so that applications can use that for:
> - auditing
> - dependency checking
> and many other applications.
> The proposal is to expose this through a bunch of classes to the pre execution hook interface to the clients and put in the necessary transformation logic in the optimizer to generate this information.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
