hadoop-hive-dev mailing list archives

From "John Sichi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1096) Hive Variables
Date Mon, 25 Jan 2010 20:13:34 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804685#action_12804685 ]

John Sichi commented on HIVE-1096:

I'm fine with it if we start with the implementation you are proposing, document the limitation
with respect to views, and decide if more is needed based on user feedback.

I'm guessing that once people start using the two features together, they might end up wanting
the variables to be expanded as part of evaluating a view. Otherwise they may be forced to
move ETL logic which belongs inside a reusable view out into the calling SQL statements,
where the expansion happens.  This would limit the utility of views.

I agree that treating variables as first-class objects is a heavyweight change.

There is one possible middle-ground approach, which is to avoid introducing variables as first-class
objects, but still try to expand them in views by passing the substitution map down into SemanticAnalyzer
and letting it apply the substitutions as part of reparsing the view's definition obtained
from the catalog.

This would imply

(a) CREATE VIEW would need to be able to undo the substitutions so that the stored view definition
in the catalog would contain variable references instead of replacements.

(b) If a variable was defined when the view was created, but undefined when referenced later,
we'd need to substitute NULL or raise an exception.  Probably best to be consistent with whatever
you're planning for the top-level substitutions in this case.  I made the view reparse error
handling verbose so that a user should have a chance of figuring out what happened in this
case, although it will be about as pretty as a C++ template compilation failure.
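
To make (a) and (b) concrete, here is a minimal sketch, not actual Hive code. It assumes the catalog stores the view text with its ${NAME} references intact and that an undefined variable raises an exception rather than becoming NULL. The class and method names (ViewVariableExpansion, expand) are hypothetical; the real hook would live wherever SemanticAnalyzer reparses the view definition.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: expands variables in stored view text.
final class ViewVariableExpansion {
    // Matches references of the form ${NAME}.
    private static final Pattern VAR_REF =
        Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    /**
     * Expands ${NAME} references in the view text read back from the catalog.
     * Assumption: an undefined variable is an error, not a NULL substitution.
     */
    static String expand(String storedViewText, Map<String, String> substitutions) {
        Matcher m = VAR_REF.matcher(storedViewText);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1);
            String value = substitutions.get(name);
            if (value == null) {
                // Verbose on purpose, so the user can see which view text failed.
                throw new IllegalStateException(
                    "Undefined variable ${" + name + "} in view definition: "
                        + storedViewText);
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The stored definition keeps the reference, per point (a).
        String stored = "SELECT * FROM logs WHERE dt = '${DT}'";
        System.out.println(expand(stored, Map.of("DT", "2009-12-09")));
    }
}
```

Because expansion happens each time the view is referenced, the same stored definition can yield different queries depending on the caller's substitution map, which is the behavior (a) is meant to preserve.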

> Hive Variables
> --------------
>                 Key: HIVE-1096
>                 URL: https://issues.apache.org/jira/browse/HIVE-1096
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
> From mailing list:
> --Amazon Elastic MapReduce version of Hive seems to have a nice feature called "Variables."
> Basically you can define a variable via command-line while invoking hive with -d DT=2009-12-09
> and then refer to the variable via ${DT} within the hive queries. This could be extremely
> useful. I can't seem to find this feature even on trunk. Is this feature currently anywhere
> in the roadmap?--
> This could be implemented in many places.
> A simple place to put this is in Driver.compile or Driver.run: we can do string substitutions
> at that level, and further downstream need not be affected.
> There could be some benefits to doing this further downstream (parser, plan), but based
> on the simple needs we may not need to overthink this.
> I will get started on implementing in compile unless someone wants to discuss this more.
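
The top-level string substitution described in the issue could be sketched roughly as below. This is an illustration only, not the proposed patch. The QueryVariables class and its methods are hypothetical, and two behaviors are assumed: definitions arrive as KEY=VALUE strings (as with -d DT=2009-12-09), and references to undefined variables are left untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hive: substitution applied before compilation.
final class QueryVariables {
    /** Parses KEY=VALUE definitions, such as the DT=2009-12-09 given after -d. */
    static Map<String, String> parseDefinitions(String... definitions) {
        Map<String, String> vars = new LinkedHashMap<>();
        for (String def : definitions) {
            int eq = def.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected KEY=VALUE, got: " + def);
            }
            vars.put(def.substring(0, eq), def.substring(eq + 1));
        }
        return vars;
    }

    /**
     * Replaces each ${KEY} with its value using literal string replacement.
     * Assumption: references with no definition are left as-is.
     */
    static String substitute(String query, Map<String, String> vars) {
        String result = query;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> vars = parseDefinitions("DT=2009-12-09");
        System.out.println(
            substitute("SELECT * FROM logs WHERE dt = '${DT}'", vars));
    }
}
```

Since the query text is rewritten before anything else sees it, the parser and planner need no changes, which matches the "further downstream need not be affected" point. One simplification to note: a value that itself contains a ${...} reference to a later-defined key would be expanded again by this loop.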

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
