hadoop-hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1096) Hive Variables
Date Tue, 29 Jun 2010 14:28:51 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883561#action_12883561
] 

HBase Review Board commented on HIVE-1096:
------------------------------------------

Message from: "Edward Capriolo" <edlinuxguru@gmail.com>


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 260
bq.  > <http://review.hbase.org/r/229/diff/1/?file=1601#file1601line260>
bq.  >
bq.  >     I like the word "interpolate" too, but I think more people are probably familiar
with "substitute". Please change the name to HIVESUBSTITUTEVARIABLES.

I do not want to have the feature called different things across the code base. Replace here
interpolate there it will be confusing for all. You originally suggested interpolate: "Driver.interpolateCommandVariables()".


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 561
bq.  > <http://review.hbase.org/r/229/diff/1/?file=1601#file1601line561>
bq.  >
bq.  >     No need to test a boolean valued method for equality to true. The logic can
also be simplified as follows:
bq.  >     
bq.  >     if (!getBoolVar(ConfVars.HIVEVARIABLEINTERPOLATE)) {
bq.  >       return expr;
bq.  >     }
bq.  >     l4j.info("Interpolation is on");
bq.  >     ...
bq.  >     
bq.  >     Also, please move this out of the substitution function and into the Driver,
i.e Driver.compile() calls VariableSubstitution.substitute() iff HIVESUBSTITUEVARIABLES ==
true.

Different components are using substitute SetProcessor, Driver , File Processor, having the
interpolation on/off logic in each class is redundant. This way is better as it supports information
hiding.


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 562
bq.  > <http://review.hbase.org/r/229/diff/1/?file=1601#file1601line562>
bq.  >
bq.  >     This log message is going to generate a lot of noise, and is unnecessary since
you can determine the value using the 'set' command. Please remove it.

I think it is very important to see the command before and after substitution. We can set
this to debug, I do not think it is noise.


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 587
bq.  > <http://review.hbase.org/r/229/diff/1/?file=1601#file1601line587>
bq.  >
bq.  >     I thought the plan was to also introduce a "conf:" prefix for referencing configuration
properties, and that any value not prefixed by system/env/conf would map to a new variable
namespace? Please add the conf prefix and introduce some state in the SessionState for storing
variables (i.e. non System/Env/Conf properties).

After pondering this I think the "conf:" prefix is confusing and hurts backwards compatibility.
Right now this is "hive -hiveconf x=y"  = "set x=y". I do not think we want to introduce another
switch. "--hivevarconf". in most cases users want conf access to set properties that can be
picked up by classes. hadoop & hive conf is the way it is adding something else will not
fix the problem and will confuse people.


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/ql/src/test/queries/clientpositive/set_processor_namespaces.q, line 4
bq.  > <http://review.hbase.org/r/229/diff/1/?file=1605#file1605line4>
bq.  >
bq.  >     You need a test for the "env" namespace. I think this is impossible to do here,
so you probably need to add a unit test.

Env variables are not changeable and are system dependent. I do not think there is a way to
test these.


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 595
bq.  > <http://review.hbase.org/r/229/diff/1/?file=1601#file1601line595>
bq.  >
bq.  >     Please remove this log call. It will generate a lot of noise.

I think it is very important to see the command before and after substitution. We can set
this to debug, I do not think it is noise.


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 50
bq.  > <http://review.hbase.org/r/229/diff/1/?file=1601#file1601line50>
bq.  >
bq.  >     The variables varPat and MAX_SUBST both exist as private variables in the parent
class. I think that redefining them in HiveConf will result in lots of confusion down the
road, plus they don't really belong here. Please move this to a "VariableSubstitution" class
located in ql.

I see your point. Then again, hadoop does the substitution inside the conf class. Also QL
is becoming the 'ubber package' At this point what isnt ql?


bq.  On 2010-06-29 00:50:22, Carl Steinbach wrote:
bq.  > trunk/ql/src/java/org/apache/hadoop/hive/ql/processors/SetProcessor.java, line 93
bq.  > <http://review.hbase.org/r/229/diff/1/?file=1604#file1604line93>
bq.  >
bq.  >     I don't think you want to perform variable substitution at this point. It makes
it impossible to create nested variables.

I will check it out.


- Edward


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/229/#review288
-----------------------------------------------------------





> Hive Variables
> --------------
>
>                 Key: HIVE-1096
>                 URL: https://issues.apache.org/jira/browse/HIVE-1096
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>             Fix For: 0.6.0, 0.7.0
>
>         Attachments: 1096-9.diff, hive-1096-10-patch.txt, hive-1096-11-patch.txt, hive-1096-2.diff,
hive-1096-7.diff, hive-1096-8.diff, hive-1096.diff
>
>
> From mailing list:
> --Amazon Elastic MapReduce version of Hive seems to have a nice feature called "Variables."
Basically you can define a variable via command-line while invoking hive with -d DT=2009-12-09
and then refer to the variable via ${DT} within the hive queries. This could be extremely
useful. I can't seem to find this feature even on trunk. Is this feature currently anywhere
in the roadmap?--
> This could be implemented in many places.
> A simple place to put this is 
> in Driver.compile or Driver.run we can do string substitutions at that level, and further
downstream need not be effected. 
> There could be some benefits to doing this further downstream, parser,plan. but based
on the simple needs we may not need to overthink this.
> I will get started on implementing in compile unless someone wants to discuss this more.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message