hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Carl Steinbach" <c...@cloudera.com>
Subject Re: Review Request: HIVE-3056: Ability to bulk update location field in Db/Table/Partition records
Date Sun, 26 Aug 2012 03:48:31 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6650/#review10764
-----------------------------------------------------------



conf/hive-default.xml.template
<https://reviews.apache.org/r/6650/#comment23202>

    This comment has fallen out of sync with the value given. Please back this change out
since it is no longer relevant to this patch.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/6650/#comment23204>

    This comment should be repeated for each of the new methods added to ObjectStore.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/6650/#comment23203>

    Please replace the println calls with LOG.info() calls. Generally speaking the only code
that should use System.*.println to print directly to stdout/stderr is stuff that's located
in an application's main() method, or in classes that contain a main() method.
    
    Also, please change the sentence to use the active tense, i.e. "Executing query: ..."



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/6650/#comment23205>

    Seems like this stuff should only get printed out if the transaction succeeded.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/6650/#comment23207>

    Formatting: please remove these blank lines.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/6650/#comment23206>

    Same deal here. I know that commitTransaction is probably kind of broken, but we should
only print this out if committed == true. Also, each of these methods should probably either
return a boolean or throw an exception in order to indicate that transaction succeeded or
failed.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/6650/#comment23209>

    Doesn't look the 'committed' boolean variable is used. Also, this function should probably
return a List<String> of hdfsRoot strings instead of printing them out.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/6650/#comment23208>

    Formatting: '{' goes on the same line as the method parameter list.



metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/6650/#comment23210>

    Please use a foreach loop instead of an explicit iterator. Same comment applies to the
other methods when possible.



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23211>

    s/data nucleus/DataNucleus/



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23216>

    s/location/locations/
    
    Also, it would be good to eliminate duplicate entries from the output.



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23221>

    "Update HDFS root location..." instead of "Updates HDFS root location..."



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23219>

    Please change "HAUpgrade" to something that isn't HA specific. Also, all variable names
(with the exception of final statics) should adhere to the camel-case naming convention, including
starting with a lowercase letter (which I admit is a bit awkward in this case).



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23220>

    I'm a little worried that this operation is too generic, and that as a consequence we're
probably missing some errors. Every location value stored in MDatabase and MStorageDescriptor
should be a valid o.a.h.fs.Path object (http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/Path),
which means that it should also be a valid URI (http://docs.oracle.com/javase/6/docs/api/java/net/URI.html?is-external=true).
java.net.URI allows you to pick out individual components of the URI, and in this case I think
we're mainly interested in updating the "host" (and possibly the "port") part of the URI.
So I think metatool should give you option of specifying a scheme, host, and port to match
(although only the host is required), as well as replacement values for the those three parts,
although only the host is required. Another thing to point out is that all three of these
things should be case-insensitive (which can't be said for "path" component of the URI), so
everything should get converted to lowercase in ObjectStore before the matching logic is applied.
    
    I think what we want is something like the following:
    
    metatool --updateLocation [--oldScheme=hdfs] --oldHost=127.0.0.1 [--oldPort=xyz] --newHost=yoyodyne
[--newPort=333] etc.
    
    Also, can you please add a --dryRun option that dumps logs the changes that would be made
to stdout but doesn't actually make any of the changes? This way people can first experiment
with the options without making any big mistakes.
    
    Finally, it's possible to run Hive on top of other filesystems besides HDFS (S3 is one
example), so it's probably best to avoid using the term "HDFS" in this code.



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23218>

    Please remove.



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23213>

    main() should not throw any checked exceptions. Please catch them and print appropriate
error messages.



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23212>

    I don't think moving this logic to run() really accomplishes anything. Please move it
back into main().



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23223>

    It looks like run() has two ways of signaling an error, either by returning a non-zero
value or by throwing an exception. Exceptions are semantically much richer than return codes
(e.g. you can wrap return codes in exceptions if you really want to) and are the right way
of handling error conditions in Java. Generally speaking the only method that should use an
error return code is main().



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23217>

    Options should start with a lowercase letter, e.g. "listHDFSRoots", "executeJDOQL", etc.



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23215>

    Formatting: else should go on the same line as '}'



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23214>

    A couple things:
    
    * Throwing java.lang.Exception is never the right thing to do. If you have to throw an
exception you need to throw a subclass of java.lang.Exception, or the appropriate runtime
exception if it's not something that you expect another caller further down the stack will
be able to handle.
    * If the command line args are incorrect or unrecognizable it's considerably more polite
to echo the unrecognizable part to stderr and then print the help screen. Right now when I
run metatool without an arguments I get an exception. It should print the help info instead.



metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java
<https://reviews.apache.org/r/6650/#comment23222>

    Please use e.getLocalizedMessage() instead. Also, it's generally a good idea to put the
catch() block as close to the thrower as possible. Wrapping parser.parse() with it's own try/catch
block would probably be cleaner.


- Carl Steinbach


On Aug. 25, 2012, 9 p.m., Shreepadma Venugopalan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6650/
> -----------------------------------------------------------
> 
> (Updated Aug. 25, 2012, 9 p.m.)
> 
> 
> Review request for hive and Carl Steinbach.
> 
> 
> Description
> -------
> 
> This patch implement hive metatool which,
> 
> * lets admins perform a HA upgrade by patching the location of the NN in Hive's metastore
> * allows JDOQL to be executed against the metastore.
> 
> 
> This addresses bug HIVE-3056.
>     https://issues.apache.org/jira/browse/HIVE-3056
> 
> 
> Diffs
> -----
> 
>   bin/ext/metatool.sh PRE-CREATION 
>   bin/metatool PRE-CREATION 
>   build.xml 6712af9 
>   conf/hive-default.xml.template cc1565f 
>   eclipse-templates/TestHiveMetaTool.launchtemplate PRE-CREATION 
>   metastore/ivy.xml 3011d2f 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 045b550 
>   metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java PRE-CREATION

>   metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaTool.java PRE-CREATION

> 
> Diff: https://reviews.apache.org/r/6650/diff/
> 
> 
> Testing
> -------
> 
> A new JUnit test - TestHiveMetaTool - has been added to test the various metatool options.
> 
> 
> Thanks,
> 
> Shreepadma Venugopalan
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message