hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zoltan Haindrich <k...@rxd.hu>
Subject Re: Review Request 59408: HIVE-16719 HiveMetaTool fails when the data does not fit in memory
Date Thu, 25 May 2017 23:51:30 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59408/#review176142
-----------------------------------------------------------




metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterable.java
Lines 8 (patched)
<https://reviews.apache.org/r/59408/#comment249480>

    it seems to me that this "iterable" is not a real iterable; as it can't restart the iteration



metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterator.java
Lines 33 (patched)
<https://reviews.apache.org/r/59408/#comment249486>

    If committed is true - but this commitTransaction returns false...
    
    I think committed should stay true in that case



metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterator.java
Lines 42 (patched)
<https://reviews.apache.org/r/59408/#comment249494>

    nextValue belongs to the 'current' transaction; I think this code may probably miss the
last entry's changes in every transactional block



metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterator.java
Lines 53 (patched)
<https://reviews.apache.org/r/59408/#comment249490>

    I'm not sure...but I've seen some silent mode somewhere in the options...



metastore/src/java/org/apache/hadoop/hive/metastore/metatool/EntityUpdater.java
Lines 1 (patched)
<https://reviews.apache.org/r/59408/#comment249497>

    could you add asf headers to the new files?



metastore/src/java/org/apache/hadoop/hive/metastore/metatool/EntityUpdater.java
Lines 16 (patched)
<https://reviews.apache.org/r/59408/#comment249488>

    RetrieverIterable's iterablity is only used here for a 'for' loop...I think the interface
and the iterable just makes it a bit more trickier...standard hasNext() would be simpler



metastore/src/java/org/apache/hadoop/hive/metastore/metatool/EntityUpdater.java
Lines 25 (patched)
<https://reviews.apache.org/r/59408/#comment249483>

    if an error have happend..an exception is going out at this point...why should it be marked
on the locationupdater? I mean...the exception already describes the situation
    
    It seems like errorHappened only controls a log message - or I might have missed something



metastore/src/test/org/apache/hadoop/hive/metastore/metatool/EntityUpdaterTest.java
Lines 1 (patched)
<https://reviews.apache.org/r/59408/#comment249482>

    I think tests ending with 'Test' are also executed...however there is a note in the contribution
guide about starting the name with Test  :)
    https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-JavaUnitTest



metastore/src/test/org/apache/hadoop/hive/metastore/tools/HiveMetaToolTest.java
Lines 124 (patched)
<https://reviews.apache.org/r/59408/#comment249481>

    there might be an alternative to this check by using Set<String>'s instead of the
List...and use assertEquals


I'm not sure...but it might probably be less trickier to try with a dataProvider.runVisit(myLocationUpdater)
pattern, this might be probably fit as objectStore.run??Visit() or not...and by doing it that
way you may probably be able control the dry-run behaviour by just not committing the changes
in the visit runner...

I think it would be ok to run a dry-run prior to executing the real thing...I know it's double
work; but it may probably reduce the chance of data corruption..

- Zoltan Haindrich


On May 19, 2017, 5:05 p.m., Zsolt Fekete wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59408/
> -----------------------------------------------------------
> 
> (Updated May 19, 2017, 5:05 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Currently HiveMetaTool reads full tables (as DataNucleus entities) into memory by calling
PersistenceManager's retrieveAll().
> 
> See these methods of ObjectStore: updateMDatabaseURI, updateTblPropURI, updateMStorageDescriptorTblPropURI,
updateMStorageDescriptorTblURI, updateSerdeURI.
> 
> This might cause failure when the affected tables (SDS, DBS, TABLE_PARAMS, SD_PARAMS,
SERDES, SERDE_PARAMS) are too big.
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java b28983f 
>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterable.java
PRE-CREATION 
>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterator.java
PRE-CREATION 
>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/EntityUpdater.java PRE-CREATION

>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/IDataProvider.java PRE-CREATION

>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/LocationEntity.java PRE-CREATION

>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/LocationEntityImplementations.java
PRE-CREATION 
>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/LocationUpdater.java PRE-CREATION

>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/RetrieverIterable.java
PRE-CREATION 
>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/ReturnValue.java PRE-CREATION

>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/UpdateParams.java PRE-CREATION

>   metastore/src/java/org/apache/hadoop/hive/metastore/metatool/UriUpdateChecker.java
PRE-CREATION 
>   metastore/src/java/org/apache/hadoop/hive/metastore/tools/HiveMetaTool.java 22e246f

>   metastore/src/test/org/apache/hadoop/hive/metastore/metatool/BlockRetrieverIterableTest.java
PRE-CREATION 
>   metastore/src/test/org/apache/hadoop/hive/metastore/metatool/DataProviderStub.java
PRE-CREATION 
>   metastore/src/test/org/apache/hadoop/hive/metastore/metatool/EntityUpdaterTest.java
PRE-CREATION 
>   metastore/src/test/org/apache/hadoop/hive/metastore/metatool/LocationEntityImplementationsTest.java
PRE-CREATION 
>   metastore/src/test/org/apache/hadoop/hive/metastore/metatool/LocationUpdaterTest.java
PRE-CREATION 
>   metastore/src/test/org/apache/hadoop/hive/metastore/metatool/ReturnValueTest.java PRE-CREATION

>   metastore/src/test/org/apache/hadoop/hive/metastore/metatool/UriUpdateCheckerTest.java
PRE-CREATION 
>   metastore/src/test/org/apache/hadoop/hive/metastore/tools/HiveMetaToolTest.java PRE-CREATION

> 
> 
> Diff: https://reviews.apache.org/r/59408/diff/2/
> 
> 
> Testing
> -------
> 
> The new tests passed:
> mvn test -Dtest=BlockRetrieverIterableTest,EntityUpdaterTest,UriUpdateCheckerTest,LocationUpdaterTest,ReturnValueTest,LocationEntityImplementationsTest,IntegrationTest
> 
> 
> Thanks,
> 
> Zsolt Fekete
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message