hive-dev mailing list archives

From Ashutosh Chauhan <hashut...@apache.org>
Subject Re: Review Request 61009: Extend object store to store bit vectors
Date Sun, 23 Jul 2017 16:58:42 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61009/#review181182
-----------------------------------------------------------




common/src/java/org/apache/hadoop/hive/common/ndv/FMSketch.java
Line 175 (original), 162 (patched)
<https://reviews.apache.org/r/61009/#comment256635>

    Need to close this bos.
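
    A minimal sketch of one way to address this, assuming the stream is a
    ByteArrayOutputStream used to serialize the bit vector. The class and method
    names below are hypothetical and are not taken from the patch. A
    try-with-resources block closes the stream even if a write throws.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper for illustration only; not the FMSketch API.
final class BitVectorSerializer {
    // Writes each int as 4 big-endian bytes and returns the resulting array.
    static byte[] serialize(int[] bitVector) throws IOException {
        // Both resources are closed automatically, in reverse order,
        // whether the body completes normally or throws.
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             DataOutputStream out = new DataOutputStream(bos)) {
            for (int word : bitVector) {
                out.writeInt(word);
            }
            out.flush();
            return bos.toByteArray();
        }
    }
}
```

    Closing a ByteArrayOutputStream has no effect at runtime, but declaring it as
    a resource keeps static analysis quiet and stays correct if the stream type
    changes later.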



common/src/java/org/apache/hadoop/hive/common/ndv/FMSketch.java
Lines 171 (patched)
<https://reviews.apache.org/r/61009/#comment256637>

    ws



common/src/java/org/apache/hadoop/hive/common/ndv/FMSketch.java
Line 233 (original), 183 (patched)
<https://reviews.apache.org/r/61009/#comment256638>

    ws



common/src/java/org/apache/hadoop/hive/common/ndv/fm/FMSketchUtils.java
Lines 82 (patched)
<https://reviews.apache.org/r/61009/#comment256639>

    Need to close is.



common/src/java/org/apache/hadoop/hive/common/ndv/fm/FMSketchUtils.java
Lines 94 (patched)
<https://reviews.apache.org/r/61009/#comment256640>

    ws



common/src/java/org/apache/hadoop/hive/common/ndv/fm/FMSketchUtils.java
Lines 98 (patched)
<https://reviews.apache.org/r/61009/#comment256641>

    ws



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 1733 (patched)
<https://reviews.apache.org/r/61009/#comment256636>

    Better name: hive.stats.bitvector.fetch ?



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Line 1390 (original), 1393 (patched)
<https://reviews.apache.org/r/61009/#comment256644>

    CachedStore uses this function to load initial column stats, and it still doesn't fetch
bit vectors (if enabled in config).



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 1846 (patched)
<https://reviews.apache.org/r/61009/#comment256642>

    This cast of null is not necessary.
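
    For context, a small sketch of when a cast on null matters in Java. The
    methods below are hypothetical and unrelated to MetaStoreDirectSql. The cast
    is only needed to pick between overloads; when a single method accepts the
    argument, a bare null compiles and behaves the same.

```java
// Hypothetical methods for illustration only.
final class NullCastDemo {
    // Two overloads: a bare null would be ambiguous here, so a cast is required.
    static String describe(String s) { return "String"; }
    static String describe(Integer i) { return "Integer"; }

    // Single method: null can be passed directly, no cast needed.
    static int length(byte[] bytes) {
        return bytes == null ? -1 : bytes.length;
    }
}
```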



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
Lines 1949 (patched)
<https://reviews.apache.org/r/61009/#comment256643>

    ws



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
Lines 1970 (patched)
<https://reviews.apache.org/r/61009/#comment256645>

    ws



metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
Lines 1973 (patched)
<https://reviews.apache.org/r/61009/#comment256646>

    ws



metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DateColumnStatsAggregator.java
Lines 53 (patched)
<https://reviews.apache.org/r/61009/#comment256647>

    Log.debug



metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DateColumnStatsAggregator.java
Lines 92 (patched)
<https://reviews.apache.org/r/61009/#comment256648>

    Log.debug



metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DateColumnStatsAggregator.java
Lines 123 (patched)
<https://reviews.apache.org/r/61009/#comment256650>

    LOG.debug (Ndv estimation using bitvector + column name)



metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DateColumnStatsAggregator.java
Lines 145 (patched)
<https://reviews.apache.org/r/61009/#comment256649>

    Log.debug



metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java
Lines 53 (patched)
<https://reviews.apache.org/r/61009/#comment256663>

    LOG.debug



metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java
Lines 91 (patched)
<https://reviews.apache.org/r/61009/#comment256664>

    LOG.debug



metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java
Lines 117 (patched)
<https://reviews.apache.org/r/61009/#comment256665>

    LOG.debug (Ndv estimation using bitvector + column name + ndvvalue)
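
    A rough sketch of the kind of message meant here. It uses java.util.logging
    as a self-contained stand-in (Level.FINE playing the role of debug), not the
    logger the patch actually uses, and the class, method, and message text are
    assumptions.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for illustration only.
final class NdvDebugLog {
    private static final Logger LOG = Logger.getLogger(NdvDebugLog.class.getName());

    // Builds the message: estimation method, column name, and NDV value.
    static String message(String colName, long ndv) {
        return "NDV estimation using bit vector for column " + colName + " is " + ndv;
    }

    static void logEstimate(String colName, long ndv) {
        // Guard so the string is only built when debug-level logging is on.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(message(colName, ndv));
        }
    }
}
```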



metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java
Lines 125 (patched)
<https://reviews.apache.org/r/61009/#comment256666>

    LOG.debug



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DecimalColumnStatsAggregator.java
Lines 54 (patched)
<https://reviews.apache.org/r/61009/#comment256651>

    Log.debug.



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DecimalColumnStatsAggregator.java
Lines 93 (patched)
<https://reviews.apache.org/r/61009/#comment256652>

    Log.debug



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DecimalColumnStatsAggregator.java
Line 129 (original), 133 (patched)
<https://reviews.apache.org/r/61009/#comment256653>

    LOG.debug (Ndv estimation using bitvector + column name)



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DecimalColumnStatsAggregator.java
Lines 156 (patched)
<https://reviews.apache.org/r/61009/#comment256654>

    LOG.debug



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DoubleColumnStatsAggregator.java
Lines 52 (patched)
<https://reviews.apache.org/r/61009/#comment256655>

    LOG.debug



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DoubleColumnStatsAggregator.java
Lines 91 (patched)
<https://reviews.apache.org/r/61009/#comment256656>

    LOG.debug



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DoubleColumnStatsAggregator.java
Line 117 (original), 121 (patched)
<https://reviews.apache.org/r/61009/#comment256658>

    LOG.debug (Ndv estimation using bitvector + column name + ndvvalue)



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DoubleColumnStatsAggregator.java
Lines 143 (patched)
<https://reviews.apache.org/r/61009/#comment256657>

    LOG.debug



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/LongColumnStatsAggregator.java
Lines 52 (patched)
<https://reviews.apache.org/r/61009/#comment256659>

    LOG.debug



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/LongColumnStatsAggregator.java
Lines 91 (patched)
<https://reviews.apache.org/r/61009/#comment256660>

    LOG.debug



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/LongColumnStatsAggregator.java
Line 117 (original), 121 (patched)
<https://reviews.apache.org/r/61009/#comment256661>

    LOG.debug (Ndv estimation using bitvector + column name + ndvvalue)



metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/LongColumnStatsAggregator.java
Lines 143 (patched)
<https://reviews.apache.org/r/61009/#comment256662>

    LOG.debug



metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
Lines 51 (patched)
<https://reviews.apache.org/r/61009/#comment256667>

    I think we can declare this as byte[], and then callers of this can make sure they
convert string<->byte[]. This way the rest of the system still uses String, but storage in
the RDBMS (via this class) is byte[]; then declare a varbinary type in the sql script.
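
    A minimal sketch of the caller-side conversion, assuming the bit vector
    string is encoded as UTF-8. The charset and the helper names are assumptions,
    not something the patch specifies. An explicit charset keeps the round trip
    independent of the platform default.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical conversion helper for illustration only.
final class BitVectorCodec {
    // string -> byte[], done by the caller before handing the value to the model class.
    static byte[] toBytes(String bitVector) {
        return bitVector == null ? null : bitVector.getBytes(StandardCharsets.UTF_8);
    }

    // byte[] -> string, done by the caller after reading the value back.
    static String fromBytes(byte[] stored) {
        return stored == null ? null : new String(stored, StandardCharsets.UTF_8);
    }
}
```

    With this split, only the model class and the schema scripts see byte[];
    everything above them keeps working with String.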



metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
Lines 265 (patched)
<https://reviews.apache.org/r/61009/#comment256668>

    This will then return byte[], and the caller will do byte[]->string.



metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java
Lines 269 (patched)
<https://reviews.apache.org/r/61009/#comment256669>

    Caller needs to do string->byte[] before calling this.



metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java
Lines 49 (patched)
<https://reviews.apache.org/r/61009/#comment256670>

    I think we can declare this as byte[], and then callers of this can make sure they
convert string<->byte[]. This way the rest of the system still uses String, but storage in
the RDBMS (via this class) is byte[]; then declare a varbinary type in the sql script.



ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
Lines 789 (patched)
<https://reviews.apache.org/r/61009/#comment256672>

    Let's put this as the last column before comment. Having it in the middle distorts desc formatted
output.



ql/src/java/org/apache/hadoop/hive/ql/plan/DescTableDesc.java
Line 62 (original), 62 (patched)
<https://reviews.apache.org/r/61009/#comment256671>

    Let's put this as the last column before comment. Having it in the middle distorts desc formatted
output.


- Ashutosh Chauhan


On July 22, 2017, 9:19 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61009/
> -----------------------------------------------------------
> 
> (Updated July 22, 2017, 9:19 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-16997
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/ndv/FMSketch.java e20d29954a 
>   common/src/java/org/apache/hadoop/hive/common/ndv/NumDistinctValueEstimatorFactory.java e810ac5487
>   common/src/java/org/apache/hadoop/hive/common/ndv/fm/FMSketchUtils.java PRE-CREATION
>   common/src/java/org/apache/hadoop/hive/common/ndv/hll/HyperLogLog.java d1955468a6 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java df45f2cc32 
>   common/src/test/org/apache/hadoop/hive/common/ndv/fm/TestFMSketchSerialization.java PRE-CREATION
>   metastore/scripts/upgrade/derby/044-HIVE-16997.derby.sql PRE-CREATION 
>   metastore/scripts/upgrade/derby/hive-schema-3.0.0.derby.sql a9a532906f 
>   metastore/scripts/upgrade/derby/upgrade-2.3.0-to-3.0.0.derby.sql 30513dc882 
>   metastore/scripts/upgrade/mssql/029-HIVE-16997.mssql.sql PRE-CREATION 
>   metastore/scripts/upgrade/mssql/hive-schema-3.0.0.mssql.sql 1cfe2d1b2d 
>   metastore/scripts/upgrade/mssql/upgrade-2.3.0-to-3.0.0.mssql.sql 5683254b04 
>   metastore/scripts/upgrade/mysql/044-HIVE-16997.mysql.sql PRE-CREATION 
>   metastore/scripts/upgrade/mysql/hive-schema-3.0.0.mysql.sql 97d881f263 
>   metastore/scripts/upgrade/mysql/upgrade-2.3.0-to-3.0.0.mysql.sql ba62939809 
>   metastore/scripts/upgrade/oracle/044-HIVE-16997.oracle.sql PRE-CREATION 
>   metastore/scripts/upgrade/oracle/hive-schema-3.0.0.oracle.sql 8fdb552367 
>   metastore/scripts/upgrade/oracle/upgrade-2.3.0-to-3.0.0.oracle.sql 0a70d47cca 
>   metastore/scripts/upgrade/postgres/043-HIVE-16997.postgres.sql PRE-CREATION 
>   metastore/scripts/upgrade/postgres/hive-schema-3.0.0.postgres.sql 1cdeb6b45a 
>   metastore/scripts/upgrade/postgres/upgrade-2.3.0-to-3.0.0.postgres.sql c44dd067fc 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java a960b2d26b
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java b52c94c9fb
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java db4ec91cdb 
>   metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java 2dc2804343
>   metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java 3ac4fe1604
>   metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DateColumnStatsAggregator.java PRE-CREATION
>   metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java PRE-CREATION
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/StatsCache.java 0e119896a5
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/BinaryColumnStatsAggregator.java d81d612e92
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/BooleanColumnStatsAggregator.java e796df2422
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/ColumnStatsAggregator.java 29a05390bf
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/ColumnStatsAggregatorFactory.java 568bf0609b
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DecimalColumnStatsAggregator.java 8eb64e0143
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/DoubleColumnStatsAggregator.java b6b86123b2
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/IExtrapolatePartStatus.java af75bced72
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/LongColumnStatsAggregator.java 2da6f60167
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/StringColumnStatsAggregator.java 83c6c54fd2
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/BinaryColumnStatsMerger.java af0669eb65
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/BooleanColumnStatsMerger.java 33ff6a19f5
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/ColumnStatsMerger.java d3051a2b00
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/ColumnStatsMergerFactory.java c013ba5c5d
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/DateColumnStatsMerger.java e899bfe85f
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/DecimalColumnStatsMerger.java 4099ffcace
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/DoubleColumnStatsMerger.java 1691fc97df
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/LongColumnStatsMerger.java 361af350fe
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/stats/merge/StringColumnStatsMerger.java 8e28f907ee
>   metastore/src/model/org/apache/hadoop/hive/metastore/model/MPartitionColumnStatistics.java 2967a60fae
>   metastore/src/model/org/apache/hadoop/hive/metastore/model/MTableColumnStatistics.java 132f7a137b
>   metastore/src/model/package.jdo 9c4bc219f2 
>   metastore/src/test/org/apache/hadoop/hive/metastore/TestOldSchema.java PRE-CREATION
>   metastore/src/test/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java 1fa9447145
>   metastore/src/test/org/apache/hadoop/hive/metastore/hbase/TestHBaseAggregateStatsCacheWithBitVector.java ecc99c3300
>   metastore/src/test/org/apache/hadoop/hive/metastore/hbase/TestHBaseAggregateStatsExtrapolation.java 99ce96ca0d
>   metastore/src/test/org/apache/hadoop/hive/metastore/hbase/TestHBaseAggregateStatsNDVUniformDist.java 74e16695a9
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 97bf839ae1 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java aa77234c28
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ColStatistics.java 41a1c7a582 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/DescTableDesc.java d7a9888389 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 2d56950cb1
>   ql/src/test/queries/clientpositive/bitvector.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/fm-sketch.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/hll.q edfdce8a29 
>   ql/src/test/results/clientpositive/alterColumnStats.q.out 519a62a190 
>   ql/src/test/results/clientpositive/alterColumnStatsPart.q.out 672bd9f4bb 
>   ql/src/test/results/clientpositive/alter_partition_update_status.q.out c0d4eeefb4 
>   ql/src/test/results/clientpositive/alter_table_column_stats.q.out 96dce1e2c5 
>   ql/src/test/results/clientpositive/alter_table_update_status.q.out 9cd9a8dbe0 
>   ql/src/test/results/clientpositive/analyze_tbl_part.q.out 6a3fbc0cc7 
>   ql/src/test/results/clientpositive/autoColumnStats_5.q.out e3abba5bd0 
>   ql/src/test/results/clientpositive/autoColumnStats_9.q.out 06f23b1e7c 
>   ql/src/test/results/clientpositive/avro_decimal.q.out e1045ebea1 
>   ql/src/test/results/clientpositive/avro_decimal_native.q.out b73b5f5679 
>   ql/src/test/results/clientpositive/bitvector.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/char_udf1.q.out fefc7407e0 
>   ql/src/test/results/clientpositive/colstats_all_nulls.q.out 0f2822504f 
>   ql/src/test/results/clientpositive/column_names_with_leading_and_trailing_spaces.q.out fb833bccb2
>   ql/src/test/results/clientpositive/column_pruner_multiple_children.q.out 9925928da7
>   ql/src/test/results/clientpositive/columnstats_partlvl.q.out 5ecb20501b 
>   ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out a64c76badf 
>   ql/src/test/results/clientpositive/columnstats_tbllvl.q.out 91c8f150a2 
>   ql/src/test/results/clientpositive/compustat_avro.q.out 2f8dc10e50 
>   ql/src/test/results/clientpositive/compute_stats_date.q.out 5cd2180108 
>   ql/src/test/results/clientpositive/compute_stats_decimal.q.out fcfce78b82 
>   ql/src/test/results/clientpositive/compute_stats_double.q.out e6a087dd98 
>   ql/src/test/results/clientpositive/compute_stats_long.q.out fb985d8266 
>   ql/src/test/results/clientpositive/compute_stats_string.q.out a5d66eba31 
>   ql/src/test/results/clientpositive/confirm_initial_tbl_stats.q.out 5593e422b6 
>   ql/src/test/results/clientpositive/decimal_stats.q.out f58a7cc8e1 
>   ql/src/test/results/clientpositive/deleteAnalyze.q.out 1bae859e2c 
>   ql/src/test/results/clientpositive/describe_syntax.q.out 19147a1d92 
>   ql/src/test/results/clientpositive/describe_table.q.out 3ba9a7b942 
>   ql/src/test/results/clientpositive/display_colstats_tbllvl.q.out 73d4cd7660 
>   ql/src/test/results/clientpositive/encrypted/encryption_move_tbl.q.out 1096e9fc64 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_full.q.out b212da907b 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_partial.q.out b5f4feede0
>   ql/src/test/results/clientpositive/fm-sketch.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/hll.q.out b9357c3043 
>   ql/src/test/results/clientpositive/llap/autoColumnStats_2.q.out f29f7b5d1a 
>   ql/src/test/results/clientpositive/llap/column_names_with_leading_and_trailing_spaces.q.out fb833bccb2
>   ql/src/test/results/clientpositive/llap/columnstats_part_coltype.q.out 5e647433f1 
>   ql/src/test/results/clientpositive/llap/deleteAnalyze.q.out 5db87d97cf 
>   ql/src/test/results/clientpositive/llap/extrapolate_part_stats_partial_ndv.q.out 6bc1970ad0
>   ql/src/test/results/clientpositive/llap/llap_smb.q.out 87b33db805 
>   ql/src/test/results/clientpositive/llap/stats_only_null.q.out 57aaf557b2 
>   ql/src/test/results/clientpositive/llap/varchar_udf1.q.out 2e9d88e343 
>   ql/src/test/results/clientpositive/llap/vector_udf1.q.out 9a164fe130 
>   ql/src/test/results/clientpositive/partial_column_stats.q.out 87d47dae22 
>   ql/src/test/results/clientpositive/partition_coltype_literals.q.out d459b36ff0 
>   ql/src/test/results/clientpositive/reduceSinkDeDuplication_pRS_key_empty.q.out 4bddd3bef8
>   ql/src/test/results/clientpositive/rename_external_partition_location.q.out 19546c38bc
>   ql/src/test/results/clientpositive/rename_table_update_column_stats.q.out 16b3a38c46
>   ql/src/test/results/clientpositive/spark/avro_decimal_native.q.out b73b5f5679 
>   ql/src/test/results/clientpositive/spark/stats_only_null.q.out 359eea3acb 
>   ql/src/test/results/clientpositive/stats_only_null.q.out 88c2114356 
>   ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out ad92058cab
>   ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out 626e1fd4d0 
>   ql/src/test/results/clientpositive/tunable_ndv.q.out 437beafc0d 
> 
> 
> Diff: https://reviews.apache.org/r/61009/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>

