hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sergey Shelukhin <ser...@hortonworks.com>
Subject Re: Review Request 45238: HIVE-9660 store end offset of compressed data for RG in RowIndex in ORC
Date Fri, 22 Apr 2016 19:22:13 GMT


> On April 12, 2016, 12:52 a.m., Prasanth_J wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 1205
> > <https://reviews.apache.org/r/45238/diff/7/?file=1339688#file1339688line1205>
> >
> >     Can you make the prefix consistent? rename
> >     hive.exec.orc.disk.range.use.lengths? As disk range computation is the only
place where this lengths are used.

added exec; I don't think adding disk range is a good idea; that's the only place where you
could use them in the first place. Also if someone does come up with different usage it should
be the same variable.


> On April 12, 2016, 12:52 a.m., Prasanth_J wrote:
> > orc/src/java/org/apache/orc/impl/BitFieldWriter.java, line 44
> > <https://reviews.apache.org/r/45238/diff/7/?file=1339696#file1339696line44>
> >
> >     I don't understand 2 things here.
> >     1) flushedValuesDuringWriter is always set to -1 in the second assignment
> >     2) bitsLeft will always be 0. See below writeByte is called when bitsLeft ==
0. In another place bitsLeft is set to 0 before calling writeByte
> >     
> >     This seems to be redundant. Is that correct?

(1) is correct. We just save it when the write call is executing, if someone wants to find
out the number of values then; the normal formula may break during write.
(2) is not; flush calls it without bitsLeft being equal to 0


> On April 12, 2016, 12:52 a.m., Prasanth_J wrote:
> > orc/src/java/org/apache/orc/impl/BitFieldWriter.java, line 66
> > <https://reviews.apache.org/r/45238/diff/7/?file=1339696#file1339696line66>
> >
> >     this is not needed as writeByte() sets bitsLeft=8 anyways.

needed for writeBytes to update the value correctly (see the response above)


> On April 12, 2016, 12:52 a.m., Prasanth_J wrote:
> > orc/src/java/org/apache/orc/impl/RunLengthByteWriter.java, line 67
> > <https://reviews.apache.org/r/45238/diff/7/?file=1339699#file1339699line67>
> >
> >     Can this be moved outside the condition? Every write() call is going to increment
this, so better to update this value once outside the condition.

That is not quite true. Added a comment.


> On April 12, 2016, 12:52 a.m., Prasanth_J wrote:
> > orc/src/java/org/apache/orc/impl/WriterImpl.java, line 668
> > <https://reviews.apache.org/r/45238/diff/7/?file=1339703#file1339703line668>
> >
> >     Should this come at the top? or even before this method invocation? I guess
the removeIsPresentPositions will be called regardless if isTracking is enabled or disabled.
Is that correct?

Yes


> On April 12, 2016, 12:52 a.m., Prasanth_J wrote:
> > orc/src/java/org/apache/orc/impl/WriterImpl.java, line 911
> > <https://reviews.apache.org/r/45238/diff/7/?file=1339703#file1339703line911>
> >
> >     Should we iterate only if isTrackingLengths is true?

Yes, because the trackers can also be used for the dictionary. Added a comment.


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45238/#review128268
-----------------------------------------------------------


On April 11, 2016, 7:16 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45238/
> -----------------------------------------------------------
> 
> (Updated April 11, 2016, 7:16 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 95c5c0e 
>   data/conf/hive-log4j2.properties 6bace1f 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
fb0867d 
>   llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcStripeMetadata.java
82187bd 
>   orc/src/gen/protobuf-java/org/apache/orc/OrcProto.java 24715c3 
>   orc/src/java/org/apache/orc/OrcConf.java 6fcbb72 
>   orc/src/java/org/apache/orc/OrcFile.java 85506ff 
>   orc/src/java/org/apache/orc/TypeDescription.java bd900ac 
>   orc/src/java/org/apache/orc/impl/BitFieldWriter.java aa5f886 
>   orc/src/java/org/apache/orc/impl/IntegerWriter.java 419054f 
>   orc/src/java/org/apache/orc/impl/OutStream.java 81662cc 
>   orc/src/java/org/apache/orc/impl/RunLengthByteWriter.java 09108b2 
>   orc/src/java/org/apache/orc/impl/RunLengthIntegerWriter.java 3e5f2e2 
>   orc/src/java/org/apache/orc/impl/RunLengthIntegerWriterV2.java fab2801 
>   orc/src/java/org/apache/orc/impl/SerializationUtils.java 2e5a59b 
>   orc/src/java/org/apache/orc/impl/WriterImpl.java f8afe06 
>   orc/src/protobuf/orc_proto.proto f4935b4 
>   orc/src/test/org/apache/orc/impl/TestBitFieldReader.java e4c6f6b 
>   orc/src/test/org/apache/orc/impl/TestBitPack.java f2d3d64 
>   orc/src/test/org/apache/orc/impl/TestInStream.java 9e65345 
>   orc/src/test/org/apache/orc/impl/TestIntegerCompressionReader.java 399f35e 
>   orc/src/test/org/apache/orc/impl/TestOutStream.java e9614d5 
>   orc/src/test/org/apache/orc/impl/TestRunLengthByteReader.java a14bef1 
>   orc/src/test/org/apache/orc/impl/TestRunLengthIntegerReader.java 28239ba 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 8aca779 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/PostExecOrcFileDump.java d5d1370 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 9c2f88f 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/JsonFileDump.java 00de545 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java a031a92 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java 3975409 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderUtils.java 8a73948 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 4d09dcd 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java f4cfa53

>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/ReaderImpl.java 4856fb3 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 85923a8 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestJsonFileDump.java acf232d 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java 6803abd 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestStringDictionary.java 41a211b 
>   ql/src/test/queries/clientpositive/orc_lengths.q PRE-CREATION 
>   ql/src/test/resources/orc-file-dump-bloomfilter.out 18fd2fb 
>   ql/src/test/resources/orc-file-dump-bloomfilter2.out fa5cc2d 
>   ql/src/test/resources/orc-file-dump-dictionary-threshold.out 17a964b 
>   ql/src/test/resources/orc-file-dump.json bf654a1 
>   ql/src/test/resources/orc-file-dump.out 70f7fbd 
>   ql/src/test/resources/orc-file-has-null.out e98a73f 
>   ql/src/test/results/clientpositive/acid_globallimit.q.out d43ed42 
>   ql/src/test/results/clientpositive/alter_merge_orc.q.out b5a6d04 
>   ql/src/test/results/clientpositive/alter_merge_stats_orc.q.out 0d5ba01 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out 131cf6a 
>   ql/src/test/results/clientpositive/annotate_stats_table.q.out 6db4ded 
>   ql/src/test/results/clientpositive/columnStatsUpdateForStatsOptimizer_2.q.out 179bc66

>   ql/src/test/results/clientpositive/dynpart_sort_opt_vectorization.q.out d03bfe4 
>   ql/src/test/results/clientpositive/dynpart_sort_optimization2.q.out 3b24a2e 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_full.q.out a30c356 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_partial.q.out 4e589b8 
>   ql/src/test/results/clientpositive/extrapolate_part_stats_partial_ndv.q.out 3185f70

>   ql/src/test/results/clientpositive/llap/llap_nullscan.q.out e17d721 
>   ql/src/test/results/clientpositive/orc_analyze.q.out 87855fa 
>   ql/src/test/results/clientpositive/orc_file_dump.q.out a97f6de 
>   ql/src/test/results/clientpositive/orc_lengths.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/orc_llap.q.out 6fc73b7 
>   ql/src/test/results/clientpositive/orc_merge10.q.out cf70dcf 
>   ql/src/test/results/clientpositive/orc_merge11.q.out 8a4d8e9 
>   ql/src/test/results/clientpositive/orc_merge12.q.out f23be5a 
>   ql/src/test/results/clientpositive/schema_evol_stats.q.out 63dab2e 
>   ql/src/test/results/clientpositive/spark/alter_merge_orc.q.out b5a6d04 
>   ql/src/test/results/clientpositive/spark/alter_merge_stats_orc.q.out 0d5ba01 
>   ql/src/test/results/clientpositive/spark/bucket5.q.out 5baf054 
>   ql/src/test/results/clientpositive/spark/infer_bucket_sort_map_operators.q.out a343d93

>   ql/src/test/results/clientpositive/spark/infer_bucket_sort_reducers_power_two.q.out
d30c1f0 
>   ql/src/test/results/clientpositive/spark/list_bucket_dml_10.q.java1.7.out 6b3c375 
>   ql/src/test/results/clientpositive/spark/orc_merge1.q.out 86df0a7 
>   ql/src/test/results/clientpositive/spark/orc_merge2.q.out b7f1a65 
>   ql/src/test/results/clientpositive/spark/orc_merge_diff_fs.q.out 86df0a7 
>   ql/src/test/results/clientpositive/spark/reduce_deduplicate.q.out 83988d3 
>   ql/src/test/results/clientpositive/spark/vectorized_ptf.q.out f0a4444 
>   ql/src/test/results/clientpositive/tez/alter_merge_orc.q.out b5a6d04 
>   ql/src/test/results/clientpositive/tez/alter_merge_stats_orc.q.out 0d5ba01 
>   ql/src/test/results/clientpositive/tez/dynpart_sort_opt_vectorization.q.out a90e3f6

>   ql/src/test/results/clientpositive/tez/dynpart_sort_optimization2.q.out 97f59d9 
>   ql/src/test/results/clientpositive/tez/explainuser_1.q.out c70f104 
>   ql/src/test/results/clientpositive/tez/explainuser_3.q.out f4e21bd 
>   ql/src/test/results/clientpositive/tez/llap_nullscan.q.out 39f04ea 
>   ql/src/test/results/clientpositive/tez/orc_analyze.q.out 87855fa 
>   ql/src/test/results/clientpositive/tez/orc_merge10.q.out bcba1bd 
>   ql/src/test/results/clientpositive/tez/orc_merge11.q.out 8a4d8e9 
>   ql/src/test/results/clientpositive/tez/orc_merge12.q.out f23be5a 
>   ql/src/test/results/clientpositive/tez/schema_evol_stats.q.out d396a61 
>   ql/src/test/results/clientpositive/tez/union_fast_stats.q.out 578205e 
>   ql/src/test/results/clientpositive/tez/vectorized_ptf.q.out 3d1f22f 
>   ql/src/test/results/clientpositive/union_fast_stats.q.out f0879af 
>   ql/src/test/results/clientpositive/vectorized_ptf.q.out 3b17591 
> 
> Diff: https://reviews.apache.org/r/45238/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message