hive-dev mailing list archives

From "Brock Noland" <br...@cloudera.com>
Subject Re: Review Request 28964: HIVE-8121 Create micro-benchmarks for ParquetSerde and evaluate performance
Date Sun, 11 Jan 2015 21:29:07 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28964/#review67616
-----------------------------------------------------------


Nice work Sergio!!

I know that it doesn't fit perfectly into the JMH model, but I think we have to write a non-trivial
number of records, such as 1000 rows, in order to get much benefit. Can we try that?


itests/hive-jmh/pom.xml
<https://reviews.apache.org/r/28964/#comment111684>

    It looks like in this file 1 tab = 4 spaces, whereas in Hive I think we typically use
1 tab = 2 spaces.



itests/hive-jmh/src/main/java/org/apache/hive/benchmark/storage/ColumnarStorageBench.java
<https://reviews.apache.org/r/28964/#comment111683>

    During class initialization let's create an array of 100 random values for each type and
then we can iterate through that array for each call to this method.
    
    Otherwise columnar formats will achieve unrealistic compression from storing the same
values over and over. For example, both Parquet and ORC should be able to collapse a column
consisting only of the integer 1 to a trivial amount of data.
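
    A minimal sketch of what I mean, not taken from the patch: the class name
RandomValuePool, its methods, and the fixed seed are all hypothetical, and only three example
types are shown. The pool is filled once, and each row then picks its value by index.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical helper (not part of the patch under review): pre-generates a
// small pool of random values per type so repeated rows are not identical.
final class RandomValuePool {
  private static final int POOL_SIZE = 100;

  private final int[] ints = new int[POOL_SIZE];
  private final double[] doubles = new double[POOL_SIZE];
  private final String[] strings = new String[POOL_SIZE];

  // A fixed seed keeps benchmark runs comparable; the seed value is arbitrary.
  RandomValuePool(long seed) {
    Random random = new Random(seed);
    for (int i = 0; i < POOL_SIZE; i++) {
      ints[i] = random.nextInt();
      doubles[i] = random.nextDouble();
      strings[i] = Long.toHexString(random.nextLong());
    }
  }

  // Each accessor cycles through the pool: rows 0, 100, 200, ... share a value.
  int intAt(int row) {
    return ints[Math.floorMod(row, POOL_SIZE)];
  }

  double doubleAt(int row) {
    return doubles[Math.floorMod(row, POOL_SIZE)];
  }

  String stringAt(int row) {
    return strings[Math.floorMod(row, POOL_SIZE)];
  }

  public static void main(String[] args) {
    RandomValuePool pool = new RandomValuePool(42L);
    Set<Integer> distinct = new HashSet<>();
    // Simulate producing 1000 rows and count how many distinct ints appear.
    for (int row = 0; row < 1000; row++) {
      distinct.add(pool.intAt(row));
    }
    System.out.println("distinct int values over 1000 rows: " + distinct.size());
  }
}
```

    The pool is built in the constructor, so if it is created during class or benchmark-state
initialization the random generation stays out of the measured code path.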


- Brock Noland


On Jan. 9, 2015, 6:38 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28964/
> -----------------------------------------------------------
> 
> (Updated Jan. 9, 2015, 6:38 p.m.)
> 
> 
> Review request for hive, Brock Noland and cheng xu.
> 
> 
> Bugs: HIVE-8121
>     https://issues.apache.org/jira/browse/HIVE-8121
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This is a new tool used to test ORC & PARQUET file format performance.
> 
> 
> Diffs
> -----
> 
>   itests/hive-jmh/pom.xml PRE-CREATION 
>   itests/hive-jmh/src/main/java/org/apache/hive/benchmark/storage/ColumnarStorageBench.java PRE-CREATION 
>   itests/pom.xml 0a154d6eb8c119e4e6419777c28b59b9d2108ba0 
> 
> Diff: https://reviews.apache.org/r/28964/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>

