parquet-commits mailing list archives

From ga...@apache.org
Subject [parquet-mr] branch bloom-filter updated (dd7e655 -> 1fc2733)
Date Tue, 24 Sep 2019 05:07:09 GMT
This is an automated email from the ASF dual-hosted git repository.

gabor pushed a change to branch bloom-filter
in repository https://gitbox.apache.org/repos/asf/parquet-mr.git.


    from dd7e655  PARQUET-1391: Integrate Bloom filter logic (#619)
     add ee97f23  PARQUET-1498: Add instructions to install thrift via homebrew (#595)
     add 4b40d96  PARQUET-1502: Convert FIXED_LEN_BYTE_ARRAY to arrow type in logicalTypeAnnotation if it is not null (#593)
     add 354fcc2  [PARQUET-1506] Migrate  maven-thrift-plugin to thrift-maven-plugin (#600)
     add f36dd08  [PARQUET-1500] Replace Closeables with try-with-resources (#597)
     add 1e62e2e  PARQUET-1503: Remove Ints Utility Class (#598)
     add d1e9f15  PARQUET-1513: Update HiddenFileFilter to avoid extra startsWith (#606)
     add 00a7a47  PARQUET-1504: Add an option to convert Int96 to Arrow Timestamp (#594)
     add ddc7747  PARQUET-1509: Note Hive deprecation in README. (#602)
     add d9a1962  PARQUET-1510: Fix notEq for optional columns with null values. (#603)
     add 9d1006f  [PARQUET-1507] Bump Apache Thrift to 0.12.0 (#601)
     add 1b103da  PARQUET-1518: Use Jackson2 version 2.9.8 in parquet-cli (#609)
     add 51c4cc3  PARQUET-138: Allow merging more restrictive field in less restrictive field (#550)
     add 3537c88  Add javax.annotation-api dependency for JDK >= 9 (#604)
     add 82935e6  PARQUET-1470: Inputstream leakage in ParquetFileWriter.appendFile (#611)
     add 5bd1265  PARQUET-1514: ParquetFileWriter Records Compressed Bytes instead of Uncompressed Bytes (#607)
     add 714bb45  PARQUET-1505: Use Java 7 NIO StandardCharsets (#599)
     add 6901a20  PARQUET-1480 INT96 to avro not yet implemented error should mention deprecation (#579)
     add 7dcdcdc  PARQUET-1485: Fix Snappy direct memory leak (#581)
     add 9461845  PARQUET-1527:  [parquet-tools] cat command throw java.lang.ClassCastException (#612)
     add dcfd53a  PARQUET-1529: Shade fastutil in all modules where used (#617)
     add f2c5b9a  Update CHANGES.md for 1.11.0rc4
     add 22a9f54  [maven-release-plugin] prepare release apache-parquet-1.11.0
     add 4cc22dd  [maven-release-plugin] prepare for next development iteration
     add f799893  PARQUET-1533: TestSnappy() throws OOM exception with Parquet-1485 change (#622)
     add ab42fe5  Revert "PARQUET-1381: Add merge blocks command to parquet-tools (#512)" (#621)
     add 892dedb  PARQUET-1531: Page row count limit causes empty pages to be written from MessageColumnIO (#620)
     add acaf9e6  Update CHANGES.md for 1.11.0rc5
     add e85dbd3  [maven-release-plugin] prepare release apache-parquet-1.11.0
     add ac65040  [maven-release-plugin] prepare for next development iteration
     add 3e3049d  PARQUET-1544: Possible over-shading of modules (#628)
     add 9bf2517  Update CHANGES.md for 1.11.0rc6
     add 9756b0e  [maven-release-plugin] prepare release apache-parquet-1.11.0
     add 811310d  [maven-release-plugin] prepare for next development iteration
     add c5561be  Correct brew formula for thirft (#626)
     add 62dcc68  PARQUET-1557: Replace deprecated Apache Avro methods (#633)
     add 12f3fd2  PARQUET-1558: Use try-with-resource in Apache Avro tests (#634)
     add 2c257f0  PARQUET-1557: Replace deprecated Apache Avro methods (#636)
     add 63cfb56  PARQUET-1555: Bump snappy-java to 1.1.7.3 (#632)
     add 222b091  PARQUET-1585: Update old external links in the code base (#644)
     add 96a0148  PARQUET-1577 Remove duplicate license (#640)
     add c3e1a84  PARQUET-1536: [parquet-cli] Add simple tests for each command (#625)
     add f1a719b  PARQUET-1534:  [parquet-cli] IllegalArgumentException on Windows (#627)
     add 882cbf8  PARQUET-1579 Add Github PR template (#642)
     add 1e5fda5  PARQUET-1441: SchemaParseException: Can't redefine: list in AvroIndexedRecordConverter (#560)
     add 21c45ed  Merge branch 'master' into bloom-filter
     add 54777b4  PARQUET-1556 Use try-with-resource in Apache Avro tests (#639)
     add 9d6fb45  PARQUET-1576 Bump Apache Avro to 1.9.0 (#638)
     add 47398be  PARQUET-1375: Upgrade to Jackson 2.9.9 (#616)
     add 0b909ab  PARQUET-1550: CleanUtil does not work in Java 11 (#654)
     add f7e74ba  PARQUET-1604: Bump fastutil from 7.0.13 to 8.2.3 (#655)
     add 8ff867a  PARQUET-1615: getRecordWriter shouldn't hardcode CREAT mode when new ParquetFileWriter (#660)
     add 1487456  PARQUET-1552: upgrade protoc-jar-maven-plugin to 3.8.0 to fix proxy issue (#659)
     add 0861ddf  PARQUET-1616: Enable Maven batch mode (#661)
     add b34b077  PARQUET-1488: UserDefinedPredicate throw NPE (#663)
     add 54d3703  PARQUET-1600: Fix shebang in parquet-benchmarks/run.sh (#651)
     add 14958d4  PARQUET-1606: Fix invalid tests scope (#657)
     add fcc5d1a  PARQUET-1580: Page-level CRC checksum verfication for DataPageV1 (#647)
     add 347178e  PARQUET-1605: Bump maven-javadoc-plugin from 2.9 to 3.1.0 (#656)
     add 93af6b4  PARQUET-1303 correct ClassCastException for Avro @Stringable fields (#482)
     add 8ab68bc  Revert "PARQUET-1605: Bump maven-javadoc-plugin from 2.9 to 3.1.0 (#656)"
     add 0d9bad5  PARQUET-1637: Builds are failing because default jdk changed to openjdk11 on Travis (#665)
     add 340d157  PARQUET-1530: Remove Dependency on commons-codec (#618)
     add 14c1e81  PARQUET-1445: Remove Files.java (#584)
     add be00486  PARQUET-1607: Remove duplicate maven-enforcer-plugin (#658)
     add 1fc2733  Merge branch 'master' into bloom-filter

No new revisions were added by this update.

Summary of changes:
 .github/PULL_REQUEST_TEMPLATE.md                   |  26 +
 .travis.yml                                        |   3 +-
 CHANGES.md                                         |  40 +-
 README.md                                          |  17 +-
 dev/travis-before_install-bloom-filter.sh          |   3 +-
 dev/travis-before_install.sh                       |   8 +-
 parquet-arrow/pom.xml                              |   2 +-
 .../parquet/arrow/schema/SchemaConverter.java      |  28 +-
 .../parquet/arrow/schema/TestSchemaConverter.java  |  42 ++
 parquet-avro/pom.xml                               |  25 +
 .../parquet/avro/AvroIndexedRecordConverter.java   |   2 +-
 .../apache/parquet/avro/AvroRecordConverter.java   |   2 +-
 .../apache/parquet/avro/AvroSchemaConverter.java   |  29 +-
 .../org/apache/parquet/avro/AvroWriteSupport.java  |   4 +-
 .../java/org/apache/parquet/avro/AvroTestUtil.java |  39 +-
 .../parquet/avro/TestArrayCompatibility.java       |   9 +-
 .../parquet/avro/TestAvroSchemaConverter.java      |  85 +++-
 .../parquet/avro/TestGenericLogicalTypes.java      |  18 -
 .../apache/parquet/avro/TestInputOutputFormat.java |  20 +-
 .../org/apache/parquet/avro/TestReadWrite.java     | 392 +++++++-------
 .../parquet/avro/TestReadWriteOldListBehavior.java | 267 +++++-----
 .../parquet/avro/TestReflectInputOutputFormat.java |  74 +--
 .../parquet/avro/TestReflectLogicalTypes.java      |  25 +-
 .../apache/parquet/avro/TestReflectReadWrite.java  |  56 +-
 .../avro/TestSpecificInputOutputFormat.java        |  74 +--
 .../apache/parquet/avro/TestSpecificReadWrite.java | 151 +++---
 .../apache/parquet/avro/TestStringBehavior.java    |  77 +--
 parquet-avro/src/test/resources/nested_array.avsc  |  39 ++
 parquet-benchmarks/run.sh                          |   3 +-
 .../run_checksums.sh                               |  29 +-
 .../apache/parquet/benchmarks/BenchmarkFiles.java  |  22 +
 .../benchmarks/NestedNullWritingBenchmarks.java    | 151 ++++++
 .../benchmarks/PageChecksumDataGenerator.java      | 127 +++++
 .../benchmarks/PageChecksumReadBenchmarks.java     | 179 +++++++
 .../benchmarks/PageChecksumWriteBenchmarks.java    | 160 ++++++
 parquet-cascading/pom.xml                          |   2 +-
 parquet-cascading3/pom.xml                         |  26 +-
 parquet-cli/pom.xml                                |  16 +-
 .../java/org/apache/parquet/cli/BaseCommand.java   |  23 +-
 .../src/main/java/org/apache/parquet/cli/Util.java |  17 +-
 .../java/org/apache/parquet/cli/json/AvroJson.java |  15 +-
 .../apache/parquet/cli/json/AvroJsonReader.java    |  22 +-
 .../java/org/apache/parquet/cli/util/Schemas.java  |  10 +-
 parquet-cli/src/main/resources/META-INF/LICENSE    |   5 +-
 .../org/apache/parquet/cli/BaseCommandTest.java    | 100 ++++
 .../apache/parquet/cli/commands/AvroFileTest.java} |  34 +-
 .../apache/parquet/cli/commands/CSVFileTest.java   |  48 +-
 .../cli/commands/CSVSchemaCommandTest.java}        |  22 +-
 .../parquet/cli/commands/CatCommandTest.java}      |  21 +-
 .../cli/commands/CheckParquet251CommandTest.java}  |  21 +-
 .../cli/commands/ConvertCSVCommandTest.java        |  32 +-
 .../parquet/cli/commands/ConvertCommandTest.java   |  32 +-
 .../org/apache/parquet/cli/commands/FileTest.java  |  57 +++
 .../parquet/cli/commands/ParquetFileTest.java      | 101 ++++
 .../cli/commands/ParquetMetadataCommandTest.java}  |  21 +-
 .../parquet/cli/commands/SchemaCommandTest.java}   |  21 +-
 .../parquet/cli/commands/ShowColumnIndexTest.java} |  21 +-
 .../cli/commands/ShowDictionaryCommandTest.java    |  30 +-
 .../cli/commands/ShowPagesCommandTest.java}        |  21 +-
 .../parquet/cli/commands/ToAvroCommandTest.java}   |  16 +-
 parquet-column/pom.xml                             |  11 -
 .../apache/parquet/column/ColumnWriteStore.java    |  12 +-
 .../apache/parquet/column/ParquetProperties.java   |  27 +-
 .../parquet/column/impl/ColumnReadStoreImpl.java   |   2 +-
 .../parquet/column/impl/ColumnWriteStoreBase.java  |   5 +
 .../parquet/column/impl/ColumnWriterBase.java      |   3 +
 .../apache/parquet/column/impl/ColumnWriterV2.java |   3 +-
 .../org/apache/parquet/column/page/DataPageV1.java |   5 +-
 .../org/apache/parquet/column/page/DataPageV2.java |   9 +-
 .../apache/parquet/column/page/DictionaryPage.java |   3 +-
 .../java/org/apache/parquet/column/page/Page.java  |  16 +
 .../rle/RunLengthBitPackingHybridValuesWriter.java |   3 +-
 .../filter2/predicate/UserDefinedPredicate.java    |  15 +
 .../column/columnindex/ColumnIndexBuilder.java     |   4 +-
 .../filter2/columnindex/ColumnIndexFilter.java     |   4 +-
 .../org/apache/parquet/io/MessageColumnIO.java     |   7 +
 .../java/org/apache/parquet/io/api/Binary.java     |  30 +-
 .../java/org/apache/parquet/schema/GroupType.java  |   3 -
 .../org/apache/parquet/schema/PrimitiveType.java   |   3 +-
 .../main/java/org/apache/parquet/schema/Type.java  |  23 +
 .../column/values/dictionary/TestDictionary.java   |   7 +-
 .../org/apache/parquet/schema/TestMessageType.java |  17 +-
 .../apache/parquet/schema/TestRepetitionType.java  |  28 +-
 .../main/java/org/apache/parquet/Closeables.java   |   8 +-
 .../src/main/java/org/apache/parquet/Files.java    |   5 +-
 .../src/main/java/org/apache/parquet/Ints.java     |   2 +
 .../java/org/apache/parquet/bytes/BytesUtils.java  |   3 +
 parquet-encoding/pom.xml                           |   6 -
 parquet-format-structures/pom.xml                  |  13 +-
 ...crementallyUpdatedFilterPredicateGenerator.java |   2 +-
 parquet-hadoop/pom.xml                             |  15 +-
 .../java/org/apache/parquet/HadoopReadOptions.java |  15 +-
 .../org/apache/parquet/ParquetReadOptions.java     |  30 +-
 .../filter2/dictionarylevel/DictionaryFilter.java  |   9 +-
 .../filter2/statisticslevel/StatisticsFilter.java  |   8 +-
 .../format/converter/ParquetMetadataConverter.java |  51 +-
 .../parquet/hadoop/ColumnChunkPageReadStore.java   |  28 +-
 .../parquet/hadoop/ColumnChunkPageWriteStore.java  |  41 +-
 .../hadoop/InternalParquetRecordWriter.java        |   2 +-
 .../apache/parquet/hadoop/ParquetFileReader.java   | 156 +++---
 .../apache/parquet/hadoop/ParquetFileWriter.java   | 180 ++-----
 .../apache/parquet/hadoop/ParquetInputFormat.java  |   5 +
 .../apache/parquet/hadoop/ParquetOutputFormat.java |  37 +-
 .../org/apache/parquet/hadoop/ParquetReader.java   |  10 +
 .../org/apache/parquet/hadoop/ParquetWriter.java   |  24 +-
 .../org/apache/parquet/hadoop/codec/CleanUtil.java | 112 ++++
 .../parquet/hadoop/codec/SnappyCompressor.java     |   7 +-
 .../parquet/hadoop/codec/SnappyDecompressor.java   |  13 +-
 .../hadoop/example/ExampleParquetWriter.java       |  16 +
 .../parquet/hadoop/metadata/ParquetMetadata.java   |  13 +-
 .../apache/parquet/hadoop/util/BlocksCombiner.java | 106 ----
 .../parquet/hadoop/util/HiddenFileFilter.java      |   8 +-
 .../parquet/hadoop/util/SerializationUtil.java     |  57 +--
 .../dictionarylevel/DictionaryFilterTest.java      |  17 +-
 .../hadoop/TestColumnChunkPageWriteStore.java      |   1 +
 .../parquet/hadoop/TestColumnIndexFiltering.java   |   4 +-
 .../parquet/hadoop/TestDataPageV1Checksums.java    | 563 +++++++++++++++++++++
 .../hadoop/TestInputOutputFormatWithPadding.java   |   9 +-
 .../parquet/hadoop/TestParquetFileWriter.java      |   4 +
 .../apache/parquet/hadoop/TestParquetWriter.java   |  43 +-
 .../hadoop/TestParquetWriterMergeBlocks.java       | 280 ----------
 .../hadoop/example/TestInputOutputFormat.java      |  15 +-
 parquet-jackson/README.md                          |  11 +-
 parquet-jackson/pom.xml                            |   8 +-
 parquet-pig/pom.xml                                |   4 +-
 .../parquet/pig/summary/StringSummaryData.java     |  10 +-
 .../org/apache/parquet/pig/summary/Summary.java    |   4 -
 .../apache/parquet/pig/summary/SummaryData.java    |  16 +-
 .../apache/parquet/pig/summary/TestSummary.java    |   4 +-
 parquet-protobuf/pom.xml                           |   2 +-
 parquet-scala/pom.xml                              |   4 +-
 parquet-scrooge/pom.xml                            |  12 +-
 parquet-thrift/pom.xml                             |   4 +-
 .../parquet/thrift/struct/CompatibilityRunner.java |   2 +-
 .../org/apache/parquet/thrift/struct/JSON.java     |  11 +-
 .../apache/parquet/thrift/struct/ThriftField.java  |   4 +-
 .../apache/parquet/thrift/struct/ThriftType.java   |  15 +-
 .../parquet/thrift/TestThriftRecordConverter.java  |   8 +-
 .../apache/parquet/tools/command/DumpCommand.java  |  18 +
 .../apache/parquet/tools/command/MergeCommand.java |  75 +--
 .../parquet/tools/json/JsonRecordFormatter.java    |   2 +-
 .../apache/parquet/tools/read/SimpleMapRecord.java |   2 +-
 .../apache/parquet/tools/read/SimpleRecord.java    |   4 +-
 .../parquet/tools/read/SimpleRecordConverter.java  |   1 +
 parquet-tools/src/main/resources/META-INF/LICENSE  |   9 -
 .../tools/read/TestJsonRecordFormatter.java        |   2 +-
 .../tools/read/TestSimpleRecordConverter.java      | 139 +++++
 pom.xml                                            |  19 +-
 148 files changed, 3629 insertions(+), 1881 deletions(-)
 create mode 100644 .github/PULL_REQUEST_TEMPLATE.md
 create mode 100644 parquet-avro/src/test/resources/nested_array.avsc
 copy .editorconfig => parquet-benchmarks/run_checksums.sh (68%)
 mode change 100644 => 100755
 create mode 100644 parquet-benchmarks/src/main/java/org/apache/parquet/benchmarks/NestedNullWritingBenchmarks.java
 create mode 100644 parquet-benchmarks/src/main/java/org/apache/parquet/benchmarks/PageChecksumDataGenerator.java
 create mode 100644 parquet-benchmarks/src/main/java/org/apache/parquet/benchmarks/PageChecksumReadBenchmarks.java
 create mode 100644 parquet-benchmarks/src/main/java/org/apache/parquet/benchmarks/PageChecksumWriteBenchmarks.java
 create mode 100644 parquet-cli/src/test/java/org/apache/parquet/cli/BaseCommandTest.java
 copy parquet-cli/src/{main/java/org/apache/parquet/cli/util/Formats.java => test/java/org/apache/parquet/cli/commands/AvroFileTest.java} (58%)
 copy parquet-common/src/main/java/org/apache/parquet/Files.java => parquet-cli/src/test/java/org/apache/parquet/cli/commands/CSVFileTest.java (50%)
 copy parquet-cli/src/{main/java/org/apache/parquet/cli/util/RuntimeIOException.java => test/java/org/apache/parquet/cli/commands/CSVSchemaCommandTest.java} (59%)
 copy parquet-cli/src/{main/java/org/apache/parquet/cli/util/RuntimeIOException.java => test/java/org/apache/parquet/cli/commands/CatCommandTest.java} (61%)
 copy parquet-cli/src/{main/java/org/apache/parquet/cli/util/RuntimeIOException.java => test/java/org/apache/parquet/cli/commands/CheckParquet251CommandTest.java} (59%)
 copy parquet-avro/src/test/java/org/apache/parquet/avro/TestAvroDataSupplier.java => parquet-cli/src/test/java/org/apache/parquet/cli/commands/ConvertCSVCommandTest.java (58%)
 copy parquet-avro/src/test/java/org/apache/parquet/avro/TestAvroDataSupplier.java => parquet-cli/src/test/java/org/apache/parquet/cli/commands/ConvertCommandTest.java (59%)
 create mode 100644 parquet-cli/src/test/java/org/apache/parquet/cli/commands/FileTest.java
 create mode 100644 parquet-cli/src/test/java/org/apache/parquet/cli/commands/ParquetFileTest.java
 copy parquet-cli/src/{main/java/org/apache/parquet/cli/util/RuntimeIOException.java => test/java/org/apache/parquet/cli/commands/ParquetMetadataCommandTest.java} (59%)
 copy parquet-cli/src/{main/java/org/apache/parquet/cli/util/RuntimeIOException.java => test/java/org/apache/parquet/cli/commands/SchemaCommandTest.java} (61%)
 copy parquet-cli/src/{main/java/org/apache/parquet/cli/util/RuntimeIOException.java => test/java/org/apache/parquet/cli/commands/ShowColumnIndexTest.java} (59%)
 copy parquet-avro/src/test/java/org/apache/parquet/avro/TestAvroDataSupplier.java => parquet-cli/src/test/java/org/apache/parquet/cli/commands/ShowDictionaryCommandTest.java (59%)
 copy parquet-cli/src/{main/java/org/apache/parquet/cli/util/RuntimeIOException.java => test/java/org/apache/parquet/cli/commands/ShowPagesCommandTest.java} (60%)
 copy parquet-cli/src/{main/java/org/apache/parquet/cli/util/RuntimeIOException.java => test/java/org/apache/parquet/cli/commands/ToAvroCommandTest.java} (73%)
 copy parquet-encoding/src/test/java/org/apache/parquet/bytes/TestBytesInput.java => parquet-column/src/test/java/org/apache/parquet/schema/TestRepetitionType.java (59%)
 create mode 100644 parquet-hadoop/src/main/java/org/apache/parquet/hadoop/codec/CleanUtil.java
 delete mode 100644 parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/BlocksCombiner.java
 create mode 100644 parquet-hadoop/src/test/java/org/apache/parquet/hadoop/TestDataPageV1Checksums.java
 delete mode 100644 parquet-hadoop/src/test/java/org/apache/parquet/hadoop/TestParquetWriterMergeBlocks.java
 create mode 100644 parquet-tools/src/test/java/org/apache/parquet/tools/read/TestSimpleRecordConverter.java

