parquet-commits mailing list archives

From w...@apache.org
Subject parquet-cpp git commit: PARQUET-760: Store correct encoding in fallback data pages
Date Tue, 01 Nov 2016 01:25:29 GMT
Repository: parquet-cpp
Updated Branches:
  refs/heads/master 9a0407e68 -> 69db1a835


PARQUET-760: Store correct encoding in fallback data pages

Author: Uwe L. Korn <uwelk@xhochy.com>

Closes #182 from xhochy/PARQUET-760 and squashes the following commits:

1791506 [Uwe L. Korn] PARQUET-760: Store correct encoding in fallback data pages


Project: http://git-wip-us.apache.org/repos/asf/parquet-cpp/repo
Commit: http://git-wip-us.apache.org/repos/asf/parquet-cpp/commit/69db1a83
Tree: http://git-wip-us.apache.org/repos/asf/parquet-cpp/tree/69db1a83
Diff: http://git-wip-us.apache.org/repos/asf/parquet-cpp/diff/69db1a83

Branch: refs/heads/master
Commit: 69db1a83556bd9d1d168617406f11d9aaac9ec76
Parents: 9a0407e
Author: Uwe L. Korn <uwelk@xhochy.com>
Authored: Mon Oct 31 21:25:22 2016 -0400
Committer: Wes McKinney <wes.mckinney@twosigma.com>
Committed: Mon Oct 31 21:25:22 2016 -0400

----------------------------------------------------------------------
 src/parquet/column/column-writer-test.cc | 9 +++++----
 src/parquet/column/writer.cc             | 1 +
 2 files changed, 6 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/69db1a83/src/parquet/column/column-writer-test.cc
----------------------------------------------------------------------
diff --git a/src/parquet/column/column-writer-test.cc b/src/parquet/column/column-writer-test.cc
index 745efe7..2269e8f 100644
--- a/src/parquet/column/column-writer-test.cc
+++ b/src/parquet/column/column-writer-test.cc
@@ -381,10 +381,11 @@ TYPED_TEST(TestPrimitiveWriter, RequiredVeryLargeChunk) {
   writer->WriteBatch(this->values_.size(), nullptr, nullptr, this->values_ptr_);
   writer->Close();
 
-  // Just read the first SMALL_SIZE rows to ensure we could read it back in
-  this->ReadColumn();
-  ASSERT_EQ(SMALL_SIZE, this->values_read_);
-  this->values_.resize(SMALL_SIZE);
+  // Read all rows so we are sure that also the non-dictionary pages are read correctly
+  this->SetupValuesOut(VERY_LARGE_SIZE);
+  this->ReadColumnFully();
+  ASSERT_EQ(VERY_LARGE_SIZE, this->values_read_);
+  this->values_.resize(VERY_LARGE_SIZE);
   ASSERT_EQ(this->values_, this->values_out_);
   std::vector<Encoding::type> encodings = this->metadata_encodings();
   // There are 3 encodings (RLE, PLAIN_DICTIONARY, PLAIN) in a fallback case

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/69db1a83/src/parquet/column/writer.cc
----------------------------------------------------------------------
diff --git a/src/parquet/column/writer.cc b/src/parquet/column/writer.cc
index d1c3fe2..92a5e09 100644
--- a/src/parquet/column/writer.cc
+++ b/src/parquet/column/writer.cc
@@ -216,6 +216,7 @@ void TypedColumnWriter<Type>::CheckDictionarySizeLimit() {
     fallback_ = true;
     // Only PLAIN encoding is supported for fallback in V1
     current_encoder_.reset(new PlainEncoder<Type>(descr_, properties_->allocator()));
+    encoding_ = Encoding::PLAIN;
   }
 }
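
To illustrate why the one-line change in writer.cc matters, here is a minimal, self-contained sketch of the fallback logic. It is not the parquet-cpp implementation: SketchColumnWriter, PageHeader, Write, FlushPage and the byte-based limit are hypothetical stand-ins, and only the names fallback_, encoding_ and CheckDictionarySizeLimit mirror the diff. The assumption is that each data page header is stamped with whatever encoding the writer currently has recorded, so the recorded value has to change at the moment the writer falls back to PLAIN.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins; these are not the parquet-cpp classes.
enum class Encoding { PLAIN, PLAIN_DICTIONARY };

struct PageHeader {
  Encoding encoding;  // what a reader would use to decode the page payload
};

class SketchColumnWriter {
 public:
  explicit SketchColumnWriter(std::size_t dict_limit) : dict_limit_(dict_limit) {}

  // Accounts for the dictionary bytes added by a write, then checks the limit.
  void Write(std::size_t dict_bytes) {
    dict_size_ += dict_bytes;
    CheckDictionarySizeLimit();
  }

  // Emits a page header stamped with the currently recorded encoding.
  void FlushPage() {
    // Invariant the fix restores: after fallback, the recorded encoding is PLAIN.
    assert(fallback_ == (encoding_ == Encoding::PLAIN));
    pages_.push_back(PageHeader{encoding_});
  }

  bool fallback() const { return fallback_; }
  const std::vector<PageHeader>& pages() const { return pages_; }

 private:
  void CheckDictionarySizeLimit() {
    if (!fallback_ && dict_size_ > dict_limit_) {
      fallback_ = true;
      // Without this assignment, later pages would still be labelled
      // PLAIN_DICTIONARY even though their payload is PLAIN-encoded.
      encoding_ = Encoding::PLAIN;
    }
  }

  std::size_t dict_limit_;
  std::size_t dict_size_ = 0;
  bool fallback_ = false;
  Encoding encoding_ = Encoding::PLAIN_DICTIONARY;
  std::vector<PageHeader> pages_;
};
```

In this sketch, a page flushed before the limit is exceeded is labelled PLAIN_DICTIONARY and every page flushed afterwards is labelled PLAIN, which is the behaviour the updated test checks by reading the whole column back instead of only the first rows.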
 