parquet-commits mailing list archives

From w...@apache.org
Subject [4/4] parquet-cpp git commit: PARQUET-818: Refactoring to utilize common IO, buffer, memory management abstractions and implementations
Date Fri, 30 Dec 2016 16:36:38 GMT
PARQUET-818: Refactoring to utilize common IO, buffer, memory management abstractions and implementations

This refactoring is a bit of a bloodbath, but I've attempted to preserve as much API backwards compatibility as possible.

Several points:

* Arrow does not use exceptions, so we will need to be very careful to ensure that no `Status` goes unchecked. I've tried to catch most of them, but may have missed some
* parquet-cpp still exposes an abstract file read and write API as before, but it is now easy to pass in an Arrow file handle (e.g. HDFS, OS files, memory maps)
* Custom memory allocators will need to subclass `arrow::MemoryPool` instead. If this becomes onerous for some reason, we can look for alternatives, but it is essentially the same interface as `parquet::MemoryAllocator`
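
The no-unchecked-`Status` discipline in the first point can be sketched as follows. This is a minimal, self-contained illustration using a stand-in `Status` class and a `RETURN_NOT_OK`-style propagation macro (the class and macro here are simplified stand-ins, not the real `arrow::Status` API):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Stand-in for arrow::Status (simplified, for illustration only).
class Status {
 public:
  static Status OK() { return Status(true, ""); }
  static Status IOError(const std::string& msg) { return Status(false, msg); }
  bool ok() const { return ok_; }
  const std::string& message() const { return msg_; }

 private:
  Status(bool ok, std::string msg) : ok_(ok), msg_(std::move(msg)) {}
  bool ok_;
  std::string msg_;
};

// The pattern used throughout such a refactoring: every Status-returning
// call is either checked explicitly or propagated to the caller.
#define RETURN_NOT_OK(s)       \
  do {                         \
    Status _st = (s);          \
    if (!_st.ok()) return _st; \
  } while (0)

Status ReadBytes(bool fail) {
  return fail ? Status::IOError("read failed") : Status::OK();
}

Status ReadAll() {
  RETURN_NOT_OK(ReadBytes(false));  // propagates on failure instead of throwing
  RETURN_NOT_OK(ReadBytes(false));
  return Status::OK();
}
```

A call site that deliberately ignores a returned `Status` is exactly the kind of bug this refactoring has to hunt down, since nothing will throw to surface the error.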

Does not require any upstream changes in Arrow.
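
To illustrate the allocator point above, a custom allocator under the new scheme would subclass `arrow::MemoryPool` roughly as below. The `MemoryPool` base class here is an inlined stand-in (assumed, simplified signatures; the real interface returns `Status` from `Allocate`) so the sketch compiles on its own:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Stand-in for arrow::MemoryPool (assumed, simplified interface;
// for illustration only).
class MemoryPool {
 public:
  virtual ~MemoryPool() = default;
  virtual bool Allocate(int64_t size, uint8_t** out) = 0;
  virtual void Free(uint8_t* buffer, int64_t size) = 0;
  virtual int64_t bytes_allocated() const = 0;
};

// A tracking allocator, analogous to what a parquet::MemoryAllocator
// subclass would have done before this change.
class TrackingPool : public MemoryPool {
 public:
  bool Allocate(int64_t size, uint8_t** out) override {
    *out = static_cast<uint8_t*>(std::malloc(static_cast<size_t>(size)));
    if (*out == nullptr) return false;
    bytes_ += size;  // account for the outstanding allocation
    return true;
  }
  void Free(uint8_t* buffer, int64_t size) override {
    std::free(buffer);
    bytes_ -= size;
  }
  int64_t bytes_allocated() const override { return bytes_; }

 private:
  int64_t bytes_ = 0;
};
```

Since the interface is essentially the same as `parquet::MemoryAllocator`, porting an existing allocator is mostly a matter of changing the base class and return types.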

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes #210 from wesm/arrow-consolidation and squashes the following commits:

ef81084 [Wes McKinney] Configurable Arrow linkage. Slight .travis.yml cleaning
50b44f0 [Wes McKinney] Make some const refs
8438f86 [Wes McKinney] Revert ParquetFileReader::Open to use std::unique_ptr<RandomAccessFile>
671d981 [Wes McKinney] Actually tee output to console
ca8df13 [Wes McKinney] Do not hide test output from travis logs
f516115 [Wes McKinney] Add public link libs to dependencies to avoid race conditions with external projects
414c75f [Wes McKinney] README cleanups
be1acb5 [Wes McKinney] Move thirdparty ep's / setup to separate cmake module
46342ea [Wes McKinney] Remove unneeded ParquetAllocator interface, cleaning
b546f08 [Wes McKinney] Use MemoryAllocator alias within parquet core
8c1226d [Wes McKinney] Add Arrow to list of third party deps. Needs to be added to thirdparty
f9d8a2a [Wes McKinney] Check some unchecked Statuses
0d04820 [Wes McKinney] Fix benchmark builds. Do not fail in benchmarks if gtest.h is included due to <tr1/tuple> issue
ee312af [Wes McKinney] cpplint
6a05cd9 [Wes McKinney] Update installed header files
8d962f1 [Wes McKinney] Build and unit tests pass again
c82e2b4 [Wes McKinney] More refactoring
6ec5b71 [Wes McKinney] Re-expose original abstract IO interfaces, add Arrow subclasses that wrap input
c320c95 [Wes McKinney] clang-format
f10080c [Wes McKinney] Fix missed include
6ade22f [Wes McKinney] First cut refactoring, not fully compiling yet


Project: http://git-wip-us.apache.org/repos/asf/parquet-cpp/repo
Commit: http://git-wip-us.apache.org/repos/asf/parquet-cpp/commit/2154e873
Tree: http://git-wip-us.apache.org/repos/asf/parquet-cpp/tree/2154e873
Diff: http://git-wip-us.apache.org/repos/asf/parquet-cpp/diff/2154e873

Branch: refs/heads/master
Commit: 2154e873d5aa7280314189a2683fb1e12a590c02
Parents: 1c4012d
Author: Wes McKinney <wes.mckinney@twosigma.com>
Authored: Fri Dec 30 11:36:05 2016 -0500
Committer: Wes McKinney <wes.mckinney@twosigma.com>
Committed: Fri Dec 30 11:36:05 2016 -0500

----------------------------------------------------------------------
 .travis.yml                                     |   9 -
 CMakeLists.txt                                  | 492 +++--------------
 README.md                                       |  30 +-
 build-support/run-test.sh                       |   5 +-
 ci/before_script_travis.sh                      |  15 +
 cmake_modules/ThirdpartyToolchain.cmake         | 369 +++++++++++++
 examples/reader-writer.cc                       |   7 +-
 src/parquet/api/io.h                            |   5 +-
 src/parquet/arrow/CMakeLists.txt                |   5 -
 src/parquet/arrow/arrow-io-test.cc              | 140 -----
 .../arrow/arrow-reader-writer-benchmark.cc      |   6 +-
 src/parquet/arrow/arrow-reader-writer-test.cc   |  14 +-
 src/parquet/arrow/io.cc                         | 127 -----
 src/parquet/arrow/io.h                          | 101 ----
 src/parquet/arrow/reader.cc                     |  32 +-
 src/parquet/arrow/reader.h                      |   3 +-
 src/parquet/arrow/schema.cc                     |   1 -
 src/parquet/arrow/utils.h                       |  54 --
 src/parquet/arrow/writer.cc                     |   6 +-
 src/parquet/column/column-io-benchmark.cc       |  14 +-
 src/parquet/column/column-reader-test.cc        |   2 +-
 src/parquet/column/column-writer-test.cc        |   3 +-
 src/parquet/column/level-benchmark.cc           |   8 +-
 src/parquet/column/page.h                       |   2 +-
 src/parquet/column/properties.h                 |   5 +-
 src/parquet/column/reader.h                     |  17 +-
 src/parquet/column/scanner.cc                   |   1 +
 src/parquet/column/scanner.h                    |  10 +-
 src/parquet/column/statistics-test.cc           |  14 +-
 src/parquet/column/statistics.cc                |  35 +-
 src/parquet/column/statistics.h                 |  26 +-
 src/parquet/column/test-util.h                  |  10 +-
 src/parquet/column/writer.cc                    |  17 +-
 src/parquet/column/writer.h                     |   6 +-
 src/parquet/encodings/decoder.h                 |   2 +-
 src/parquet/encodings/delta-bit-pack-encoding.h |  19 +-
 src/parquet/encodings/dictionary-encoding.h     |  37 +-
 src/parquet/encodings/encoder.h                 |   3 +-
 src/parquet/encodings/encoding-benchmark.cc     |  18 +-
 src/parquet/encodings/encoding-test.cc          |  12 +-
 src/parquet/encodings/plain-encoding.h          |  32 +-
 src/parquet/file/file-deserialize-test.cc       |  11 +-
 src/parquet/file/file-serialize-test.cc         |   8 +-
 src/parquet/file/metadata.cc                    |   1 +
 src/parquet/file/metadata.h                     |   2 +-
 src/parquet/file/reader-internal.cc             |  36 +-
 src/parquet/file/reader-internal.h              |   8 +-
 src/parquet/file/reader.cc                      |  32 +-
 src/parquet/file/reader.h                       |  24 +-
 src/parquet/file/writer-internal.cc             |  14 +-
 src/parquet/file/writer-internal.h              |   6 +-
 src/parquet/file/writer.cc                      |  14 +-
 src/parquet/file/writer.h                       |  10 +-
 src/parquet/reader-test.cc                      |  24 +-
 src/parquet/thrift/util.h                       |   2 +-
 src/parquet/util/CMakeLists.txt                 |  11 +-
 src/parquet/util/buffer-test.cc                 |  65 ---
 src/parquet/util/buffer.cc                      | 123 -----
 src/parquet/util/buffer.h                       | 149 -----
 src/parquet/util/input-output-test.cc           | 244 ---------
 src/parquet/util/input.cc                       | 285 ----------
 src/parquet/util/input.h                        | 211 -------
 src/parquet/util/mem-allocator-test.cc          |  67 ---
 src/parquet/util/mem-allocator.cc               |  61 ---
 src/parquet/util/mem-allocator.h                |  59 --
 src/parquet/util/mem-pool-test.cc               | 247 ---------
 src/parquet/util/mem-pool.cc                    | 264 ---------
 src/parquet/util/mem-pool.h                     | 179 ------
 src/parquet/util/memory-test.cc                 | 385 +++++++++++++
 src/parquet/util/memory.cc                      | 543 +++++++++++++++++++
 src/parquet/util/memory.h                       | 440 +++++++++++++++
 src/parquet/util/output.cc                      | 118 ----
 src/parquet/util/output.h                       | 107 ----
 src/parquet/util/rle-encoding.h                 |   2 +-
 74 files changed, 2168 insertions(+), 3298 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/.travis.yml
----------------------------------------------------------------------
diff --git a/.travis.yml b/.travis.yml
index 5ca6de4..14df7f3 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -20,16 +20,9 @@ addons:
     - pkg-config
 matrix:
   fast_finish: true
-  allow_failures:
-  - env: PARQUET_TEST_GROUP=packaging
   include:
   - compiler: gcc
     os: linux
-    before_script:
-    - source $TRAVIS_BUILD_DIR/ci/before_script_travis.sh
-    - cmake -DCMAKE_CXX_FLAGS="-Werror" -DPARQUET_TEST_MEMCHECK=ON -DPARQUET_BUILD_BENCHMARKS=ON
-      -DPARQUET_ARROW=ON -DPARQUET_GENERATE_COVERAGE=1 $TRAVIS_BUILD_DIR
-    - export PARQUET_TEST_DATA=$TRAVIS_BUILD_DIR/data
   - compiler: clang
     os: linux
   - os: osx
@@ -62,8 +55,6 @@ before_install:
 
 before_script:
 - source $TRAVIS_BUILD_DIR/ci/before_script_travis.sh
-- cmake -DCMAKE_CXX_FLAGS="-Werror" -DPARQUET_ARROW=ON $TRAVIS_BUILD_DIR
-- export PARQUET_TEST_DATA=$TRAVIS_BUILD_DIR/data
 
 script:
 - $TRAVIS_BUILD_DIR/ci/travis_script_cpp.sh

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 79e43f5..ac9d515 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -52,13 +52,6 @@ if(APPLE)
   set(CMAKE_MACOSX_RPATH 1)
 endif()
 
-option(PARQUET_BUILD_SHARED
-    "Build the shared version of libparquet"
-    ON)
-option(PARQUET_BUILD_STATIC
-    "Build the static version of libparquet"
-    ON)
-
 # if no build build type is specified, default to debug builds
 if (NOT CMAKE_BUILD_TYPE)
   set(CMAKE_BUILD_TYPE Debug)
@@ -69,6 +62,14 @@ string (TOLOWER ${CMAKE_BUILD_TYPE} BUILD_SUBDIR_NAME)
 
 # Top level cmake file, set options
 if ("${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_CURRENT_SOURCE_DIR}")
+  option(PARQUET_BUILD_SHARED
+    "Build the shared version of libparquet"
+    ON)
+  option(PARQUET_BUILD_STATIC
+    "Build the static version of libparquet. Always ON if building unit tests"
+    ON)
+  set(PARQUET_ARROW_LINKAGE "shared" CACHE STRING
+    "Libraries to link for Apache Arrow. static|shared (default shared)")
   option(PARQUET_USE_SSE
     "Build with SSE4 optimizations"
     OFF)
@@ -87,6 +88,13 @@ if ("${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_CURRENT_SOURCE_DIR}")
   option(PARQUET_ARROW
     "Build the Arrow support"
     OFF)
+  option(PARQUET_ZLIB_VENDORED
+    "Build our own zlib (some libz.a aren't configured for static linking)"
+    ON)
+endif()
+
+if (PARQUET_BUILD_TESTS OR PARQUET_BUILD_EXECUTABLES OR PARQUET_BUILD_BENCHMARKS)
+  set(PARQUET_BUILD_STATIC ON)
 endif()
 
 # If build in-source, create the latest symlink. If build out-of-source, which is
@@ -106,6 +114,17 @@ else()
   set(BUILD_OUTPUT_ROOT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/${BUILD_SUBDIR_NAME}")
 endif()
 
+# where to put generated archives (.a files)
+set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
+set(ARCHIVE_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
+
+# where to put generated libraries (.so files)
+set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
+set(LIBRARY_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
+
+# where to put generated binaries
+set(EXECUTABLE_OUTPUT_PATH "${BUILD_OUTPUT_ROOT_DIRECTORY}")
+
 ############################################################
 # Benchmarking
 ############################################################
@@ -133,6 +152,17 @@ function(ADD_PARQUET_BENCHMARK REL_BENCHMARK_NAME)
     # This benchmark has a corresponding .cc file, set it up as an executable.
     set(BENCHMARK_PATH "${EXECUTABLE_OUTPUT_PATH}/${BENCHMARK_NAME}")
     add_executable(${BENCHMARK_NAME} "${REL_BENCHMARK_NAME}.cc")
+
+    if(APPLE)
+      # On OS X / Thrift >= 0.9.2, tr1/tuple.h is not in libc++
+      SET_TARGET_PROPERTIES(${TEST_NAME} PROPERTIES COMPILE_FLAGS
+        -DGTEST_USE_OWN_TR1_TUPLE=1)
+    else()
+      # Linux, for Thrift >= 0.9.2
+      SET_TARGET_PROPERTIES(${TEST_NAME} PROPERTIES COMPILE_FLAGS
+        -DGTEST_USE_OWN_TR1_TUPLE=0)
+    endif()
+
     target_link_libraries(${BENCHMARK_NAME} ${PARQUET_BENCHMARK_LINK_LIBS})
     add_dependencies(runbenchmark ${BENCHMARK_NAME})
     set(NO_COLOR "--color_print=false")
@@ -194,12 +224,7 @@ function(ADD_PARQUET_TEST REL_TEST_NAME)
       set(TEST_LINK_LIBS ${PARQUET_TEST_SHARED_LINK_LIBS})
     endif()
   else()
-    if(NOT PARQUET_BUILD_STATIC)
-      # Skip this test if we are not building the static library
-      return()
-    else()
-      set(TEST_LINK_LIBS ${PARQUET_TEST_LINK_LIBS})
-    endif()
+    set(TEST_LINK_LIBS ${PARQUET_TEST_LINK_LIBS})
   endif()
 
   get_filename_component(TEST_NAME ${REL_TEST_NAME} NAME_WE)
@@ -268,316 +293,12 @@ enable_testing()
 # Dependencies
 ############################################################
 
-set(GTEST_VERSION "1.7.0")
-set(GBENCHMARK_VERSION "1.0.0")
-set(SNAPPY_VERSION "1.1.3")
-set(THRIFT_VERSION "0.9.1")
-
-# find boost headers and libs
-set(Boost_DEBUG TRUE)
-set(Boost_USE_MULTITHREADED ON)
-find_package(Boost REQUIRED)
-include_directories(SYSTEM ${Boost_INCLUDE_DIRS})
-set(LIBS ${LIBS} ${Boost_LIBRARIES})
-message(STATUS "Boost include dir: " ${Boost_INCLUDE_DIRS})
-message(STATUS "Boost libraries: " ${Boost_LIBRARIES})
-
-# find thrift headers and libs
-find_package(Thrift)
-
-if (NOT THRIFT_FOUND)
-  if (APPLE)
-      message(FATAL_ERROR "thrift compilation under OSX is not currently supported.")
-  endif()
-
-  set(THRIFT_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/thrift_ep/src/thrift_ep-install")
-  set(THRIFT_HOME "${THRIFT_PREFIX}")
-  set(THRIFT_INCLUDE_DIR "${THRIFT_PREFIX}/include")
-  set(THRIFT_STATIC_LIB "${THRIFT_PREFIX}/lib/libthrift.a")
-  set(THRIFT_COMPILER "${THRIFT_PREFIX}/bin/thrift")
-  set(THRIFT_VENDORED 1)
-
-  if (CMAKE_VERSION VERSION_GREATER "3.2")
-    # BUILD_BYPRODUCTS is a 3.2+ feature
-    ExternalProject_Add(thrift_ep
-      CONFIGURE_COMMAND ./configure "CXXFLAGS=-fPIC" --without-qt4 --without-c_glib --without-csharp --without-java --without-erlang --without-nodejs --without-lua --without-python --without-perl --without-php --without-php_extension --without-ruby --without-haskell --without-go --without-d --with-cpp "--prefix=${THRIFT_PREFIX}"
-      BUILD_IN_SOURCE 1
-      # This is needed for 0.9.1 and can be removed for 0.9.3 again
-      BUILD_COMMAND make clean
-      INSTALL_COMMAND make install
-      INSTALL_DIR ${THRIFT_PREFIX}
-      URL "http://archive.apache.org/dist/thrift/${THRIFT_VERSION}/thrift-${THRIFT_VERSION}.tar.gz"
-      BUILD_BYPRODUCTS "${THRIFT_STATIC_LIB}" "${THRIFT_COMPILER}"
-      )
-  else()
-    ExternalProject_Add(thrift_ep
-      CONFIGURE_COMMAND ./configure "CXXFLAGS=-fPIC" --without-qt4 --without-c_glib --without-csharp --without-java --without-erlang --without-nodejs --without-lua --without-python --without-perl --without-php --without-php_extension --without-ruby --without-haskell --without-go --without-d --with-cpp "--prefix=${THRIFT_PREFIX}"
-      BUILD_IN_SOURCE 1
-      # This is needed for 0.9.1 and can be removed for 0.9.3 again
-      BUILD_COMMAND make clean
-      INSTALL_COMMAND make install
-      INSTALL_DIR ${THRIFT_PREFIX}
-      URL "http://archive.apache.org/dist/thrift/${THRIFT_VERSION}/thrift-${THRIFT_VERSION}.tar.gz"
-      )
-  endif()
-    set(THRIFT_VENDORED 1)
-else()
-    set(THRIFT_VENDORED 0)
-endif()
-
-include_directories(SYSTEM ${THRIFT_INCLUDE_DIR} ${THRIFT_INCLUDE_DIR}/thrift)
-message(STATUS "Thrift include dir: ${THRIFT_INCLUDE_DIR}")
-message(STATUS "Thrift static library: ${THRIFT_STATIC_LIB}")
-message(STATUS "Thrift compiler: ${THRIFT_COMPILER}")
-add_library(thriftstatic STATIC IMPORTED)
-set_target_properties(thriftstatic PROPERTIES IMPORTED_LOCATION ${THRIFT_STATIC_LIB})
-
-if (THRIFT_VENDORED)
-  add_dependencies(thriftstatic thrift_ep)
-endif()
-
-## Snappy
-find_package(Snappy)
-if (NOT SNAPPY_FOUND)
-  set(SNAPPY_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/snappy_ep/src/snappy_ep-install")
-  set(SNAPPY_HOME "${SNAPPY_PREFIX}")
-  set(SNAPPY_INCLUDE_DIR "${SNAPPY_PREFIX}/include")
-  set(SNAPPY_STATIC_LIB "${SNAPPY_PREFIX}/lib/libsnappy.a")
-  set(SNAPPY_VENDORED 1)
-
-  if (CMAKE_VERSION VERSION_GREATER "3.2")
-    # BUILD_BYPRODUCTS is a 3.2+ feature
-    ExternalProject_Add(snappy_ep
-      CONFIGURE_COMMAND ./configure --with-pic "--prefix=${SNAPPY_PREFIX}"
-      BUILD_IN_SOURCE 1
-      BUILD_COMMAND ${MAKE}
-      INSTALL_DIR ${SNAPPY_PREFIX}
-      URL "https://github.com/google/snappy/releases/download/${SNAPPY_VERSION}/snappy-${SNAPPY_VERSION}.tar.gz"
-      BUILD_BYPRODUCTS "${SNAPPY_STATIC_LIB}"
-      )
-  else()
-    ExternalProject_Add(snappy_ep
-      CONFIGURE_COMMAND ./configure --with-pic "--prefix=${SNAPPY_PREFIX}"
-      BUILD_IN_SOURCE 1
-      BUILD_COMMAND ${MAKE}
-      INSTALL_DIR ${SNAPPY_PREFIX}
-      URL "https://github.com/google/snappy/releases/download/${SNAPPY_VERSION}/snappy-${SNAPPY_VERSION}.tar.gz"
-      )
-  endif()
-else()
-    set(SNAPPY_VENDORED 0)
-endif()
-
-include_directories(SYSTEM ${SNAPPY_INCLUDE_DIR})
-add_library(snappystatic STATIC IMPORTED)
-set_target_properties(snappystatic PROPERTIES IMPORTED_LOCATION ${SNAPPY_STATIC_LIB})
-
-if (SNAPPY_VENDORED)
-  add_dependencies(snappystatic snappy_ep)
-endif()
-
-## Brotli
-find_package(Brotli)
-if (NOT BROTLI_FOUND)
-  set(BROTLI_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/brotli_ep/src/brotli_ep-install")
-  set(BROTLI_HOME "${BROTLI_PREFIX}")
-  set(BROTLI_INCLUDE_DIR "${BROTLI_PREFIX}/include")
-  set(BROTLI_LIBRARY_ENC "${BROTLI_PREFIX}/lib/${CMAKE_LIBRARY_ARCHITECTURE}/libbrotlienc.a")
-  set(BROTLI_LIBRARY_DEC "${BROTLI_PREFIX}/lib/${CMAKE_LIBRARY_ARCHITECTURE}/libbrotlidec.a")
-  set(BROTLI_LIBRARY_COMMON "${BROTLI_PREFIX}/lib/${CMAKE_LIBRARY_ARCHITECTURE}/libbrotlicommon.a")
-  set(BROTLI_VENDORED 1)
-  set(BROTLI_CMAKE_ARGS -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
-                        -DCMAKE_INSTALL_PREFIX=${BROTLI_PREFIX}
-                        -DCMAKE_INSTALL_LIBDIR=lib/${CMAKE_LIBRARY_ARCHITECTURE}
-                        -DBUILD_SHARED_LIBS=OFF)
-
-  if (CMAKE_VERSION VERSION_GREATER "3.2")
-    # BUILD_BYPRODUCTS is a 3.2+ feature
-    ExternalProject_Add(brotli_ep
-      GIT_REPOSITORY https://github.com/google/brotli.git
-      GIT_TAG 5db62dcc9d386579609540cdf8869e95ad334bbd
-      BUILD_BYPRODUCTS "${BROTLI_LIBRARY_ENC}" "${BROTLI_LIBRARY_DEC}" "${BROTLI_LIBRARY_COMMON}"
-      CMAKE_ARGS ${BROTLI_CMAKE_ARGS})
-  else()
-    ExternalProject_Add(brotli_ep
-      GIT_REPOSITORY https://github.com/google/brotli.git
-      GIT_TAG 5db62dcc9d386579609540cdf8869e95ad334bbd
-      CMAKE_ARGS ${BROTLI_CMAKE_ARGS})
-  endif()
-else()
-  set(BROTLI_VENDORED 0)
-endif()
-
-include_directories(SYSTEM ${BROTLI_INCLUDE_DIR})
-add_library(brotlistatic_enc STATIC IMPORTED)
-set_target_properties(brotlistatic_enc PROPERTIES IMPORTED_LOCATION ${BROTLI_LIBRARY_ENC})
-add_library(brotlistatic_dec STATIC IMPORTED)
-set_target_properties(brotlistatic_dec PROPERTIES IMPORTED_LOCATION ${BROTLI_LIBRARY_DEC})
-add_library(brotlistatic_common STATIC IMPORTED)
-set_target_properties(brotlistatic_common PROPERTIES IMPORTED_LOCATION ${BROTLI_LIBRARY_COMMON})
-
-if (BROTLI_VENDORED)
-  add_dependencies(brotlistatic_enc brotli_ep)
-  add_dependencies(brotlistatic_dec brotli_ep)
-  add_dependencies(brotlistatic_common brotli_ep)
-endif()
-
-## ZLIB
-# For now: Always build zlib so that the shared lib has -fPIC
-# find_package(ZLIB)
-
-if (NOT ZLIB_FOUND)
-  set(ZLIB_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/zlib_ep/src/zlib_ep-install")
-  set(ZLIB_HOME "${ZLIB_PREFIX}")
-  set(ZLIB_INCLUDE_DIR "${ZLIB_PREFIX}/include")
-  set(ZLIB_STATIC_LIB "${ZLIB_PREFIX}/lib/libz.a")
-  set(ZLIB_VENDORED 1)
-  set(ZLIB_CMAKE_ARGS -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
-                      -DCMAKE_INSTALL_PREFIX=${ZLIB_PREFIX}
-                      -DCMAKE_C_FLAGS=-fPIC
-                      -DBUILD_SHARED_LIBS=OFF)
-
-  if (CMAKE_VERSION VERSION_GREATER "3.2")
-    # BUILD_BYPRODUCTS is a 3.2+ feature
-    ExternalProject_Add(zlib_ep
-      URL "http://zlib.net/zlib-1.2.8.tar.gz"
-      BUILD_BYPRODUCTS "${ZLIB_STATIC_LIB}"
-      CMAKE_ARGS ${ZLIB_CMAKE_ARGS})
-  else()
-    ExternalProject_Add(zlib_ep
-      URL "http://zlib.net/zlib-1.2.8.tar.gz"
-      CMAKE_ARGS ${ZLIB_CMAKE_ARGS})
-  endif()
-else()
-    set(ZLIB_VENDORED 0)
-endif()
-
-include_directories(SYSTEM ${ZLIB_INCLUDE_DIRS})
-add_library(zlibstatic STATIC IMPORTED)
-set_target_properties(zlibstatic PROPERTIES IMPORTED_LOCATION ${ZLIB_STATIC_LIB})
-
-if (ZLIB_VENDORED)
-  add_dependencies(zlibstatic zlib_ep)
-endif()
-
-## GTest
-if(PARQUET_BUILD_TESTS)
-  add_custom_target(unittest ctest -L unittest)
-
-  if("$ENV{GTEST_HOME}" STREQUAL "")
-    if(APPLE)
-      set(GTEST_CMAKE_CXX_FLAGS "-fPIC -std=c++11 -stdlib=libc++ -DGTEST_USE_OWN_TR1_TUPLE=1 -Wno-unused-value -Wno-ignored-attributes")
-    else()
-      set(GTEST_CMAKE_CXX_FLAGS "-fPIC")
-    endif()
-
-    set(GTEST_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/googletest_ep-prefix/src/googletest_ep")
-    set(GTEST_INCLUDE_DIR "${GTEST_PREFIX}/include")
-    set(GTEST_STATIC_LIB "${GTEST_PREFIX}/${CMAKE_CFG_INTDIR}/${CMAKE_STATIC_LIBRARY_PREFIX}gtest${CMAKE_STATIC_LIBRARY_SUFFIX}")
-    set(GTEST_VENDORED 1)
-
-    if (CMAKE_VERSION VERSION_GREATER "3.2")
-      # BUILD_BYPRODUCTS is a 3.2+ feature
-      ExternalProject_Add(googletest_ep
-        URL "https://github.com/google/googletest/archive/release-${GTEST_VERSION}.tar.gz"
-        CMAKE_ARGS -DCMAKE_CXX_FLAGS=${GTEST_CMAKE_CXX_FLAGS} -Dgtest_force_shared_crt=ON
-        # googletest doesn't define install rules, so just build in the
-        # source dir and don't try to install.  See its README for
-        # details.
-        BUILD_IN_SOURCE 1
-        BUILD_BYPRODUCTS "${GTEST_STATIC_LIB}"
-        INSTALL_COMMAND "")
-    else()
-      ExternalProject_Add(googletest_ep
-        URL "https://github.com/google/googletest/archive/release-${GTEST_VERSION}.tar.gz"
-        CMAKE_ARGS -DCMAKE_CXX_FLAGS=${GTEST_CMAKE_CXX_FLAGS} -Dgtest_force_shared_crt=ON
-        # googletest doesn't define install rules, so just build in the
-        # source dir and don't try to install.  See its README for
-        # details.
-        BUILD_IN_SOURCE 1
-        INSTALL_COMMAND "")
-    endif()
-  else()
-    find_package(GTest REQUIRED)
-    set(GTEST_VENDORED 0)
-  endif()
-
-  message(STATUS "GTest include dir: ${GTEST_INCLUDE_DIR}")
-  message(STATUS "GTest static library: ${GTEST_STATIC_LIB}")
-  include_directories(SYSTEM ${GTEST_INCLUDE_DIR})
-  add_library(gtest STATIC IMPORTED)
-  set_target_properties(gtest PROPERTIES IMPORTED_LOCATION ${GTEST_STATIC_LIB})
-
-  if(GTEST_VENDORED)
-    add_dependencies(gtest googletest_ep)
-  endif()
-endif()
-
-## Google Benchmark
-if ("$ENV{GBENCHMARK_HOME}" STREQUAL "")
-  set(GBENCHMARK_HOME ${THIRDPARTY_DIR}/installed)
-endif()
-
-if(PARQUET_BUILD_BENCHMARKS)
-  add_custom_target(runbenchmark ctest -L benchmark)
-
-  if("$ENV{GBENCHMARK_HOME}" STREQUAL "")
-    if(APPLE)
-      set(GBENCHMARK_CMAKE_CXX_FLAGS "-std=c++11 -stdlib=libc++")
-    else()
-      set(GBENCHMARK_CMAKE_CXX_FLAGS "--std=c++11")
-    endif()
-
-    set(GBENCHMARK_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/gbenchmark_ep/src/gbenchmark_ep-install")
-    set(GBENCHMARK_INCLUDE_DIR "${GBENCHMARK_PREFIX}/include")
-    set(GBENCHMARK_STATIC_LIB "${GBENCHMARK_PREFIX}/lib/${CMAKE_STATIC_LIBRARY_PREFIX}benchmark${CMAKE_STATIC_LIBRARY_SUFFIX}")
-    set(GBENCHMARK_VENDORED 1)
-    set(GBENCHMARK_CMAKE_ARGS
-          "-DCMAKE_BUILD_TYPE=Release"
-          "-DCMAKE_INSTALL_PREFIX:PATH=${GBENCHMARK_PREFIX}"
-          "-DCMAKE_CXX_FLAGS=-fPIC ${GBENCHMARK_CMAKE_CXX_FLAGS}")
-    if (CMAKE_VERSION VERSION_GREATER "3.2")
-      # BUILD_BYPRODUCTS is a 3.2+ feature
-      ExternalProject_Add(gbenchmark_ep
-        URL "https://github.com/google/benchmark/archive/v${GBENCHMARK_VERSION}.tar.gz"
-        BUILD_BYPRODUCTS "${GBENCHMARK_STATIC_LIB}"
-        CMAKE_ARGS ${GBENCHMARK_CMAKE_ARGS})
-    else()
-      ExternalProject_Add(gbenchmark_ep
-        URL "https://github.com/google/benchmark/archive/v${GBENCHMARK_VERSION}.tar.gz"
-        CMAKE_ARGS ${GBENCHMARK_CMAKE_ARGS})
-    endif()
-  else()
-    find_package(GBenchmark REQUIRED)
-    set(GBENCHMARK_VENDORED 0)
-  endif()
-
-  message(STATUS "GBenchmark include dir: ${GBENCHMARK_INCLUDE_DIR}")
-  message(STATUS "GBenchmark static library: ${GBENCHMARK_STATIC_LIB}")
-  include_directories(SYSTEM ${GBENCHMARK_INCLUDE_DIR})
-  add_library(gbenchmark STATIC IMPORTED)
-  set_target_properties(gbenchmark PROPERTIES IMPORTED_LOCATION ${GBENCHMARK_STATIC_LIB})
-
-  if(GBENCHMARK_VENDORED)
-    add_dependencies(gbenchmark gbenchmark_ep)
-  endif()
-endif()
+include(ThirdpartyToolchain)
 
 # Thrift requires these definitions for some types that we use
 add_definitions(-DHAVE_INTTYPES_H -DHAVE_NETINET_IN_H -DHAVE_NETDB_H)
 add_definitions(-fPIC)
 
-# where to put generated archives (.a files)
-set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
-set(ARCHIVE_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
-
-# where to put generated libraries (.so files)
-set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
-set(LIBRARY_OUTPUT_DIRECTORY "${BUILD_OUTPUT_ROOT_DIRECTORY}")
-
-# where to put generated binaries
-set(EXECUTABLE_OUTPUT_PATH "${BUILD_OUTPUT_ROOT_DIRECTORY}")
-
 #############################################################
 # Compiler flags and release types
 
@@ -692,12 +413,47 @@ if (${CLANG_TIDY_FOUND})
 endif()
 
 #############################################################
+# Code coverage
+
+# Adapted from Apache Kudu (incubating)
+if ("${PARQUET_GENERATE_COVERAGE}")
+  if("${CMAKE_CXX_COMPILER}" MATCHES ".*clang.*")
+    # There appears to be some bugs in clang 3.3 which cause code coverage
+    # to have link errors, not locating the llvm_gcda_* symbols.
+    # This should be fixed in llvm 3.4 with http://llvm.org/viewvc/llvm-project?view=revision&revision=184666
+    message(SEND_ERROR "Cannot currently generate coverage with clang")
+  endif()
+  message(STATUS "Configuring build for gcov")
+  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} --coverage")
+  # For coverage to work properly, we need to use static linkage. Otherwise,
+  # __gcov_flush() doesn't properly flush coverage from every module.
+  # See http://stackoverflow.com/questions/28164543/using-gcov-flush-within-a-library-doesnt-force-the-other-modules-to-yield-gc
+  if(NOT PARQUET_BUILD_STATIC)
+    message(SEND_ERROR "Coverage requires the static lib to be built")
+  endif()
+endif()
+
+#############################################################
+# Apache Arrow linkage
+
+if ("${PARQUET_ARROW_LINKAGE}" STREQUAL "shared")
+  set(ARROW_LINK_LIBS
+    arrow
+    arrow_io)
+else()
+  set(ARROW_LINK_LIBS
+    arrow_static
+    arrow_io_static)
+endif()
+
+#############################################################
 # Test linking
 
 set(PARQUET_MIN_TEST_LIBS
   parquet_test_main)
 
 set(PARQUET_TEST_LINK_LIBS ${PARQUET_MIN_TEST_LIBS}
+  ${ARROW_LINK_LIBS}
   parquet_static)
 
 set(PARQUET_TEST_SHARED_LINK_LIBS ${PARQUET_MIN_TEST_LIBS}
@@ -716,27 +472,6 @@ else()
       parquet_shared)
 endif()
 
-#############################################################
-# Code coverage
-
-# Adapted from Apache Kudu (incubating)
-if ("${PARQUET_GENERATE_COVERAGE}")
-  if("${CMAKE_CXX_COMPILER}" MATCHES ".*clang.*")
-    # There appears to be some bugs in clang 3.3 which cause code coverage
-    # to have link errors, not locating the llvm_gcda_* symbols.
-    # This should be fixed in llvm 3.4 with http://llvm.org/viewvc/llvm-project?view=revision&revision=184666
-    message(SEND_ERROR "Cannot currently generate coverage with clang")
-  endif()
-  message(STATUS "Configuring build for gcov")
-  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} --coverage")
-  # For coverage to work properly, we need to use static linkage. Otherwise,
-  # __gcov_flush() doesn't properly flush coverage from every module.
-  # See http://stackoverflow.com/questions/28164543/using-gcov-flush-within-a-library-doesnt-force-the-other-modules-to-yield-gc
-  if(NOT PARQUET_BUILD_STATIC)
-    message(SEND_ERROR "Coverage requires the static lib to be built")
-  endif()
-endif()
-
 ############################################################
 # Library config
 
@@ -767,18 +502,14 @@ set(LIBPARQUET_SRCS
   src/parquet/schema/printer.cc
   src/parquet/schema/types.cc
 
-  src/parquet/util/buffer.cc
   src/parquet/util/cpu-info.cc
-  src/parquet/util/input.cc
-  src/parquet/util/mem-allocator.cc
-  src/parquet/util/mem-pool.cc
-  src/parquet/util/output.cc
+  src/parquet/util/memory.cc
 )
 
 set(LIBPARQUET_LINK_LIBS
 )
 
-set(LIBPARQUET_PRIVATE_LINK_LIBS
+set(BUNDLED_STATIC_LIBS
   parquet_thrift
   brotlistatic_dec
   brotlistatic_enc
@@ -788,13 +519,21 @@ set(LIBPARQUET_PRIVATE_LINK_LIBS
   zlibstatic
 )
 
+# Shared library linked libs
+set(LIBPARQUET_PRIVATE_LINK_LIBS
+  ${ARROW_LINK_LIBS}
+  ${BUNDLED_STATIC_LIBS}
+)
+
 add_library(parquet_objlib OBJECT
   ${LIBPARQUET_SRCS}
 )
 
 # Although we don't link parquet_objlib against anything, we need it to depend
 # on these libs as we may generate their headers via ExternalProject_Add
-add_dependencies(parquet_objlib ${LIBPARQUET_PRIVATE_LINK_LIBS})
+add_dependencies(parquet_objlib
+  ${LIBPARQUET_LINK_LIBS}
+  ${LIBPARQUET_PRIVATE_LINK_LIBS})
 
 set_property(TARGET parquet_objlib PROPERTY POSITION_INDEPENDENT_CODE 1)
 
@@ -854,67 +593,8 @@ add_subdirectory(benchmarks)
 add_subdirectory(examples)
 add_subdirectory(tools)
 
-# Arrow
+# Arrow adapter
 if (PARQUET_ARROW)
-
-  set(ARROW_VERSION "4733ee876e1fddb8032fce1dc9e486d68904fbea")
-
-  find_package(Arrow)
-
-  if (NOT ARROW_FOUND)
-    set(ARROW_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/arrow_ep/src/arrow_ep-install")
-    set(ARROW_HOME "${ARROW_PREFIX}")
-    set(ARROW_INCLUDE_DIR "${ARROW_PREFIX}/include")
-    set(ARROW_SHARED_LIB "${ARROW_PREFIX}/lib/libarrow${CMAKE_SHARED_LIBRARY_SUFFIX}")
-    set(ARROW_IO_SHARED_LIB "${ARROW_PREFIX}/lib/libarrow_io${CMAKE_SHARED_LIBRARY_SUFFIX}")
-    set(ARROW_STATIC_LIB "${ARROW_PREFIX}/lib/libarrow.a")
-    set(ARROW_IO_STATIC_LIB "${ARROW_PREFIX}/lib/libarrow_io.a")
-    set(ARROW_CMAKE_ARGS -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
-                         -DCMAKE_INSTALL_PREFIX=${ARROW_PREFIX}
-                         -DARROW_BUILD_TESTS=OFF
-                         -DARROW_HDFS=ON)
-
-    if (CMAKE_VERSION VERSION_GREATER "3.2")
-      # BUILD_BYPRODUCTS is a 3.2+ feature
-      ExternalProject_Add(arrow_ep
-        GIT_REPOSITORY https://github.com/apache/arrow.git
-        GIT_TAG ${ARROW_VERSION}
-        BUILD_BYPRODUCTS "${ARROW_SHARED_LIB}" "${ARROW_IO_SHARED_LIB}" "${ARROW_IO_STATIC_LIB}" "${ARROW_STATIC_LIB}"
-        # With CMake 3.7.0 there is a SOURCE_SUBDIR argument which we can use
-        # to specify that the CMakeLists.txt of Arrow is located in cpp/
-        #
-        # See https://gitlab.kitware.com/cmake/cmake/commit/a8345d65f359d75efb057d22976cfb92b4d477cf
-        CONFIGURE_COMMAND "${CMAKE_COMMAND}" ${ARROW_CMAKE_ARGS} ${CMAKE_CURRENT_BINARY_DIR}/arrow_ep-prefix/src/arrow_ep/cpp
-        CMAKE_ARGS ${ARROW_CMAKE_ARGS})
-    else()
-        ExternalProject_Add(arrow_ep
-        GIT_REPOSITORY https://github.com/apache/arrow.git
-        GIT_TAG ${ARROW_VERSION}
-        CONFIGURE_COMMAND "${CMAKE_COMMAND}" ${ARROW_CMAKE_ARGS} ${CMAKE_CURRENT_BINARY_DIR}/arrow_ep-prefix/src/arrow_ep/cpp
-        CMAKE_ARGS ${ARROW_CMAKE_ARGS})
-    endif()
-    set(ARROW_VENDORED 1)
-  else()
-    set(ARROW_VENDORED 0)
-  endif()
-
-  include_directories(SYSTEM ${ARROW_INCLUDE_DIR})
-  add_library(arrow SHARED IMPORTED)
-  set_target_properties(arrow PROPERTIES IMPORTED_LOCATION ${ARROW_SHARED_LIB})
-  add_library(arrow_io SHARED IMPORTED)
-  set_target_properties(arrow_io PROPERTIES IMPORTED_LOCATION ${ARROW_IO_SHARED_LIB})
-  add_library(arrow_static STATIC IMPORTED)
-  set_target_properties(arrow_static PROPERTIES IMPORTED_LOCATION ${ARROW_STATIC_LIB})
-  add_library(arrow_io_static STATIC IMPORTED)
-  set_target_properties(arrow_io_static PROPERTIES IMPORTED_LOCATION ${ARROW_IO_STATIC_LIB})
-
-  if (ARROW_VENDORED)
-    add_dependencies(arrow arrow_ep)
-    add_dependencies(arrow_io arrow_ep)
-    add_dependencies(arrow_static arrow_ep)
-    add_dependencies(arrow_io_static arrow_ep)
-  endif()
-
   add_subdirectory(src/parquet/arrow)
 endif()
 

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index 54613c1..6970d95 100644
--- a/README.md
+++ b/README.md
@@ -33,14 +33,15 @@
 
 ## Third Party Dependencies
 
+- Apache Arrow (memory management, built-in IO, optional Array adapters)
 - snappy
 - zlib
-- thrift 0.7+ [install instructions](https://thrift.apache.org/docs/install/)
+- Thrift 0.7+ [install instructions](https://thrift.apache.org/docs/install/)
 - googletest 1.7.0 (cannot be installed with package managers)
 - Google Benchmark (only required if building benchmarks)
 
-You can either install these dependencies via your package manager, otherwise
-they will be build automatically as part of the build.
+You can either install these dependencies separately, or they will be
+built automatically as part of the build.
 
 Note that thrift will not be built inside the project on macOS. Instead you
 should install it via homebrew:
@@ -53,10 +54,17 @@ brew install thrift
 
 - `cmake .`
 
-  - You can customize dependent library locations through various environment variables:
-    - THRIFT_HOME customizes the thrift installed location.
-    - SNAPPY_HOME customizes the snappy installed location.
+  - You can customize build dependency locations through various environment variables:
+    - ARROW_HOME customizes the Apache Arrow installed location.
+    - THRIFT_HOME customizes the Apache Thrift (C++ libraries and compiler)
+      installed location.
+    - SNAPPY_HOME customizes the Snappy installed location.
     - ZLIB_HOME customizes the zlib installed location.
+    - BROTLI_HOME customizes the Brotli installed location.
+    - GTEST_HOME customizes the googletest installed location (if you are
+      building the unit tests).
+    - GBENCHMARK_HOME customizes the Google Benchmark installed location (if
+      you are building the benchmarks).
 
 - `make`
 
@@ -71,6 +79,13 @@ For release-level builds (enable optimizations and disable debugging), pass
 
 Incremental builds can be done afterwards with just `make`.
 
+## Using with Apache Arrow
+
+Arrow provides some of the memory management and IO interfaces that we use in
+parquet-cpp. By default, Parquet links to Arrow's shared libraries. If you wish
+to statically link the Arrow symbols instead, pass
+`-DPARQUET_ARROW_LINKAGE=static`.
+
 ## Testing
 
 This library uses Google's `googletest` unit test framework. After building
@@ -87,9 +102,6 @@ to the `data` directory in the source checkout, for example:
 export PARQUET_TEST_DATA=`pwd`/data
 ```
 
-If you run `source setup_build_env.sh` it will set this variable automatically,
-but you may also wish to put it in your `.bashrc` or somewhere else.
-
 See `ctest --help` for configuration details about ctest. On GNU/Linux systems,
 you can use valgrind with ctest to look for memory leaks:
 

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/build-support/run-test.sh
----------------------------------------------------------------------
diff --git a/build-support/run-test.sh b/build-support/run-test.sh
index 7c3b570..e8f39cd 100755
--- a/build-support/run-test.sh
+++ b/build-support/run-test.sh
@@ -108,8 +108,9 @@ function run_test() {
   rm -f $XMLFILE
 
   $TEST_EXECUTABLE "$@" 2>&1 \
+    | c++filt \
     | $ROOT/build-support/stacktrace_addr2line.pl $TEST_EXECUTABLE \
-    | $pipe_cmd > $LOGFILE
+    | $pipe_cmd 2>&1 | tee $LOGFILE
   STATUS=$?
 
   # TSAN doesn't always exit with a non-zero exit code due to a bug:
@@ -186,7 +187,7 @@ for ATTEMPT_NUMBER in $(seq 1 $TEST_EXECUTION_ATTEMPTS) ; do
     done
   fi
   echo "Running $TEST_NAME, redirecting output into $LOGFILE" \
-    "(attempt ${ATTEMPT_NUMBER}/$TEST_EXECUTION_ATTEMPTS)"
+       "(attempt ${ATTEMPT_NUMBER}/$TEST_EXECUTION_ATTEMPTS)"
   if [ $RUN_TYPE = "test" ]; then
     run_test $*
   else

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/ci/before_script_travis.sh
----------------------------------------------------------------------
diff --git a/ci/before_script_travis.sh b/ci/before_script_travis.sh
index 75679fb..2bd880b 100755
--- a/ci/before_script_travis.sh
+++ b/ci/before_script_travis.sh
@@ -22,3 +22,18 @@ else
 fi
 
 export PARQUET_TEST_DATA=$TRAVIS_BUILD_DIR/data
+
+if [ $TRAVIS_OS_NAME == "linux" ]; then
+    cmake -DCMAKE_CXX_FLAGS="-Werror" \
+          -DPARQUET_TEST_MEMCHECK=ON \
+          -DPARQUET_BUILD_BENCHMARKS=ON \
+          -DPARQUET_ARROW=ON \
+          -DPARQUET_ARROW_LINKAGE=static \
+          -DPARQUET_GENERATE_COVERAGE=1 \
+          $TRAVIS_BUILD_DIR
+else
+    cmake -DCMAKE_CXX_FLAGS="-Werror" \
+          -DPARQUET_ARROW=ON \
+          -DPARQUET_ARROW_LINKAGE=static \
+          $TRAVIS_BUILD_DIR
+fi

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/cmake_modules/ThirdpartyToolchain.cmake
----------------------------------------------------------------------
diff --git a/cmake_modules/ThirdpartyToolchain.cmake b/cmake_modules/ThirdpartyToolchain.cmake
new file mode 100644
index 0000000..ea0b583
--- /dev/null
+++ b/cmake_modules/ThirdpartyToolchain.cmake
@@ -0,0 +1,369 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set(GTEST_VERSION "1.7.0")
+set(GBENCHMARK_VERSION "1.0.0")
+set(SNAPPY_VERSION "1.1.3")
+set(THRIFT_VERSION "0.9.1")
+
+# Brotli 0.5.2 does not install headers/libraries yet, but 0.6.0.dev does
+set(BROTLI_VERSION "5db62dcc9d386579609540cdf8869e95ad334bbd")
+set(ARROW_VERSION "e15c6a0b3c05b5b42c204f34369d127182450ca0")
+
+# find boost headers and libs
+set(Boost_DEBUG TRUE)
+set(Boost_USE_MULTITHREADED ON)
+find_package(Boost REQUIRED)
+include_directories(SYSTEM ${Boost_INCLUDE_DIRS})
+set(LIBS ${LIBS} ${Boost_LIBRARIES})
+message(STATUS "Boost include dir: " ${Boost_INCLUDE_DIRS})
+message(STATUS "Boost libraries: " ${Boost_LIBRARIES})
+
+# find thrift headers and libs
+find_package(Thrift)
+
+if (NOT THRIFT_FOUND)
+  if (APPLE)
+      message(FATAL_ERROR "thrift compilation under OSX is not currently supported.")
+  endif()
+
+  set(THRIFT_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/thrift_ep/src/thrift_ep-install")
+  set(THRIFT_HOME "${THRIFT_PREFIX}")
+  set(THRIFT_INCLUDE_DIR "${THRIFT_PREFIX}/include")
+  set(THRIFT_STATIC_LIB "${THRIFT_PREFIX}/lib/libthrift.a")
+  set(THRIFT_COMPILER "${THRIFT_PREFIX}/bin/thrift")
+  set(THRIFT_VENDORED 1)
+
+  if (CMAKE_VERSION VERSION_GREATER "3.2")
+    # BUILD_BYPRODUCTS is a 3.2+ feature
+    ExternalProject_Add(thrift_ep
+      CONFIGURE_COMMAND ./configure "CXXFLAGS=-fPIC" --without-qt4 --without-c_glib --without-csharp --without-java --without-erlang --without-nodejs --without-lua --without-python --without-perl --without-php --without-php_extension --without-ruby --without-haskell --without-go --without-d --with-cpp "--prefix=${THRIFT_PREFIX}"
+      BUILD_IN_SOURCE 1
+      # This is needed for 0.9.1 and can be removed for 0.9.3 again
+      BUILD_COMMAND make clean
+      INSTALL_COMMAND make install
+      INSTALL_DIR ${THRIFT_PREFIX}
+      URL "http://archive.apache.org/dist/thrift/${THRIFT_VERSION}/thrift-${THRIFT_VERSION}.tar.gz"
+      BUILD_BYPRODUCTS "${THRIFT_STATIC_LIB}" "${THRIFT_COMPILER}"
+      )
+  else()
+    ExternalProject_Add(thrift_ep
+      CONFIGURE_COMMAND ./configure "CXXFLAGS=-fPIC" --without-qt4 --without-c_glib --without-csharp --without-java --without-erlang --without-nodejs --without-lua --without-python --without-perl --without-php --without-php_extension --without-ruby --without-haskell --without-go --without-d --with-cpp "--prefix=${THRIFT_PREFIX}"
+      BUILD_IN_SOURCE 1
+      # This is needed for 0.9.1 and can be removed for 0.9.3 again
+      BUILD_COMMAND make clean
+      INSTALL_COMMAND make install
+      INSTALL_DIR ${THRIFT_PREFIX}
+      URL "http://archive.apache.org/dist/thrift/${THRIFT_VERSION}/thrift-${THRIFT_VERSION}.tar.gz"
+      )
+  endif()
+    set(THRIFT_VENDORED 1)
+else()
+    set(THRIFT_VENDORED 0)
+endif()
+
+include_directories(SYSTEM ${THRIFT_INCLUDE_DIR} ${THRIFT_INCLUDE_DIR}/thrift)
+message(STATUS "Thrift include dir: ${THRIFT_INCLUDE_DIR}")
+message(STATUS "Thrift static library: ${THRIFT_STATIC_LIB}")
+message(STATUS "Thrift compiler: ${THRIFT_COMPILER}")
+add_library(thriftstatic STATIC IMPORTED)
+set_target_properties(thriftstatic PROPERTIES IMPORTED_LOCATION ${THRIFT_STATIC_LIB})
+
+if (THRIFT_VENDORED)
+  add_dependencies(thriftstatic thrift_ep)
+endif()
+
+## Snappy
+find_package(Snappy)
+if (NOT SNAPPY_FOUND)
+  set(SNAPPY_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/snappy_ep/src/snappy_ep-install")
+  set(SNAPPY_HOME "${SNAPPY_PREFIX}")
+  set(SNAPPY_INCLUDE_DIR "${SNAPPY_PREFIX}/include")
+  set(SNAPPY_STATIC_LIB "${SNAPPY_PREFIX}/lib/libsnappy.a")
+  set(SNAPPY_VENDORED 1)
+
+  if (CMAKE_VERSION VERSION_GREATER "3.2")
+    # BUILD_BYPRODUCTS is a 3.2+ feature
+    ExternalProject_Add(snappy_ep
+      CONFIGURE_COMMAND ./configure --with-pic "--prefix=${SNAPPY_PREFIX}"
+      BUILD_IN_SOURCE 1
+      BUILD_COMMAND ${MAKE}
+      INSTALL_DIR ${SNAPPY_PREFIX}
+      URL "https://github.com/google/snappy/releases/download/${SNAPPY_VERSION}/snappy-${SNAPPY_VERSION}.tar.gz"
+      BUILD_BYPRODUCTS "${SNAPPY_STATIC_LIB}"
+      )
+  else()
+    ExternalProject_Add(snappy_ep
+      CONFIGURE_COMMAND ./configure --with-pic "--prefix=${SNAPPY_PREFIX}"
+      BUILD_IN_SOURCE 1
+      BUILD_COMMAND ${MAKE}
+      INSTALL_DIR ${SNAPPY_PREFIX}
+      URL "https://github.com/google/snappy/releases/download/${SNAPPY_VERSION}/snappy-${SNAPPY_VERSION}.tar.gz"
+      )
+  endif()
+else()
+    set(SNAPPY_VENDORED 0)
+endif()
+
+include_directories(SYSTEM ${SNAPPY_INCLUDE_DIR})
+add_library(snappystatic STATIC IMPORTED)
+set_target_properties(snappystatic PROPERTIES IMPORTED_LOCATION ${SNAPPY_STATIC_LIB})
+
+if (SNAPPY_VENDORED)
+  add_dependencies(snappystatic snappy_ep)
+endif()
+
+## Brotli
+find_package(Brotli)
+if (NOT BROTLI_FOUND)
+  set(BROTLI_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/brotli_ep/src/brotli_ep-install")
+  set(BROTLI_HOME "${BROTLI_PREFIX}")
+  set(BROTLI_INCLUDE_DIR "${BROTLI_PREFIX}/include")
+  set(BROTLI_LIBRARY_ENC "${BROTLI_PREFIX}/lib/${CMAKE_LIBRARY_ARCHITECTURE}/libbrotlienc.a")
+  set(BROTLI_LIBRARY_DEC "${BROTLI_PREFIX}/lib/${CMAKE_LIBRARY_ARCHITECTURE}/libbrotlidec.a")
+  set(BROTLI_LIBRARY_COMMON "${BROTLI_PREFIX}/lib/${CMAKE_LIBRARY_ARCHITECTURE}/libbrotlicommon.a")
+  set(BROTLI_VENDORED 1)
+  set(BROTLI_CMAKE_ARGS -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
+                        -DCMAKE_INSTALL_PREFIX=${BROTLI_PREFIX}
+                        -DCMAKE_INSTALL_LIBDIR=lib/${CMAKE_LIBRARY_ARCHITECTURE}
+                        -DBUILD_SHARED_LIBS=OFF)
+
+  if (CMAKE_VERSION VERSION_GREATER "3.2")
+    # BUILD_BYPRODUCTS is a 3.2+ feature
+    ExternalProject_Add(brotli_ep
+      URL "https://github.com/google/brotli/archive/${BROTLI_VERSION}.tar.gz"
+      BUILD_BYPRODUCTS "${BROTLI_LIBRARY_ENC}" "${BROTLI_LIBRARY_DEC}" "${BROTLI_LIBRARY_COMMON}"
+      CMAKE_ARGS ${BROTLI_CMAKE_ARGS})
+  else()
+    ExternalProject_Add(brotli_ep
+      URL "https://github.com/google/brotli/archive/${BROTLI_VERSION}.tar.gz"
+      CMAKE_ARGS ${BROTLI_CMAKE_ARGS})
+  endif()
+else()
+  set(BROTLI_VENDORED 0)
+endif()
+
+include_directories(SYSTEM ${BROTLI_INCLUDE_DIR})
+add_library(brotlistatic_enc STATIC IMPORTED)
+set_target_properties(brotlistatic_enc PROPERTIES IMPORTED_LOCATION ${BROTLI_LIBRARY_ENC})
+add_library(brotlistatic_dec STATIC IMPORTED)
+set_target_properties(brotlistatic_dec PROPERTIES IMPORTED_LOCATION ${BROTLI_LIBRARY_DEC})
+add_library(brotlistatic_common STATIC IMPORTED)
+set_target_properties(brotlistatic_common PROPERTIES IMPORTED_LOCATION ${BROTLI_LIBRARY_COMMON})
+
+if (BROTLI_VENDORED)
+  add_dependencies(brotlistatic_enc brotli_ep)
+  add_dependencies(brotlistatic_dec brotli_ep)
+  add_dependencies(brotlistatic_common brotli_ep)
+endif()
+
+## ZLIB
+if (NOT PARQUET_ZLIB_VENDORED)
+  find_package(ZLIB)
+endif()
+
+if (NOT ZLIB_FOUND)
+  set(ZLIB_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/zlib_ep/src/zlib_ep-install")
+  set(ZLIB_HOME "${ZLIB_PREFIX}")
+  set(ZLIB_INCLUDE_DIR "${ZLIB_PREFIX}/include")
+  set(ZLIB_STATIC_LIB "${ZLIB_PREFIX}/lib/libz.a")
+  set(ZLIB_VENDORED 1)
+  set(ZLIB_CMAKE_ARGS -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
+                      -DCMAKE_INSTALL_PREFIX=${ZLIB_PREFIX}
+                      -DCMAKE_C_FLAGS=-fPIC
+                      -DBUILD_SHARED_LIBS=OFF)
+
+  if (CMAKE_VERSION VERSION_GREATER "3.2")
+    # BUILD_BYPRODUCTS is a 3.2+ feature
+    ExternalProject_Add(zlib_ep
+      URL "http://zlib.net/zlib-1.2.8.tar.gz"
+      BUILD_BYPRODUCTS "${ZLIB_STATIC_LIB}"
+      CMAKE_ARGS ${ZLIB_CMAKE_ARGS})
+  else()
+    ExternalProject_Add(zlib_ep
+      URL "http://zlib.net/zlib-1.2.8.tar.gz"
+      CMAKE_ARGS ${ZLIB_CMAKE_ARGS})
+  endif()
+else()
+    set(ZLIB_VENDORED 0)
+endif()
+
+include_directories(SYSTEM ${ZLIB_INCLUDE_DIRS})
+add_library(zlibstatic STATIC IMPORTED)
+set_target_properties(zlibstatic PROPERTIES IMPORTED_LOCATION ${ZLIB_STATIC_LIB})
+
+if (ZLIB_VENDORED)
+  add_dependencies(zlibstatic zlib_ep)
+endif()
+
+## GTest
+if(PARQUET_BUILD_TESTS)
+  add_custom_target(unittest ctest -L unittest)
+
+  if("$ENV{GTEST_HOME}" STREQUAL "")
+    if(APPLE)
+      set(GTEST_CMAKE_CXX_FLAGS "-fPIC -std=c++11 -stdlib=libc++ -DGTEST_USE_OWN_TR1_TUPLE=1 -Wno-unused-value -Wno-ignored-attributes")
+    else()
+      set(GTEST_CMAKE_CXX_FLAGS "-fPIC")
+    endif()
+
+    set(GTEST_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/googletest_ep-prefix/src/googletest_ep")
+    set(GTEST_INCLUDE_DIR "${GTEST_PREFIX}/include")
+    set(GTEST_STATIC_LIB "${GTEST_PREFIX}/${CMAKE_CFG_INTDIR}/${CMAKE_STATIC_LIBRARY_PREFIX}gtest${CMAKE_STATIC_LIBRARY_SUFFIX}")
+    set(GTEST_VENDORED 1)
+
+    if (CMAKE_VERSION VERSION_GREATER "3.2")
+      # BUILD_BYPRODUCTS is a 3.2+ feature
+      ExternalProject_Add(googletest_ep
+        URL "https://github.com/google/googletest/archive/release-${GTEST_VERSION}.tar.gz"
+        CMAKE_ARGS -DCMAKE_CXX_FLAGS=${GTEST_CMAKE_CXX_FLAGS} -Dgtest_force_shared_crt=ON
+        # googletest doesn't define install rules, so just build in the
+        # source dir and don't try to install.  See its README for
+        # details.
+        BUILD_IN_SOURCE 1
+        BUILD_BYPRODUCTS "${GTEST_STATIC_LIB}"
+        INSTALL_COMMAND "")
+    else()
+      ExternalProject_Add(googletest_ep
+        URL "https://github.com/google/googletest/archive/release-${GTEST_VERSION}.tar.gz"
+        CMAKE_ARGS -DCMAKE_CXX_FLAGS=${GTEST_CMAKE_CXX_FLAGS} -Dgtest_force_shared_crt=ON
+        # googletest doesn't define install rules, so just build in the
+        # source dir and don't try to install.  See its README for
+        # details.
+        BUILD_IN_SOURCE 1
+        INSTALL_COMMAND "")
+    endif()
+  else()
+    find_package(GTest REQUIRED)
+    set(GTEST_VENDORED 0)
+  endif()
+
+  message(STATUS "GTest include dir: ${GTEST_INCLUDE_DIR}")
+  message(STATUS "GTest static library: ${GTEST_STATIC_LIB}")
+  include_directories(SYSTEM ${GTEST_INCLUDE_DIR})
+  add_library(gtest STATIC IMPORTED)
+  set_target_properties(gtest PROPERTIES IMPORTED_LOCATION ${GTEST_STATIC_LIB})
+
+  if(GTEST_VENDORED)
+    add_dependencies(gtest googletest_ep)
+  endif()
+endif()
+
+## Google Benchmark
+if ("$ENV{GBENCHMARK_HOME}" STREQUAL "")
+  set(GBENCHMARK_HOME ${THIRDPARTY_DIR}/installed)
+endif()
+
+if(PARQUET_BUILD_BENCHMARKS)
+  add_custom_target(runbenchmark ctest -L benchmark)
+
+  if("$ENV{GBENCHMARK_HOME}" STREQUAL "")
+    if(APPLE)
+      set(GBENCHMARK_CMAKE_CXX_FLAGS "-std=c++11 -stdlib=libc++")
+    else()
+      set(GBENCHMARK_CMAKE_CXX_FLAGS "--std=c++11")
+    endif()
+
+    set(GBENCHMARK_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/gbenchmark_ep/src/gbenchmark_ep-install")
+    set(GBENCHMARK_INCLUDE_DIR "${GBENCHMARK_PREFIX}/include")
+    set(GBENCHMARK_STATIC_LIB "${GBENCHMARK_PREFIX}/lib/${CMAKE_STATIC_LIBRARY_PREFIX}benchmark${CMAKE_STATIC_LIBRARY_SUFFIX}")
+    set(GBENCHMARK_VENDORED 1)
+    set(GBENCHMARK_CMAKE_ARGS
+          "-DCMAKE_BUILD_TYPE=Release"
+          "-DCMAKE_INSTALL_PREFIX:PATH=${GBENCHMARK_PREFIX}"
+          "-DCMAKE_CXX_FLAGS=-fPIC ${GBENCHMARK_CMAKE_CXX_FLAGS}")
+    if (CMAKE_VERSION VERSION_GREATER "3.2")
+      # BUILD_BYPRODUCTS is a 3.2+ feature
+      ExternalProject_Add(gbenchmark_ep
+        URL "https://github.com/google/benchmark/archive/v${GBENCHMARK_VERSION}.tar.gz"
+        BUILD_BYPRODUCTS "${GBENCHMARK_STATIC_LIB}"
+        CMAKE_ARGS ${GBENCHMARK_CMAKE_ARGS})
+    else()
+      ExternalProject_Add(gbenchmark_ep
+        URL "https://github.com/google/benchmark/archive/v${GBENCHMARK_VERSION}.tar.gz"
+        CMAKE_ARGS ${GBENCHMARK_CMAKE_ARGS})
+    endif()
+  else()
+    find_package(GBenchmark REQUIRED)
+    set(GBENCHMARK_VENDORED 0)
+  endif()
+
+  message(STATUS "GBenchmark include dir: ${GBENCHMARK_INCLUDE_DIR}")
+  message(STATUS "GBenchmark static library: ${GBENCHMARK_STATIC_LIB}")
+  include_directories(SYSTEM ${GBENCHMARK_INCLUDE_DIR})
+  add_library(gbenchmark STATIC IMPORTED)
+  set_target_properties(gbenchmark PROPERTIES IMPORTED_LOCATION ${GBENCHMARK_STATIC_LIB})
+
+  if(GBENCHMARK_VENDORED)
+    add_dependencies(gbenchmark gbenchmark_ep)
+  endif()
+endif()
+
+## Apache Arrow
+find_package(Arrow)
+if (NOT ARROW_FOUND)
+  set(ARROW_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/arrow_ep/src/arrow_ep-install")
+  set(ARROW_HOME "${ARROW_PREFIX}")
+  set(ARROW_INCLUDE_DIR "${ARROW_PREFIX}/include")
+  set(ARROW_SHARED_LIB "${ARROW_PREFIX}/lib/libarrow${CMAKE_SHARED_LIBRARY_SUFFIX}")
+  set(ARROW_IO_SHARED_LIB "${ARROW_PREFIX}/lib/libarrow_io${CMAKE_SHARED_LIBRARY_SUFFIX}")
+  set(ARROW_STATIC_LIB "${ARROW_PREFIX}/lib/libarrow.a")
+  set(ARROW_IO_STATIC_LIB "${ARROW_PREFIX}/lib/libarrow_io.a")
+  set(ARROW_CMAKE_ARGS -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
+    -DCMAKE_INSTALL_PREFIX=${ARROW_PREFIX}
+    -DARROW_BUILD_TESTS=OFF)
+
+  if (CMAKE_VERSION VERSION_GREATER "3.2")
+    # BUILD_BYPRODUCTS is a 3.2+ feature
+    ExternalProject_Add(arrow_ep
+      GIT_REPOSITORY https://github.com/apache/arrow.git
+      GIT_TAG ${ARROW_VERSION}
+      BUILD_BYPRODUCTS "${ARROW_SHARED_LIB}" "${ARROW_IO_SHARED_LIB}" "${ARROW_IO_STATIC_LIB}" "${ARROW_STATIC_LIB}"
+      # With CMake 3.7.0 there is a SOURCE_SUBDIR argument which we can use
+      # to specify that the CMakeLists.txt of Arrow is located in cpp/
+      #
+      # See https://gitlab.kitware.com/cmake/cmake/commit/a8345d65f359d75efb057d22976cfb92b4d477cf
+      CONFIGURE_COMMAND "${CMAKE_COMMAND}" ${ARROW_CMAKE_ARGS} ${CMAKE_CURRENT_BINARY_DIR}/arrow_ep-prefix/src/arrow_ep/cpp
+      CMAKE_ARGS ${ARROW_CMAKE_ARGS})
+  else()
+    ExternalProject_Add(arrow_ep
+      GIT_REPOSITORY https://github.com/apache/arrow.git
+      GIT_TAG ${ARROW_VERSION}
+      CONFIGURE_COMMAND "${CMAKE_COMMAND}" ${ARROW_CMAKE_ARGS} ${CMAKE_CURRENT_BINARY_DIR}/arrow_ep-prefix/src/arrow_ep/cpp
+      CMAKE_ARGS ${ARROW_CMAKE_ARGS})
+  endif()
+  set(ARROW_VENDORED 1)
+else()
+  set(ARROW_VENDORED 0)
+endif()
+
+include_directories(SYSTEM ${ARROW_INCLUDE_DIR})
+add_library(arrow SHARED IMPORTED)
+set_target_properties(arrow PROPERTIES IMPORTED_LOCATION ${ARROW_SHARED_LIB})
+add_library(arrow_io SHARED IMPORTED)
+set_target_properties(arrow_io PROPERTIES IMPORTED_LOCATION ${ARROW_IO_SHARED_LIB})
+add_library(arrow_static STATIC IMPORTED)
+set_target_properties(arrow_static PROPERTIES IMPORTED_LOCATION ${ARROW_STATIC_LIB})
+add_library(arrow_io_static STATIC IMPORTED)
+set_target_properties(arrow_io_static PROPERTIES IMPORTED_LOCATION ${ARROW_IO_STATIC_LIB})
+
+if (ARROW_VENDORED)
+  add_dependencies(arrow arrow_ep)
+  add_dependencies(arrow_io arrow_ep)
+  add_dependencies(arrow_static arrow_ep)
+  add_dependencies(arrow_io_static arrow_ep)
+endif()

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/examples/reader-writer.cc
----------------------------------------------------------------------
diff --git a/examples/reader-writer.cc b/examples/reader-writer.cc
index cc066ac..0289eed 100644
--- a/examples/reader-writer.cc
+++ b/examples/reader-writer.cc
@@ -21,6 +21,8 @@
 #include <list>
 #include <memory>
 
+#include <arrow/io/file.h>
+
 #include <parquet/api/reader.h>
 #include <parquet/api/writer.h>
 
@@ -101,8 +103,9 @@ int main(int argc, char** argv) {
   // parquet::REPEATED fields require both definition and repetition level values
   try {
     // Create a local file output stream instance.
-    std::shared_ptr<parquet::OutputStream> out_file =
-        std::make_shared<parquet::LocalFileOutputStream>(PARQUET_FILENAME);
+    using FileClass = ::arrow::io::FileOutputStream;
+    std::shared_ptr<FileClass> out_file;
+    PARQUET_THROW_NOT_OK(FileClass::Open(PARQUET_FILENAME, &out_file));
 
     // Setup the parquet schema
     std::shared_ptr<GroupNode> schema = SetupSchema();

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/api/io.h
----------------------------------------------------------------------
diff --git a/src/parquet/api/io.h b/src/parquet/api/io.h
index 683dae2..96d3bc0 100644
--- a/src/parquet/api/io.h
+++ b/src/parquet/api/io.h
@@ -19,9 +19,6 @@
 #define PARQUET_API_IO_H
 
 #include "parquet/exception.h"
-#include "parquet/util/buffer.h"
-#include "parquet/util/input.h"
-#include "parquet/util/mem-allocator.h"
-#include "parquet/util/output.h"
+#include "parquet/util/memory.h"
 
 #endif  // PARQUET_API_IO_H

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/CMakeLists.txt b/src/parquet/arrow/CMakeLists.txt
index 37e4894..20f6670 100644
--- a/src/parquet/arrow/CMakeLists.txt
+++ b/src/parquet/arrow/CMakeLists.txt
@@ -19,7 +19,6 @@
 # parquet_arrow : Arrow <-> Parquet adapter
 
 set(PARQUET_ARROW_SRCS
-  io.cc
   reader.cc
   schema.cc
   writer.cc
@@ -76,16 +75,13 @@ if (PARQUET_BUILD_STATIC)
 endif()
 
 ADD_PARQUET_TEST(arrow-schema-test)
-ADD_PARQUET_TEST(arrow-io-test)
 ADD_PARQUET_TEST(arrow-reader-writer-test)
 
 if (PARQUET_BUILD_STATIC)
   ADD_PARQUET_LINK_LIBRARIES(arrow-schema-test parquet_arrow_static)
-  ADD_PARQUET_LINK_LIBRARIES(arrow-io-test parquet_arrow_static)
   ADD_PARQUET_LINK_LIBRARIES(arrow-reader-writer-test parquet_arrow_static)
 else()
   ADD_PARQUET_LINK_LIBRARIES(arrow-schema-test parquet_arrow_shared)
-  ADD_PARQUET_LINK_LIBRARIES(arrow-io-test parquet_arrow_shared)
   ADD_PARQUET_LINK_LIBRARIES(arrow-reader-writer-test parquet_arrow_shared)
 endif()
 
@@ -100,7 +96,6 @@ endif()
 
 # Headers: top level
 install(FILES
-  io.h
   reader.h
   schema.h
   utils.h

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/arrow-io-test.cc
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/arrow-io-test.cc b/src/parquet/arrow/arrow-io-test.cc
deleted file mode 100644
index 6d76887..0000000
--- a/src/parquet/arrow/arrow-io-test.cc
+++ /dev/null
@@ -1,140 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements.  See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership.  The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License.  You may obtain a copy of the License at
-//
-//   http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied.  See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-#include <cstdint>
-#include <cstdlib>
-#include <memory>
-#include <string>
-
-#include "gtest/gtest.h"
-
-#include "arrow/api.h"
-#include "arrow/io/memory.h"
-#include "arrow/test-util.h"
-
-#include "parquet/api/io.h"
-#include "parquet/arrow/io.h"
-
-using arrow::default_memory_pool;
-using arrow::MemoryPool;
-using arrow::Status;
-
-using ArrowBufferReader = arrow::io::BufferReader;
-
-namespace parquet {
-namespace arrow {
-
-// Allocator tests
-
-TEST(TestParquetAllocator, DefaultCtor) {
-  ParquetAllocator allocator;
-
-  const int buffer_size = 10;
-
-  uint8_t* buffer = nullptr;
-  ASSERT_NO_THROW(buffer = allocator.Malloc(buffer_size););
-
-  // valgrind will complain if we write into nullptr
-  memset(buffer, 0, buffer_size);
-
-  allocator.Free(buffer, buffer_size);
-}
-
-// Pass through to the default memory pool
-class TrackingPool : public MemoryPool {
- public:
-  TrackingPool() : pool_(default_memory_pool()), bytes_allocated_(0) {}
-
-  Status Allocate(int64_t size, uint8_t** out) override {
-    RETURN_NOT_OK(pool_->Allocate(size, out));
-    bytes_allocated_ += size;
-    return Status::OK();
-  }
-
-  void Free(uint8_t* buffer, int64_t size) override {
-    pool_->Free(buffer, size);
-    bytes_allocated_ -= size;
-  }
-
-  int64_t bytes_allocated() const override { return bytes_allocated_; }
-
- private:
-  MemoryPool* pool_;
-  int64_t bytes_allocated_;
-};
-
-TEST(TestParquetAllocator, CustomPool) {
-  TrackingPool pool;
-
-  ParquetAllocator allocator(&pool);
-
-  ASSERT_EQ(&pool, allocator.pool());
-
-  const int buffer_size = 10;
-
-  uint8_t* buffer = nullptr;
-  ASSERT_NO_THROW(buffer = allocator.Malloc(buffer_size););
-
-  ASSERT_EQ(buffer_size, pool.bytes_allocated());
-
-  // valgrind will complain if we write into nullptr
-  memset(buffer, 0, buffer_size);
-
-  allocator.Free(buffer, buffer_size);
-
-  ASSERT_EQ(0, pool.bytes_allocated());
-}
-
-// ----------------------------------------------------------------------
-// Read source tests
-
-TEST(TestParquetReadSource, Basics) {
-  std::string data = "this is the data";
-  auto data_buffer = reinterpret_cast<const uint8_t*>(data.c_str());
-
-  ParquetAllocator allocator(default_memory_pool());
-
-  auto file = std::make_shared<ArrowBufferReader>(data_buffer, data.size());
-  auto source = std::make_shared<ParquetReadSource>(&allocator);
-
-  ASSERT_OK(source->Open(file));
-
-  ASSERT_EQ(0, source->Tell());
-  ASSERT_NO_THROW(source->Seek(5));
-  ASSERT_EQ(5, source->Tell());
-  ASSERT_NO_THROW(source->Seek(0));
-
-  // Seek out of bounds
-  ASSERT_THROW(source->Seek(100), ParquetException);
-
-  uint8_t buffer[50];
-
-  ASSERT_NO_THROW(source->Read(4, buffer));
-  ASSERT_EQ(0, std::memcmp(buffer, "this", 4));
-  ASSERT_EQ(4, source->Tell());
-
-  std::shared_ptr<Buffer> pq_buffer;
-
-  ASSERT_NO_THROW(pq_buffer = source->Read(7));
-
-  auto expected_buffer = std::make_shared<Buffer>(data_buffer + 4, 7);
-
-  ASSERT_TRUE(expected_buffer->Equals(*pq_buffer.get()));
-}
-
-}  // namespace arrow
-}  // namespace parquet

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/arrow-reader-writer-benchmark.cc
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/arrow-reader-writer-benchmark.cc b/src/parquet/arrow/arrow-reader-writer-benchmark.cc
index 89cb486..cf90ebc 100644
--- a/src/parquet/arrow/arrow-reader-writer-benchmark.cc
+++ b/src/parquet/arrow/arrow-reader-writer-benchmark.cc
@@ -23,7 +23,7 @@
 #include "parquet/column/writer.h"
 #include "parquet/file/reader-internal.h"
 #include "parquet/file/writer-internal.h"
-#include "parquet/util/input.h"
+#include "parquet/util/memory.h"
 
 #include "arrow/api.h"
 
@@ -132,8 +132,8 @@ static void BM_ReadColumn(::benchmark::State& state) {
   std::shared_ptr<Buffer> buffer = output->GetBuffer();
 
   while (state.KeepRunning()) {
-    auto reader = ParquetFileReader::Open(
-        std::unique_ptr<RandomAccessSource>(new BufferReader(buffer)));
+    auto reader =
+        ParquetFileReader::Open(std::make_shared<::arrow::io::BufferReader>(buffer));
     FileReader filereader(::arrow::default_memory_pool(), std::move(reader));
     std::shared_ptr<::arrow::Table> table;
     filereader.ReadFlatTable(&table);

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/arrow-reader-writer-test.cc
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/arrow-reader-writer-test.cc b/src/parquet/arrow/arrow-reader-writer-test.cc
index 6d2b0d5..07ddd91 100644
--- a/src/parquet/arrow/arrow-reader-writer-test.cc
+++ b/src/parquet/arrow/arrow-reader-writer-test.cc
@@ -29,14 +29,15 @@
 #include "arrow/test-util.h"
 
 using arrow::Array;
+using arrow::Buffer;
 using arrow::ChunkedArray;
 using arrow::default_memory_pool;
+using arrow::io::BufferReader;
 using arrow::PoolBuffer;
 using arrow::PrimitiveArray;
 using arrow::Status;
 using arrow::Table;
 
-using ParquetBuffer = parquet::Buffer;
 using ParquetType = parquet::Type;
 using parquet::schema::GroupNode;
 using parquet::schema::NodePtr;
@@ -203,9 +204,8 @@ class TestParquetIO : public ::testing::Test {
   }
 
   std::unique_ptr<ParquetFileReader> ReaderFromSink() {
-    std::shared_ptr<ParquetBuffer> buffer = sink_->GetBuffer();
-    std::unique_ptr<RandomAccessSource> source(new BufferReader(buffer));
-    return ParquetFileReader::Open(std::move(source));
+    std::shared_ptr<Buffer> buffer = sink_->GetBuffer();
+    return ParquetFileReader::Open(std::make_shared<BufferReader>(buffer));
   }
 
   void ReadSingleColumnFile(
@@ -357,9 +357,9 @@ TYPED_TEST(TestParquetIO, SingleColumnTableRequiredChunkedWriteArrowIO) {
   ASSERT_OK_NO_THROW(WriteFlatTable(
       table.get(), default_memory_pool(), arrow_sink_, 512, default_writer_properties()));
 
-  std::shared_ptr<ParquetBuffer> pbuffer =
-      std::make_shared<ParquetBuffer>(buffer->data(), buffer->size());
-  std::unique_ptr<RandomAccessSource> source(new BufferReader(pbuffer));
+  auto pbuffer = std::make_shared<Buffer>(buffer->data(), buffer->size());
+
+  auto source = std::make_shared<BufferReader>(pbuffer);
   std::shared_ptr<::arrow::Table> out;
   this->ReadTableFromFile(ParquetFileReader::Open(std::move(source)), &out);
   ASSERT_EQ(1, out->num_columns());

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/io.cc
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/io.cc b/src/parquet/arrow/io.cc
deleted file mode 100644
index 2b1f99d..0000000
--- a/src/parquet/arrow/io.cc
+++ /dev/null
@@ -1,127 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements.  See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership.  The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License.  You may obtain a copy of the License at
-//
-//   http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied.  See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-#include "parquet/arrow/io.h"
-
-#include <cstdint>
-#include <memory>
-
-#include "parquet/api/io.h"
-#include "parquet/arrow/utils.h"
-
-#include "arrow/status.h"
-
-using arrow::Status;
-using arrow::MemoryPool;
-
-// To assist with readability
-using ArrowROFile = arrow::io::ReadableFileInterface;
-
-namespace parquet {
-namespace arrow {
-
-// ----------------------------------------------------------------------
-// ParquetAllocator
-
-ParquetAllocator::ParquetAllocator() : pool_(::arrow::default_memory_pool()) {}
-
-ParquetAllocator::ParquetAllocator(MemoryPool* pool) : pool_(pool) {}
-
-ParquetAllocator::~ParquetAllocator() {}
-
-uint8_t* ParquetAllocator::Malloc(int64_t size) {
-  uint8_t* result;
-  PARQUET_THROW_NOT_OK(pool_->Allocate(size, &result));
-  return result;
-}
-
-void ParquetAllocator::Free(uint8_t* buffer, int64_t size) {
-  // Does not report Status
-  pool_->Free(buffer, size);
-}
-
-// ----------------------------------------------------------------------
-// ParquetReadSource
-
-ParquetReadSource::ParquetReadSource(ParquetAllocator* allocator)
-    : file_(nullptr), allocator_(allocator) {}
-
-Status ParquetReadSource::Open(const std::shared_ptr<ArrowROFile>& file) {
-  int64_t file_size;
-  RETURN_NOT_OK(file->GetSize(&file_size));
-
-  file_ = file;
-  size_ = file_size;
-  return Status::OK();
-}
-
-void ParquetReadSource::Close() {
-  // TODO(wesm): Make this a no-op for now. This leaves Python wrappers for
-  // these classes in a borked state. Probably better to explicitly close.
-
-  // PARQUET_THROW_NOT_OK(file_->Close());
-}
-
-int64_t ParquetReadSource::Tell() const {
-  int64_t position;
-  PARQUET_THROW_NOT_OK(file_->Tell(&position));
-  return position;
-}
-
-void ParquetReadSource::Seek(int64_t position) {
-  PARQUET_THROW_NOT_OK(file_->Seek(position));
-}
-
-int64_t ParquetReadSource::Read(int64_t nbytes, uint8_t* out) {
-  int64_t bytes_read;
-  PARQUET_THROW_NOT_OK(file_->Read(nbytes, &bytes_read, out));
-  return bytes_read;
-}
-
-std::shared_ptr<Buffer> ParquetReadSource::Read(int64_t nbytes) {
-  // TODO(wesm): This code is duplicated from parquet/util/input.cc; suggests
-  // that there should be more code sharing amongst file-like sources
-  auto result = std::make_shared<OwnedMutableBuffer>(0, allocator_);
-  result->Resize(nbytes);
-
-  int64_t bytes_read = Read(nbytes, result->mutable_data());
-  if (bytes_read < nbytes) { result->Resize(bytes_read); }
-  return result;
-}
-
-ParquetWriteSink::ParquetWriteSink(
-    const std::shared_ptr<::arrow::io::OutputStream>& stream)
-    : stream_(stream) {}
-
-ParquetWriteSink::~ParquetWriteSink() {}
-
-void ParquetWriteSink::Close() {
-  PARQUET_THROW_NOT_OK(stream_->Close());
-}
-
-int64_t ParquetWriteSink::Tell() {
-  int64_t position;
-  PARQUET_THROW_NOT_OK(stream_->Tell(&position));
-  return position;
-}
-
-void ParquetWriteSink::Write(const uint8_t* data, int64_t length) {
-  PARQUET_THROW_NOT_OK(stream_->Write(data, length));
-}
-
-}  // namespace arrow
-}  // namespace parquet

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/io.h
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/io.h b/src/parquet/arrow/io.h
deleted file mode 100644
index a1de936..0000000
--- a/src/parquet/arrow/io.h
+++ /dev/null
@@ -1,101 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements.  See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership.  The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License.  You may obtain a copy of the License at
-//
-//   http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied.  See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-// Bridges Arrow's IO interfaces and Parquet-cpp's IO interfaces
-
-#ifndef PARQUET_ARROW_IO_H
-#define PARQUET_ARROW_IO_H
-
-#include <cstdint>
-#include <memory>
-
-#include "parquet/api/io.h"
-
-#include "arrow/io/interfaces.h"
-#include "arrow/memory_pool.h"
-
-namespace parquet {
-
-namespace arrow {
-
-// An implementation of the Parquet MemoryAllocator API that plugs into an
-// existing Arrow memory pool. This way we can direct all allocations to a
-// single place rather than tracking allocations in different locations (for
-// example: without utilizing parquet-cpp's default allocator)
-class PARQUET_EXPORT ParquetAllocator : public MemoryAllocator {
- public:
-  // Uses the default memory pool
-  ParquetAllocator();
-
-  explicit ParquetAllocator(::arrow::MemoryPool* pool);
-  virtual ~ParquetAllocator();
-
-  uint8_t* Malloc(int64_t size) override;
-  void Free(uint8_t* buffer, int64_t size) override;
-
-  void set_pool(::arrow::MemoryPool* pool) { pool_ = pool; }
-
-  ::arrow::MemoryPool* pool() const { return pool_; }
-
- private:
-  ::arrow::MemoryPool* pool_;
-};
-
-class PARQUET_EXPORT ParquetReadSource : public RandomAccessSource {
- public:
-  explicit ParquetReadSource(ParquetAllocator* allocator);
-
-  // We need to ask for the file size on opening the file, and this can fail
-  ::arrow::Status Open(const std::shared_ptr<::arrow::io::ReadableFileInterface>& file);
-
-  void Close() override;
-  int64_t Tell() const override;
-  void Seek(int64_t pos) override;
-  int64_t Read(int64_t nbytes, uint8_t* out) override;
-  std::shared_ptr<Buffer> Read(int64_t nbytes) override;
-
- private:
-  // An Arrow readable file of some kind
-  std::shared_ptr<::arrow::io::ReadableFileInterface> file_;
-
-  // The allocator is required for creating managed buffers
-  ParquetAllocator* allocator_;
-};
-
-class PARQUET_EXPORT ParquetWriteSink : public OutputStream {
- public:
-  explicit ParquetWriteSink(const std::shared_ptr<::arrow::io::OutputStream>& stream);
-
-  virtual ~ParquetWriteSink();
-
-  // Close the output stream
-  void Close() override;
-
-  // Return the current position in the output stream relative to the start
-  int64_t Tell() override;
-
-  // Copy bytes into the output stream
-  void Write(const uint8_t* data, int64_t length) override;
-
- private:
-  std::shared_ptr<::arrow::io::OutputStream> stream_;
-};
-
-}  // namespace arrow
-}  // namespace parquet
-
-#endif  // PARQUET_ARROW_IO_H

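The deleted io.h/io.cc pair existed largely to adapt Arrow's Status-returning `MemoryPool` to parquet-cpp's exception-throwing `MemoryAllocator`. That adapter pattern can be sketched in isolation (toy `Status` and pool types, not the real Arrow classes; the inline throw stands in for `PARQUET_THROW_NOT_OK`):

```cpp
#include <cstdint>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Toy stand-in for arrow::Status.
struct Status {
  bool ok_;
  std::string msg_;
  static Status OK() { return {true, ""}; }
  static Status OutOfMemory(const std::string& m) { return {false, m}; }
  bool ok() const { return ok_; }
};

// Toy stand-in for arrow::MemoryPool: reports failure via Status and
// tracks outstanding bytes so all allocations are visible in one place.
class MemoryPool {
 public:
  Status Allocate(int64_t size, uint8_t** out) {
    *out = static_cast<uint8_t*>(std::malloc(static_cast<size_t>(size)));
    if (*out == nullptr) return Status::OutOfMemory("malloc failed");
    bytes_allocated_ += size;
    return Status::OK();
  }
  void Free(uint8_t* buffer, int64_t size) {
    std::free(buffer);
    bytes_allocated_ -= size;
  }
  int64_t bytes_allocated() const { return bytes_allocated_; }

 private:
  int64_t bytes_allocated_ = 0;
};

// The adapter: exposes the throwing Malloc/Free interface parquet-cpp
// expected, delegating every allocation to the wrapped pool.
class ParquetAllocator {
 public:
  explicit ParquetAllocator(MemoryPool* pool) : pool_(pool) {}

  uint8_t* Malloc(int64_t size) {
    uint8_t* result = nullptr;
    Status s = pool_->Allocate(size, &result);
    if (!s.ok()) throw std::runtime_error(s.msg_);
    return result;
  }

  void Free(uint8_t* buffer, int64_t size) { pool_->Free(buffer, size); }

 private:
  MemoryPool* pool_;
};
```

The refactoring removes this shim because parquet core now takes an `arrow::MemoryPool` directly, so there is no second allocator interface left to adapt to.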
http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/reader.cc
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/reader.cc b/src/parquet/arrow/reader.cc
index 135867c..d1eec05 100644
--- a/src/parquet/arrow/reader.cc
+++ b/src/parquet/arrow/reader.cc
@@ -23,9 +23,7 @@
 #include <string>
 #include <vector>
 
-#include "parquet/arrow/io.h"
 #include "parquet/arrow/schema.h"
-#include "parquet/arrow/utils.h"
 
 #include "arrow/api.h"
 #include "arrow/type_traits.h"
@@ -40,7 +38,6 @@ using arrow::Status;
 using arrow::Table;
 
 // Help reduce verbosity
-using ParquetRAS = parquet::RandomAccessSource;
 using ParquetReader = parquet::ParquetFileReader;
 
 namespace parquet {
@@ -193,16 +190,11 @@ FileReader::~FileReader() {}
 
 // Static ctor
 Status OpenFile(const std::shared_ptr<::arrow::io::ReadableFileInterface>& file,
-    ParquetAllocator* allocator, std::unique_ptr<FileReader>* reader) {
-  std::unique_ptr<ParquetReadSource> source(new ParquetReadSource(allocator));
-  RETURN_NOT_OK(source->Open(file));
-
+    MemoryPool* allocator, std::unique_ptr<FileReader>* reader) {
   // TODO(wesm): reader properties
   std::unique_ptr<ParquetReader> pq_reader;
-  PARQUET_CATCH_NOT_OK(pq_reader = ParquetReader::Open(std::move(source)));
-
-  // Use the same memory pool as the ParquetAllocator
-  reader->reset(new FileReader(allocator->pool(), std::move(pq_reader)));
+  PARQUET_CATCH_NOT_OK(pq_reader = ParquetReader::Open(file));
+  reader->reset(new FileReader(allocator, std::move(pq_reader)));
   return Status::OK();
 }
 
@@ -352,18 +344,18 @@ Status FlatColumnReader::Impl::TypedReadBatch(
   RETURN_NOT_OK(InitDataBuffer<ArrowType>(batch_size));
   valid_bits_idx_ = 0;
   if (descr_->max_definition_level() > 0) {
-    valid_bits_buffer_ = std::make_shared<PoolBuffer>(pool_);
     int valid_bits_size = ::arrow::BitUtil::CeilByte(batch_size) / 8;
-    valid_bits_buffer_->Resize(valid_bits_size);
+    valid_bits_buffer_ = std::make_shared<PoolBuffer>(pool_);
+    RETURN_NOT_OK(valid_bits_buffer_->Resize(valid_bits_size));
     valid_bits_ptr_ = valid_bits_buffer_->mutable_data();
     memset(valid_bits_ptr_, 0, valid_bits_size);
     null_count_ = 0;
   }
 
   while ((values_to_read > 0) && column_reader_) {
-    values_buffer_.Resize(values_to_read * sizeof(ParquetCType));
+    RETURN_NOT_OK(values_buffer_.Resize(values_to_read * sizeof(ParquetCType)));
     if (descr_->max_definition_level() > 0) {
-      def_levels_buffer_.Resize(values_to_read * sizeof(int16_t));
+      RETURN_NOT_OK(def_levels_buffer_.Resize(values_to_read * sizeof(int16_t)));
     }
     auto reader = dynamic_cast<TypedColumnReader<ParquetType>*>(column_reader_.get());
     int64_t values_read;
@@ -427,16 +419,16 @@ Status FlatColumnReader::Impl::TypedReadBatch<::arrow::BooleanType, BooleanType>
   if (descr_->max_definition_level() > 0) {
     valid_bits_buffer_ = std::make_shared<PoolBuffer>(pool_);
     int valid_bits_size = ::arrow::BitUtil::CeilByte(batch_size) / 8;
-    valid_bits_buffer_->Resize(valid_bits_size);
+    RETURN_NOT_OK(valid_bits_buffer_->Resize(valid_bits_size));
     valid_bits_ptr_ = valid_bits_buffer_->mutable_data();
     memset(valid_bits_ptr_, 0, valid_bits_size);
     null_count_ = 0;
   }
 
   while ((values_to_read > 0) && column_reader_) {
-    values_buffer_.Resize(values_to_read * sizeof(bool));
+    RETURN_NOT_OK(values_buffer_.Resize(values_to_read * sizeof(bool)));
     if (descr_->max_definition_level() > 0) {
-      def_levels_buffer_.Resize(values_to_read * sizeof(int16_t));
+      RETURN_NOT_OK(def_levels_buffer_.Resize(values_to_read * sizeof(int16_t)));
     }
     auto reader = dynamic_cast<TypedColumnReader<BooleanType>*>(column_reader_.get());
     int64_t values_read;
@@ -499,9 +491,9 @@ Status FlatColumnReader::Impl::ReadByteArrayBatch(
   int values_to_read = batch_size;
   BuilderType builder(pool_, field_->type);
   while ((values_to_read > 0) && column_reader_) {
-    values_buffer_.Resize(values_to_read * sizeof(ByteArray));
+    RETURN_NOT_OK(values_buffer_.Resize(values_to_read * sizeof(ByteArray)));
     if (descr_->max_definition_level() > 0) {
-      def_levels_buffer_.Resize(values_to_read * sizeof(int16_t));
+      RETURN_NOT_OK(def_levels_buffer_.Resize(values_to_read * sizeof(int16_t)));
     }
     auto reader = dynamic_cast<TypedColumnReader<ByteArrayType>*>(column_reader_.get());
     int64_t values_read;

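Several hunks above wrap previously unchecked `Resize()` calls in `RETURN_NOT_OK`, because `arrow::Buffer::Resize` reports allocation failure through a `Status` rather than throwing; dropping that Status would silently swallow out-of-memory errors. The propagation pattern looks roughly like this (toy types and a simplified macro, not the Arrow originals):

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct Status {
  bool ok_;
  std::string msg_;
  static Status OK() { return {true, ""}; }
  static Status OutOfMemory(const std::string& m) { return {false, m}; }
  bool ok() const { return ok_; }
};

// Propagate failure to the caller instead of ignoring it.
#define RETURN_NOT_OK(s)     \
  do {                       \
    Status _s = (s);         \
    if (!_s.ok()) return _s; \
  } while (0)

// Toy resizable buffer whose Resize can fail (here: past a fixed cap).
class PoolBuffer {
 public:
  Status Resize(int64_t new_size) {
    if (new_size > kCapacityLimit) return Status::OutOfMemory("too big");
    data_.resize(static_cast<size_t>(new_size));
    return Status::OK();
  }
  int64_t size() const { return static_cast<int64_t>(data_.size()); }

 private:
  static constexpr int64_t kCapacityLimit = 1 << 20;
  std::vector<uint8_t> data_;
};

// Mirrors the reader code: every Resize is checked before the buffer
// contents are used, and the first failure aborts the batch.
Status PrepareBatch(PoolBuffer* values, PoolBuffer* def_levels,
                    int64_t batch_size) {
  RETURN_NOT_OK(values->Resize(batch_size * sizeof(int64_t)));
  RETURN_NOT_OK(def_levels->Resize(batch_size * sizeof(int16_t)));
  return Status::OK();
}
```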
http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/reader.h
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/reader.h b/src/parquet/arrow/reader.h
index c6fc47d..2602824 100644
--- a/src/parquet/arrow/reader.h
+++ b/src/parquet/arrow/reader.h
@@ -22,7 +22,6 @@
 
 #include "parquet/api/reader.h"
 #include "parquet/api/schema.h"
-#include "parquet/arrow/io.h"
 
 #include "arrow/io/interfaces.h"
 
@@ -142,7 +141,7 @@ class PARQUET_EXPORT FlatColumnReader {
 // readable file
 PARQUET_EXPORT
 ::arrow::Status OpenFile(const std::shared_ptr<::arrow::io::ReadableFileInterface>& file,
-    ParquetAllocator* allocator, std::unique_ptr<FileReader>* reader);
+    ::arrow::MemoryPool* allocator, std::unique_ptr<FileReader>* reader);
 
 }  // namespace arrow
 }  // namespace parquet

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/schema.cc
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/schema.cc b/src/parquet/arrow/schema.cc
index 3e5e7d9..8b2a2ab 100644
--- a/src/parquet/arrow/schema.cc
+++ b/src/parquet/arrow/schema.cc
@@ -21,7 +21,6 @@
 #include <vector>
 
 #include "parquet/api/schema.h"
-#include "parquet/arrow/utils.h"
 
 #include "arrow/api.h"
 

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/utils.h
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/utils.h b/src/parquet/arrow/utils.h
deleted file mode 100644
index 9c2abfa..0000000
--- a/src/parquet/arrow/utils.h
+++ /dev/null
@@ -1,54 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements.  See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership.  The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License.  You may obtain a copy of the License at
-//
-//   http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied.  See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-#ifndef PARQUET_ARROW_UTILS_H
-#define PARQUET_ARROW_UTILS_H
-
-#include <sstream>
-
-#include "arrow/status.h"
-#include "parquet/exception.h"
-
-namespace parquet {
-namespace arrow {
-
-#define PARQUET_CATCH_NOT_OK(s)                    \
-  try {                                            \
-    (s);                                           \
-  } catch (const ::parquet::ParquetException& e) { \
-    return ::arrow::Status::IOError(e.what());     \
-  }
-
-#define PARQUET_IGNORE_NOT_OK(s) \
-  try {                          \
-    (s);                         \
-  } catch (const ::parquet::ParquetException& e) {}
-
-#define PARQUET_THROW_NOT_OK(s)               \
-  do {                                        \
-    ::arrow::Status _s = (s);                 \
-    if (!_s.ok()) {                           \
-      std::stringstream ss;                   \
-      ss << "Arrow error: " << _s.ToString(); \
-      ParquetException::Throw(ss.str());      \
-    }                                         \
-  } while (0);
-
-}  // namespace arrow
-}  // namespace parquet
-
-#endif  // PARQUET_ARROW_UTILS_H

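The removed utils.h defined the two bridges between the error models: `PARQUET_CATCH_NOT_OK` turns a thrown `ParquetException` into a returned `Status`, and `PARQUET_THROW_NOT_OK` turns a failed `Status` into a thrown exception. A self-contained sketch of the same pair against toy `Status` and exception types:

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Toy stand-ins for arrow::Status and parquet::ParquetException.
struct Status {
  bool ok_;
  std::string msg_;
  static Status OK() { return {true, ""}; }
  static Status IOError(const std::string& m) { return {false, m}; }
  bool ok() const { return ok_; }
  std::string ToString() const { return msg_; }
};

struct ParquetException : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Exception -> Status: run s, convert a ParquetException into IOError.
#define PARQUET_CATCH_NOT_OK(s)         \
  try {                                 \
    (s);                                \
  } catch (const ParquetException& e) { \
    return Status::IOError(e.what());   \
  }

// Status -> exception: evaluate s, throw if it is not OK.
#define PARQUET_THROW_NOT_OK(s)               \
  do {                                        \
    Status _s = (s);                          \
    if (!_s.ok()) {                           \
      std::stringstream ss;                   \
      ss << "Arrow error: " << _s.ToString(); \
      throw ParquetException(ss.str());       \
    }                                         \
  } while (0)

// Exception-free entry point around code that may throw.
Status CallThrowingCode(bool fail) {
  PARQUET_CATCH_NOT_OK([&] {
    if (fail) throw ParquetException("simulated failure");
  }());
  return Status::OK();
}
```

The do/while wrapper makes the throw macro behave as a single statement inside unbraced `if` branches; this sketch drops the stray trailing semicolon visible in the deleted original.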
http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/arrow/writer.cc
----------------------------------------------------------------------
diff --git a/src/parquet/arrow/writer.cc b/src/parquet/arrow/writer.cc
index b7663a3..f9087ff 100644
--- a/src/parquet/arrow/writer.cc
+++ b/src/parquet/arrow/writer.cc
@@ -22,9 +22,7 @@
 
 #include "parquet/util/logging.h"
 
-#include "parquet/arrow/io.h"
 #include "parquet/arrow/schema.h"
-#include "parquet/arrow/utils.h"
 
 #include "arrow/api.h"
 
@@ -371,8 +369,8 @@ Status WriteFlatTable(const Table* table, MemoryPool* pool,
 Status WriteFlatTable(const Table* table, MemoryPool* pool,
     const std::shared_ptr<::arrow::io::OutputStream>& sink, int64_t chunk_size,
     const std::shared_ptr<WriterProperties>& properties) {
-  auto parquet_sink = std::make_shared<ParquetWriteSink>(sink);
-  return WriteFlatTable(table, pool, parquet_sink, chunk_size, properties);
+  auto wrapper = std::make_shared<ArrowOutputStream>(sink);
+  return WriteFlatTable(table, pool, wrapper, chunk_size, properties);
 }
 
 }  // namespace arrow

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/column/column-io-benchmark.cc
----------------------------------------------------------------------
diff --git a/src/parquet/column/column-io-benchmark.cc b/src/parquet/column/column-io-benchmark.cc
index 3ff9c32..fb491b9 100644
--- a/src/parquet/column/column-io-benchmark.cc
+++ b/src/parquet/column/column-io-benchmark.cc
@@ -21,7 +21,7 @@
 #include "parquet/column/writer.h"
 #include "parquet/file/reader-internal.h"
 #include "parquet/file/writer-internal.h"
-#include "parquet/util/input.h"
+#include "parquet/util/memory.h"
 
 namespace parquet {
 
@@ -67,9 +67,9 @@ static void BM_WriteInt64Column(::benchmark::State& state) {
       properties, schema.get(), reinterpret_cast<uint8_t*>(&thrift_metadata));
 
   while (state.KeepRunning()) {
-    InMemoryOutputStream dst;
+    InMemoryOutputStream stream;
     std::unique_ptr<Int64Writer> writer = BuildWriter(
-        state.range_x(), &dst, metadata.get(), schema.get(), properties.get());
+        state.range_x(), &stream, metadata.get(), schema.get(), properties.get());
     writer->WriteBatch(
         values.size(), definition_levels.data(), repetition_levels.data(), values.data());
     writer->Close();
@@ -102,14 +102,14 @@ static void BM_ReadInt64Column(::benchmark::State& state) {
   auto metadata = ColumnChunkMetaDataBuilder::Make(
       properties, schema.get(), reinterpret_cast<uint8_t*>(&thrift_metadata));
 
-  InMemoryOutputStream dst;
-  std::unique_ptr<Int64Writer> writer =
-      BuildWriter(state.range_x(), &dst, metadata.get(), schema.get(), properties.get());
+  InMemoryOutputStream stream;
+  std::unique_ptr<Int64Writer> writer = BuildWriter(
+      state.range_x(), &stream, metadata.get(), schema.get(), properties.get());
   writer->WriteBatch(
       values.size(), definition_levels.data(), repetition_levels.data(), values.data());
   writer->Close();
 
-  std::shared_ptr<Buffer> src = dst.GetBuffer();
+  std::shared_ptr<Buffer> src = stream.GetBuffer();
   std::vector<int64_t> values_out(state.range_y());
   std::vector<int16_t> definition_levels_out(state.range_y());
   std::vector<int16_t> repetition_levels_out(state.range_y());

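The benchmark writes a column through an `InMemoryOutputStream` and then reads the accumulated bytes back via `GetBuffer()`. Stripped of the Parquet machinery, the sink side of that round trip is roughly (simplified, standard C++ only; the real class returns a shared `Buffer` rather than a copy):

```cpp
#include <cstdint>
#include <vector>

// Toy stand-in for parquet's InMemoryOutputStream: appends each write
// to a growable byte buffer and hands the result back on request.
class InMemoryOutputStream {
 public:
  void Write(const uint8_t* data, int64_t length) {
    data_.insert(data_.end(), data, data + length);
  }

  // Current position equals total bytes written so far.
  int64_t Tell() const { return static_cast<int64_t>(data_.size()); }

  // A copy is enough for the sketch.
  std::vector<uint8_t> GetBuffer() const { return data_; }

 private:
  std::vector<uint8_t> data_;
};
```

This is why the benchmark can measure read performance without any file I/O: the writer's output buffer is handed straight to the reader as its source.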
http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/column/column-reader-test.cc
----------------------------------------------------------------------
diff --git a/src/parquet/column/column-reader-test.cc b/src/parquet/column/column-reader-test.cc
index df45e00..5b27b73 100644
--- a/src/parquet/column/column-reader-test.cc
+++ b/src/parquet/column/column-reader-test.cc
@@ -214,7 +214,7 @@ TEST_F(TestPrimitiveReader, TestDictionaryEncodedPages) {
   max_rep_level_ = 0;
   NodePtr type = schema::Int32("a", Repetition::REQUIRED);
   const ColumnDescriptor descr(type, max_def_level_, max_rep_level_);
-  shared_ptr<OwnedMutableBuffer> dummy = std::make_shared<OwnedMutableBuffer>();
+  shared_ptr<PoolBuffer> dummy = std::make_shared<PoolBuffer>();
 
   shared_ptr<DictionaryPage> dict_page =
       std::make_shared<DictionaryPage>(dummy, 0, Encoding::PLAIN);

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/column/column-writer-test.cc
----------------------------------------------------------------------
diff --git a/src/parquet/column/column-writer-test.cc b/src/parquet/column/column-writer-test.cc
index 5d4daeb..5430005 100644
--- a/src/parquet/column/column-writer-test.cc
+++ b/src/parquet/column/column-writer-test.cc
@@ -26,8 +26,7 @@
 #include "parquet/file/writer-internal.h"
 #include "parquet/types.h"
 #include "parquet/util/comparison.h"
-#include "parquet/util/input.h"
-#include "parquet/util/output.h"
+#include "parquet/util/memory.h"
 
 namespace parquet {
 

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/column/level-benchmark.cc
----------------------------------------------------------------------
diff --git a/src/parquet/column/level-benchmark.cc b/src/parquet/column/level-benchmark.cc
index c511c36..8ae2fe1 100644
--- a/src/parquet/column/level-benchmark.cc
+++ b/src/parquet/column/level-benchmark.cc
@@ -18,7 +18,7 @@
 #include "benchmark/benchmark.h"
 
 #include "parquet/column/levels.h"
-#include "parquet/util/buffer.h"
+#include "parquet/util/memory.h"
 
 namespace parquet {
 
@@ -31,7 +31,8 @@ static void BM_RleEncoding(::benchmark::State& state) {
       [&state, &n] { return (n++ % state.range_y()) == 0; });
   int16_t max_level = 1;
   int64_t rle_size = LevelEncoder::MaxBufferSize(Encoding::RLE, max_level, levels.size());
-  auto buffer_rle = std::make_shared<OwnedMutableBuffer>(rle_size);
+  auto buffer_rle = std::make_shared<PoolBuffer>();
+  PARQUET_THROW_NOT_OK(buffer_rle->Resize(rle_size));
 
   while (state.KeepRunning()) {
     LevelEncoder level_encoder;
@@ -53,7 +54,8 @@ static void BM_RleDecoding(::benchmark::State& state) {
       [&state, &n] { return (n++ % state.range_y()) == 0; });
   int16_t max_level = 1;
   int64_t rle_size = LevelEncoder::MaxBufferSize(Encoding::RLE, max_level, levels.size());
-  auto buffer_rle = std::make_shared<OwnedMutableBuffer>(rle_size + sizeof(uint32_t));
+  auto buffer_rle = std::make_shared<PoolBuffer>();
+  PARQUET_THROW_NOT_OK(buffer_rle->Resize(rle_size + sizeof(uint32_t)));
   level_encoder.Init(Encoding::RLE, max_level, levels.size(),
       buffer_rle->mutable_data() + sizeof(uint32_t), rle_size);
   level_encoder.Encode(levels.size(), levels.data());

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/column/page.h
----------------------------------------------------------------------
diff --git a/src/parquet/column/page.h b/src/parquet/column/page.h
index d395480..6670e7f 100644
--- a/src/parquet/column/page.h
+++ b/src/parquet/column/page.h
@@ -28,7 +28,7 @@
 
 #include "parquet/column/statistics.h"
 #include "parquet/types.h"
-#include "parquet/util/buffer.h"
+#include "parquet/util/memory.h"
 
 namespace parquet {
 

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/column/properties.h
----------------------------------------------------------------------
diff --git a/src/parquet/column/properties.h b/src/parquet/column/properties.h
index f5f2fd5..cf89226 100644
--- a/src/parquet/column/properties.h
+++ b/src/parquet/column/properties.h
@@ -25,8 +25,7 @@
 #include "parquet/exception.h"
 #include "parquet/schema/types.h"
 #include "parquet/types.h"
-#include "parquet/util/input.h"
-#include "parquet/util/mem-allocator.h"
+#include "parquet/util/memory.h"
 #include "parquet/util/visibility.h"
 
 namespace parquet {
@@ -46,7 +45,7 @@ class PARQUET_EXPORT ReaderProperties {
     buffer_size_ = DEFAULT_BUFFER_SIZE;
   }
 
-  MemoryAllocator* allocator() { return allocator_; }
+  MemoryAllocator* allocator() const { return allocator_; }
 
   std::unique_ptr<InputStream> GetStream(
       RandomAccessSource* source, int64_t start, int64_t num_bytes) {

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/column/reader.h
----------------------------------------------------------------------
diff --git a/src/parquet/column/reader.h b/src/parquet/column/reader.h
index d759b96..bf567d9 100644
--- a/src/parquet/column/reader.h
+++ b/src/parquet/column/reader.h
@@ -31,7 +31,7 @@
 #include "parquet/exception.h"
 #include "parquet/schema/descriptor.h"
 #include "parquet/types.h"
-#include "parquet/util/mem-allocator.h"
+#include "parquet/util/memory.h"
 #include "parquet/util/visibility.h"
 
 namespace parquet {
@@ -221,12 +221,15 @@ inline int64_t TypedColumnReader<DType>::Skip(int64_t num_rows_to_skip) {
       // Jump to the right offset in the Page
       int64_t batch_size = 1024;  // ReadBatch with a smaller memory footprint
       int64_t values_read = 0;
-      auto vals = std::make_shared<OwnedMutableBuffer>(
-          batch_size * type_traits<DType::type_num>::value_byte_size, this->allocator_);
-      auto def_levels = std::make_shared<OwnedMutableBuffer>(
-          batch_size * sizeof(int16_t), this->allocator_);
-      auto rep_levels = std::make_shared<OwnedMutableBuffer>(
-          batch_size * sizeof(int16_t), this->allocator_);
+
+      std::shared_ptr<PoolBuffer> vals = AllocateBuffer(
+          this->allocator_, batch_size * type_traits<DType::type_num>::value_byte_size);
+      std::shared_ptr<PoolBuffer> def_levels =
+          AllocateBuffer(this->allocator_, batch_size * sizeof(int16_t));
+
+      std::shared_ptr<PoolBuffer> rep_levels =
+          AllocateBuffer(this->allocator_, batch_size * sizeof(int16_t));
+
       do {
         batch_size = std::min(batch_size, rows_to_skip);
         values_read =

http://git-wip-us.apache.org/repos/asf/parquet-cpp/blob/2154e873/src/parquet/column/scanner.cc
----------------------------------------------------------------------
diff --git a/src/parquet/column/scanner.cc b/src/parquet/column/scanner.cc
index 8db3d2b..faf99a0 100644
--- a/src/parquet/column/scanner.cc
+++ b/src/parquet/column/scanner.cc
@@ -21,6 +21,7 @@
 #include <memory>
 
 #include "parquet/column/reader.h"
+#include "parquet/util/memory.h"
 
 namespace parquet {
 

