arrow-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From w...@apache.org
Subject [06/18] arrow git commit: ARROW-1629: [C++] Add miscellaneous DCHECKs and minor changes based on infer tool output
Date Tue, 03 Oct 2017 12:59:48 GMT
ARROW-1629: [C++] Add miscellaneous DCHECKs and minor changes based on infer tool output

This was an interesting journey through some esoterica. I went through all the warnings/errors
that infer (fbinfer.com) outputs and made changes if it seemed warranted. Some of the checks
might be overkill.

See https://gist.github.com/wesm/fc6809e4f4aaef3ecfeb21b8123627bc for a summary of actions
on each warning

Most of the errors that Infer wasn't happy about were already addressed by DCHECKs. This was
useful to go through all these cases -- in nearly all cases the null references are impossible
or would be the result of an error on behalf of the application programmer. For example: we
do not do array boundschecking in most cases in production builds, but these boundschecks
are included in debug builds to assist with catching bugs caused by improper use by application
developers.

As a matter of convention, we should not use error `Status` to do parameter validation or
asserting pre-conditions that are the responsibility of the library user. If parameter validation
is required in binding code (e.g. Python), then this validation should happen in the binding
layer, not in the core C++ library.

There are some other cases where we have a `std::shared_ptr<T>` out variable with code
like:

```
RETURN_NOT_OK(Foo(..., &out));
out->Method(...);
```

Here, infer complains that `out` could contain a null pointer, but our contract with developers
is that if `Foo` returns successfully that `out` is non-null.

Interestingly, infer doesn't like some stack variables that are bound in C++11 lambda expressions.
I noted these in the gist with `LAMBDA`.

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes #1151 from wesm/fix-infer-issues and squashes the following commits:

f285be95 [Wes McKinney] Restore code paths for empty chunked arrays for backwards compat
5aa86ce2 [Wes McKinney] More DCHECK esoterica / tweaks based on infer report
22c5d361 [Wes McKinney] Address a couple more infer warnings
75131a6b [Wes McKinney] Some more minor infer fixes
5ff3e3a5 [Wes McKinney] Compilation fix
05316ce4 [Wes McKinney] Fix miscellaneous things that infer does not like. Make some Python
helper functions internal / non-exported


Project: http://git-wip-us.apache.org/repos/asf/arrow/repo
Commit: http://git-wip-us.apache.org/repos/asf/arrow/commit/811e668e
Tree: http://git-wip-us.apache.org/repos/asf/arrow/tree/811e668e
Diff: http://git-wip-us.apache.org/repos/asf/arrow/diff/811e668e

Branch: refs/heads/master
Commit: 811e668e8a2af7ed4d0f2f237a223a7297f4bdc9
Parents: 9aa6eb5
Author: Wes McKinney <wes.mckinney@twosigma.com>
Authored: Sun Oct 1 19:32:53 2017 -0500
Committer: Wes McKinney <wes.mckinney@twosigma.com>
Committed: Tue Oct 3 08:59:22 2017 -0400

----------------------------------------------------------------------
 cpp/src/arrow/array.cc                   | 12 ++++++++++--
 cpp/src/arrow/array.h                    |  2 +-
 cpp/src/arrow/buffer.h                   |  2 +-
 cpp/src/arrow/io/hdfs-internal.cc        |  2 ++
 cpp/src/arrow/ipc/message.cc             | 10 ++++++++++
 cpp/src/arrow/ipc/reader.cc              |  8 +++++++-
 cpp/src/arrow/memory_pool.cc             |  3 ++-
 cpp/src/arrow/python/CMakeLists.txt      |  2 +-
 cpp/src/arrow/python/arrow_to_pandas.cc  |  7 ++++---
 cpp/src/arrow/python/arrow_to_python.cc  |  4 ++++
 cpp/src/arrow/python/builtin_convert.cc  |  2 +-
 cpp/src/arrow/python/helpers.cc          | 12 ++++++------
 cpp/src/arrow/python/helpers.h           | 24 +++++++++++++-----------
 cpp/src/arrow/python/numpy_to_arrow.cc   | 25 +++++++++++++------------
 cpp/src/arrow/python/python-test.cc      |  6 +++---
 cpp/src/arrow/table-test.cc              |  4 ----
 cpp/src/arrow/table.cc                   |  9 ++++++---
 cpp/src/arrow/table.h                    |  2 +-
 cpp/src/arrow/util/bit-util.cc           |  7 ++++---
 cpp/src/arrow/util/key_value_metadata.cc |  6 ++++--
 cpp/src/arrow/util/logging.h             |  2 +-
 cpp/src/plasma/client.cc                 |  1 +
 cpp/src/plasma/fling.cc                  |  3 +++
 cpp/src/plasma/io.cc                     |  1 +
 cpp/src/plasma/store.cc                  |  7 +++----
 25 files changed, 102 insertions(+), 61 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/array.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/array.cc b/cpp/src/arrow/array.cc
index 92e8d0f..12922ae 100644
--- a/cpp/src/arrow/array.cc
+++ b/cpp/src/arrow/array.cc
@@ -224,6 +224,8 @@ Status ListArray::FromArrays(const Array& offsets, const Array&
values,
 
 void ListArray::SetData(const std::shared_ptr<ArrayData>& data) {
   this->Array::SetData(data);
+  DCHECK_EQ(data->buffers.size(), 2);
+
   auto value_offsets = data->buffers[1];
   raw_value_offsets_ = value_offsets == nullptr
                            ? nullptr
@@ -246,6 +248,7 @@ BinaryArray::BinaryArray(const std::shared_ptr<ArrayData>& data)
{
 }
 
 void BinaryArray::SetData(const std::shared_ptr<ArrayData>& data) {
+  DCHECK_EQ(data->buffers.size(), 3);
   auto value_offsets = data->buffers[1];
   auto value_data = data->buffers[2];
   this->Array::SetData(data);
@@ -342,6 +345,7 @@ std::shared_ptr<Array> StructArray::field(int i) const {
   if (!boxed_fields_[i]) {
     DCHECK(MakeArray(data_->child_data[i], &boxed_fields_[i]).ok());
   }
+  DCHECK(boxed_fields_[i]);
   return boxed_fields_[i];
 }
 
@@ -351,6 +355,8 @@ std::shared_ptr<Array> StructArray::field(int i) const {
 void UnionArray::SetData(const std::shared_ptr<ArrayData>& data) {
   this->Array::SetData(data);
 
+  DCHECK_EQ(data->buffers.size(), 3);
+
   auto type_ids = data_->buffers[1];
   auto value_offsets = data_->buffers[2];
   raw_type_ids_ =
@@ -385,6 +391,7 @@ std::shared_ptr<Array> UnionArray::child(int i) const {
   if (!boxed_fields_[i]) {
     DCHECK(MakeArray(data_->child_data[i], &boxed_fields_[i]).ok());
   }
+  DCHECK(boxed_fields_[i]);
   return boxed_fields_[i];
 }
 
@@ -594,10 +601,11 @@ class ArrayDataWrapper {
 
 }  // namespace internal
 
-// Remove enclosing namespace after 0.7.0
 Status MakeArray(const std::shared_ptr<ArrayData>& data, std::shared_ptr<Array>*
out) {
   internal::ArrayDataWrapper wrapper_visitor(data, out);
-  return VisitTypeInline(*data->type, &wrapper_visitor);
+  RETURN_NOT_OK(VisitTypeInline(*data->type, &wrapper_visitor));
+  DCHECK(out);
+  return Status::OK();
 }
 
 #ifndef ARROW_NO_DEPRECATED_API

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/array.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/array.h b/cpp/src/arrow/array.h
index ee29d95..4ad60eb 100644
--- a/cpp/src/arrow/array.h
+++ b/cpp/src/arrow/array.h
@@ -84,7 +84,7 @@ struct Decimal;
 /// input array and replace them with newly-allocated data, changing the output
 /// data type as well.
 struct ARROW_EXPORT ArrayData {
-  ArrayData() {}
+  ArrayData() : length(0) {}
 
   ArrayData(const std::shared_ptr<DataType>& type, int64_t length,
             int64_t null_count = kUnknownNullCount, int64_t offset = 0)

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/buffer.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/buffer.h b/cpp/src/arrow/buffer.h
index dbd9376..5f61ade 100644
--- a/cpp/src/arrow/buffer.h
+++ b/cpp/src/arrow/buffer.h
@@ -201,7 +201,7 @@ class ARROW_EXPORT BufferBuilder {
     if (elements == 0) {
       return Status::OK();
     }
-    if (capacity_ == 0) {
+    if (buffer_ == nullptr) {
       buffer_ = std::make_shared<PoolBuffer>(pool_);
     }
     int64_t old_capacity = capacity_;

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/io/hdfs-internal.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/io/hdfs-internal.cc b/cpp/src/arrow/io/hdfs-internal.cc
index e6d0487..9cd1c50 100644
--- a/cpp/src/arrow/io/hdfs-internal.cc
+++ b/cpp/src/arrow/io/hdfs-internal.cc
@@ -44,6 +44,7 @@
 #include <boost/filesystem.hpp>  // NOLINT
 
 #include "arrow/status.h"
+#include "arrow/util/logging.h"
 
 namespace fs = boost::filesystem;
 
@@ -346,6 +347,7 @@ bool LibHdfsShim::HasPread() {
 tSize LibHdfsShim::Pread(hdfsFS fs, hdfsFile file, tOffset position, void* buffer,
                          tSize length) {
   GET_SYMBOL(this, hdfsPread);
+  DCHECK(this->hdfsPread);
   return this->hdfsPread(fs, file, position, buffer, length);
 }
 

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/ipc/message.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/ipc/message.cc b/cpp/src/arrow/ipc/message.cc
index 082c925..0c587ab 100644
--- a/cpp/src/arrow/ipc/message.cc
+++ b/cpp/src/arrow/ipc/message.cc
@@ -29,6 +29,7 @@
 #include "arrow/ipc/Schema_generated.h"
 #include "arrow/ipc/metadata-internal.h"
 #include "arrow/status.h"
+#include "arrow/util/logging.h"
 
 namespace arrow {
 namespace ipc {
@@ -194,9 +195,18 @@ std::string FormatMessageType(Message::Type type) {
 
 Status ReadMessage(int64_t offset, int32_t metadata_length, io::RandomAccessFile* file,
                    std::unique_ptr<Message>* message) {
+  DCHECK_GT(static_cast<size_t>(metadata_length), sizeof(int32_t));
+
   std::shared_ptr<Buffer> buffer;
   RETURN_NOT_OK(file->ReadAt(offset, metadata_length, &buffer));
 
+  if (buffer->size() < metadata_length) {
+    std::stringstream ss;
+    ss << "Expected to read " << metadata_length << " metadata bytes but
got "
+       << buffer->size();
+    return Status::Invalid(ss.str());
+  }
+
   int32_t flatbuffer_size = *reinterpret_cast<const int32_t*>(buffer->data());
 
   if (flatbuffer_size + static_cast<int>(sizeof(int32_t)) > metadata_length) {

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/ipc/reader.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/ipc/reader.cc b/cpp/src/arrow/ipc/reader.cc
index 09def6e..e6ba50e 100644
--- a/cpp/src/arrow/ipc/reader.cc
+++ b/cpp/src/arrow/ipc/reader.cc
@@ -349,7 +349,6 @@ Status ReadDictionary(const Buffer& metadata, const DictionaryTypeMap&
dictionar
       reinterpret_cast<const flatbuf::RecordBatch*>(dictionary_batch->data());
   RETURN_NOT_OK(
       ReadRecordBatch(batch_meta, dummy_schema, kMaxNestingDepth, file, &batch));
-
   if (batch->num_columns() != 1) {
     return Status::Invalid("Dictionary record batch must only contain one field");
   }
@@ -526,6 +525,13 @@ class RecordBatchFileReader::RecordBatchFileReaderImpl {
     int file_end_size = static_cast<int>(magic_size + sizeof(int32_t));
     RETURN_NOT_OK(file_->ReadAt(footer_offset_ - file_end_size, file_end_size, &buffer));
 
+    const int64_t expected_footer_size = magic_size + sizeof(int32_t);
+    if (buffer->size() < expected_footer_size) {
+      std::stringstream ss;
+      ss << "Unable to read " << expected_footer_size << "from end of file";
+      return Status::Invalid(ss.str());
+    }
+
     if (memcmp(buffer->data() + sizeof(int32_t), kArrowMagicBytes, magic_size)) {
       return Status::Invalid("Not an Arrow file");
     }

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/memory_pool.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/memory_pool.cc b/cpp/src/arrow/memory_pool.cc
index d86fb08..851065b 100644
--- a/cpp/src/arrow/memory_pool.cc
+++ b/cpp/src/arrow/memory_pool.cc
@@ -112,8 +112,9 @@ Status DefaultMemoryPool::Reallocate(int64_t old_size, int64_t new_size,
uint8_t
   // Note: We cannot use realloc() here as it doesn't guarantee alignment.
 
   // Allocate new chunk
-  uint8_t* out;
+  uint8_t* out = nullptr;
   RETURN_NOT_OK(AllocateAligned(new_size, &out));
+  DCHECK(out);
   // Copy contents and release old memory chunk
   memcpy(out, *ptr, static_cast<size_t>(std::min(new_size, old_size)));
 #ifdef _MSC_VER

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/python/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/python/CMakeLists.txt b/cpp/src/arrow/python/CMakeLists.txt
index 7938d84..af53a16 100644
--- a/cpp/src/arrow/python/CMakeLists.txt
+++ b/cpp/src/arrow/python/CMakeLists.txt
@@ -40,7 +40,7 @@ endif()
 
 set(ARROW_PYTHON_MIN_TEST_LIBS
   arrow_python_test_main
-  arrow_python_shared
+  arrow_python_static
   arrow_shared)
 
 set(ARROW_PYTHON_TEST_LINK_LIBS ${ARROW_PYTHON_MIN_TEST_LIBS})

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/python/arrow_to_pandas.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/python/arrow_to_pandas.cc b/cpp/src/arrow/python/arrow_to_pandas.cc
index be738e7..88b594c 100644
--- a/cpp/src/arrow/python/arrow_to_pandas.cc
+++ b/cpp/src/arrow/python/arrow_to_pandas.cc
@@ -615,8 +615,8 @@ static Status ConvertDecimals(PandasOptions options, const ChunkedArray&
data,
   PyAcquireGIL lock;
   OwnedRef decimal_ref;
   OwnedRef Decimal_ref;
-  RETURN_NOT_OK(ImportModule("decimal", &decimal_ref));
-  RETURN_NOT_OK(ImportFromModule(decimal_ref, "Decimal", &Decimal_ref));
+  RETURN_NOT_OK(internal::ImportModule("decimal", &decimal_ref));
+  RETURN_NOT_OK(internal::ImportFromModule(decimal_ref, "Decimal", &Decimal_ref));
   PyObject* Decimal = Decimal_ref.obj();
 
   for (int c = 0; c < data.num_chunks(); c++) {
@@ -633,7 +633,8 @@ static Status ConvertDecimals(PandasOptions options, const ChunkedArray&
data,
         const uint8_t* raw_value = arr->GetValue(i);
         std::string decimal_string;
         RETURN_NOT_OK(RawDecimalToString(raw_value, precision, scale, &decimal_string));
-        RETURN_NOT_OK(DecimalFromString(Decimal, decimal_string, out_values++));
+        *out_values++ = internal::DecimalFromString(Decimal, decimal_string);
+        RETURN_IF_PYERROR();
       }
     }
   }

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/python/arrow_to_python.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/python/arrow_to_python.cc b/cpp/src/arrow/python/arrow_to_python.cc
index a281fe3..b4f4a41 100644
--- a/cpp/src/arrow/python/arrow_to_python.cc
+++ b/cpp/src/arrow/python/arrow_to_python.cc
@@ -59,6 +59,10 @@ Status DeserializeDict(PyObject* context, const Array& array, int64_t
start_idx,
   const auto& data = static_cast<const StructArray&>(array);
   ScopedRef keys, vals;
   ScopedRef result(PyDict_New());
+  RETURN_IF_PYERROR();
+
+  DCHECK_EQ(2, data.num_fields());
+
   RETURN_NOT_OK(DeserializeList(context, *data.field(0), start_idx, stop_idx, base,
                                 tensors, keys.ref()));
   RETURN_NOT_OK(DeserializeList(context, *data.field(1), start_idx, stop_idx, base,

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/python/builtin_convert.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/python/builtin_convert.cc b/cpp/src/arrow/python/builtin_convert.cc
index f9d7361..69a19e7 100644
--- a/cpp/src/arrow/python/builtin_convert.cc
+++ b/cpp/src/arrow/python/builtin_convert.cc
@@ -656,7 +656,7 @@ class DecimalConverter
   inline Status AppendItem(const OwnedRef& item) {
     /// TODO(phillipc): Check for nan?
     std::string string;
-    RETURN_NOT_OK(PythonDecimalToString(item.obj(), &string));
+    RETURN_NOT_OK(internal::PythonDecimalToString(item.obj(), &string));
 
     Decimal128 value;
     RETURN_NOT_OK(Decimal128::FromString(string, &value));

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/python/helpers.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/python/helpers.cc b/cpp/src/arrow/python/helpers.cc
index fb2fed7..ad6a7f1 100644
--- a/cpp/src/arrow/python/helpers.cc
+++ b/cpp/src/arrow/python/helpers.cc
@@ -55,6 +55,8 @@ std::shared_ptr<DataType> GetPrimitiveType(Type::type type) {
   }
 }
 
+namespace internal {
+
 Status ImportModule(const std::string& module_name, OwnedRef* ref) {
   PyObject* module = PyImport_ImportModule(module_name.c_str());
   RETURN_IF_PYERROR();
@@ -106,10 +108,9 @@ Status InferDecimalPrecisionAndScale(PyObject* python_decimal, int* precision,
   return Decimal128::FromString(c_string, nullptr, precision, scale);
 }
 
-Status DecimalFromString(PyObject* decimal_constructor, const std::string& decimal_string,
-                         PyObject** out) {
+PyObject* DecimalFromString(PyObject* decimal_constructor,
+                            const std::string& decimal_string) {
   DCHECK_NE(decimal_constructor, nullptr);
-  DCHECK_NE(out, nullptr);
 
   auto string_size = decimal_string.size();
   DCHECK_GT(string_size, 0);
@@ -117,11 +118,10 @@ Status DecimalFromString(PyObject* decimal_constructor, const std::string&
decim
   auto string_bytes = decimal_string.c_str();
   DCHECK_NE(string_bytes, nullptr);
 
-  *out = PyObject_CallFunction(decimal_constructor, const_cast<char*>("s#"), string_bytes,
+  return PyObject_CallFunction(decimal_constructor, const_cast<char*>("s#"), string_bytes,
                                string_size);
-  RETURN_IF_PYERROR();
-  return Status::OK();
 }
 
+}  // namespace internal
 }  // namespace py
 }  // namespace arrow

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/python/helpers.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/python/helpers.h b/cpp/src/arrow/python/helpers.h
index 8b8c667..01ab916 100644
--- a/cpp/src/arrow/python/helpers.h
+++ b/cpp/src/arrow/python/helpers.h
@@ -28,26 +28,28 @@
 #include "arrow/util/visibility.h"
 
 namespace arrow {
-
 namespace py {
 
 class OwnedRef;
 
-ARROW_EXPORT std::shared_ptr<DataType> GetPrimitiveType(Type::type type);
+ARROW_EXPORT
+std::shared_ptr<DataType> GetPrimitiveType(Type::type type);
+
+namespace internal {
 
-Status ARROW_EXPORT ImportModule(const std::string& module_name, OwnedRef* ref);
-Status ARROW_EXPORT ImportFromModule(const OwnedRef& module,
-                                     const std::string& module_name, OwnedRef* ref);
+Status ImportModule(const std::string& module_name, OwnedRef* ref);
+Status ImportFromModule(const OwnedRef& module, const std::string& module_name,
+                        OwnedRef* ref);
 
-Status ARROW_EXPORT PythonDecimalToString(PyObject* python_decimal, std::string* out);
+Status PythonDecimalToString(PyObject* python_decimal, std::string* out);
 
-Status ARROW_EXPORT InferDecimalPrecisionAndScale(PyObject* python_decimal,
-                                                  int* precision = nullptr,
-                                                  int* scale = nullptr);
+Status InferDecimalPrecisionAndScale(PyObject* python_decimal, int* precision = nullptr,
+                                     int* scale = nullptr);
 
-Status ARROW_EXPORT DecimalFromString(PyObject* decimal_constructor,
-                                      const std::string& decimal_string, PyObject** out);
+PyObject* DecimalFromString(PyObject* decimal_constructor,
+                            const std::string& decimal_string);
 
+}  // namespace internal
 }  // namespace py
 }  // namespace arrow
 

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/python/numpy_to_arrow.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/python/numpy_to_arrow.cc b/cpp/src/arrow/python/numpy_to_arrow.cc
index 8845ee7..c0ce61c 100644
--- a/cpp/src/arrow/python/numpy_to_arrow.cc
+++ b/cpp/src/arrow/python/numpy_to_arrow.cc
@@ -652,8 +652,8 @@ Status NumPyConverter::ConvertDecimals() {
   // Import the decimal module and Decimal class
   OwnedRef decimal;
   OwnedRef Decimal;
-  RETURN_NOT_OK(ImportModule("decimal", &decimal));
-  RETURN_NOT_OK(ImportFromModule(decimal, "Decimal", &Decimal));
+  RETURN_NOT_OK(internal::ImportModule("decimal", &decimal));
+  RETURN_NOT_OK(internal::ImportFromModule(decimal, "Decimal", &Decimal));
 
   Ndarray1DIndexer<PyObject*> objects(arr_);
   PyObject* object = objects[0];
@@ -661,7 +661,7 @@ Status NumPyConverter::ConvertDecimals() {
   int precision;
   int scale;
 
-  RETURN_NOT_OK(InferDecimalPrecisionAndScale(object, &precision, &scale));
+  RETURN_NOT_OK(internal::InferDecimalPrecisionAndScale(object, &precision, &scale));
 
   type_ = std::make_shared<DecimalType>(precision, scale);
 
@@ -672,7 +672,7 @@ Status NumPyConverter::ConvertDecimals() {
     object = objects[i];
     if (PyObject_IsInstance(object, Decimal.obj())) {
       std::string string;
-      RETURN_NOT_OK(PythonDecimalToString(object, &string));
+      RETURN_NOT_OK(internal::PythonDecimalToString(object, &string));
 
       Decimal128 value;
       RETURN_NOT_OK(Decimal128::FromString(string, &value));
@@ -823,7 +823,7 @@ Status NumPyConverter::ConvertObjectFixedWidthBytes(
     const std::shared_ptr<DataType>& type) {
   PyAcquireGIL lock;
 
-  int32_t byte_width = static_cast<const FixedSizeBinaryType&>(*type).byte_width();
+  const int32_t byte_width = static_cast<const FixedSizeBinaryType&>(*type).byte_width();
 
   // The output type at this point is inconclusive because there may be bytes
   // and unicode mixed in the object array
@@ -893,8 +893,8 @@ Status NumPyConverter::ConvertObjectsInfer() {
 
   OwnedRef decimal;
   OwnedRef Decimal;
-  RETURN_NOT_OK(ImportModule("decimal", &decimal));
-  RETURN_NOT_OK(ImportFromModule(decimal, "Decimal", &Decimal));
+  RETURN_NOT_OK(internal::ImportModule("decimal", &decimal));
+  RETURN_NOT_OK(internal::ImportFromModule(decimal, "Decimal", &Decimal));
 
   for (int64_t i = 0; i < length_; ++i) {
     PyObject* obj = objects[i];
@@ -935,7 +935,7 @@ Status NumPyConverter::ConvertObjectsInfer() {
 Status NumPyConverter::ConvertObjectsInferAndCast() {
   size_t position = out_arrays_.size();
   RETURN_NOT_OK(ConvertObjectsInfer());
-
+  DCHECK_EQ(position + 1, out_arrays_.size());
   std::shared_ptr<Array> arr = out_arrays_[position];
 
   // Perform cast
@@ -1182,10 +1182,10 @@ Status NumPyConverter::ConvertLists(const std::shared_ptr<DataType>&
type,
     LIST_CASE(DOUBLE, NPY_DOUBLE, DoubleType)
     LIST_CASE(STRING, NPY_OBJECT, StringType)
     case Type::LIST: {
-      const ListType& list_type = static_cast<const ListType&>(*type);
+      const auto& list_type = static_cast<const ListType&>(*type);
       auto value_builder = static_cast<ListBuilder*>(builder->value_builder());
 
-      auto foreach_item = [&](PyObject* object) {
+      auto foreach_item = [this, &builder, &value_builder, &list_type](PyObject*
object) {
         if (PandasObjectIsNull(object)) {
           return builder->AppendNull();
         } else {
@@ -1219,8 +1219,9 @@ Status NdarrayToArrow(MemoryPool* pool, PyObject* ao, PyObject* mo,
                       std::shared_ptr<ChunkedArray>* out) {
   NumPyConverter converter(pool, ao, mo, type, use_pandas_null_sentinels);
   RETURN_NOT_OK(converter.Convert());
-  DCHECK(converter.result()[0]);
-  *out = std::make_shared<ChunkedArray>(converter.result());
+  const auto& output_arrays = converter.result();
+  DCHECK_GT(output_arrays.size(), 0);
+  *out = std::make_shared<ChunkedArray>(output_arrays);
   return Status::OK();
 }
 

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/python/python-test.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/python/python-test.cc b/cpp/src/arrow/python/python-test.cc
index e1796c0..86391a1 100644
--- a/cpp/src/arrow/python/python-test.cc
+++ b/cpp/src/arrow/python/python-test.cc
@@ -39,10 +39,10 @@ TEST(DecimalTest, TestPythonDecimalToString) {
 
   OwnedRef decimal;
   OwnedRef Decimal;
-  ASSERT_OK(ImportModule("decimal", &decimal));
+  ASSERT_OK(internal::ImportModule("decimal", &decimal));
   ASSERT_NE(decimal.obj(), nullptr);
 
-  ASSERT_OK(ImportFromModule(decimal, "Decimal", &Decimal));
+  ASSERT_OK(internal::ImportFromModule(decimal, "Decimal", &Decimal));
   ASSERT_NE(Decimal.obj(), nullptr);
 
   std::string decimal_string("-39402950693754869342983");
@@ -61,7 +61,7 @@ TEST(DecimalTest, TestPythonDecimalToString) {
   ASSERT_NE(python_object, nullptr);
 
   std::string string_result;
-  ASSERT_OK(PythonDecimalToString(python_object, &string_result));
+  ASSERT_OK(internal::PythonDecimalToString(python_object, &string_result));
 }
 
 TEST(PandasConversionTest, TestObjectBlockWriteFails) {

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/table-test.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/table-test.cc b/cpp/src/arrow/table-test.cc
index b0aeed1..b490310 100644
--- a/cpp/src/arrow/table-test.cc
+++ b/cpp/src/arrow/table-test.cc
@@ -140,10 +140,6 @@ TEST_F(TestColumn, BasicAPI) {
   ASSERT_EQ(300, column_->length());
   ASSERT_EQ(30, column_->null_count());
   ASSERT_EQ(3, column_->data()->num_chunks());
-
-  // nullptr array should not break
-  column_.reset(new Column(f0, std::shared_ptr<Array>(nullptr)));
-  ASSERT_NE(column_.get(), nullptr);
 }
 
 TEST_F(TestColumn, ChunksInhomogeneous) {

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/table.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/table.cc b/cpp/src/arrow/table.cc
index d0bbe7e..009b5cf 100644
--- a/cpp/src/arrow/table.cc
+++ b/cpp/src/arrow/table.cc
@@ -42,6 +42,8 @@ ChunkedArray::ChunkedArray(const ArrayVector& chunks) : chunks_(chunks)
{
   }
 }
 
+std::shared_ptr<DataType> ChunkedArray::type() const { return chunks_[0]->type();
}
+
 bool ChunkedArray::Equals(const ChunkedArray& other) const {
   if (length_ != other.length()) {
     return false;
@@ -105,10 +107,10 @@ Column::Column(const std::shared_ptr<Field>& field, const
ArrayVector& chunks)
 
 Column::Column(const std::shared_ptr<Field>& field, const std::shared_ptr<Array>&
data)
     : field_(field) {
-  if (data) {
-    data_ = std::make_shared<ChunkedArray>(ArrayVector({data}));
-  } else {
+  if (!data) {
     data_ = std::make_shared<ChunkedArray>(ArrayVector({}));
+  } else {
+    data_ = std::make_shared<ChunkedArray>(ArrayVector({data}));
   }
 }
 
@@ -192,6 +194,7 @@ std::shared_ptr<Array> RecordBatch::column(int i) const {
   if (!boxed_columns_[i]) {
     DCHECK(MakeArray(columns_[i], &boxed_columns_[i]).ok());
   }
+  DCHECK(boxed_columns_[i]);
   return boxed_columns_[i];
 }
 

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/table.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/table.h b/cpp/src/arrow/table.h
index 85fa234..324112b 100644
--- a/cpp/src/arrow/table.h
+++ b/cpp/src/arrow/table.h
@@ -53,7 +53,7 @@ class ARROW_EXPORT ChunkedArray {
 
   const ArrayVector& chunks() const { return chunks_; }
 
-  std::shared_ptr<DataType> type() const { return chunks_[0]->type(); }
+  std::shared_ptr<DataType> type() const;
 
   bool Equals(const ChunkedArray& other) const;
   bool Equals(const std::shared_ptr<ChunkedArray>& other) const;

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/util/bit-util.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/bit-util.cc b/cpp/src/arrow/util/bit-util.cc
index 15bf359..e0116cc 100644
--- a/cpp/src/arrow/util/bit-util.cc
+++ b/cpp/src/arrow/util/bit-util.cc
@@ -31,6 +31,7 @@
 #include "arrow/memory_pool.h"
 #include "arrow/status.h"
 #include "arrow/util/bit-util.h"
+#include "arrow/util/logging.h"
 
 namespace arrow {
 
@@ -48,9 +49,9 @@ Status BitUtil::BytesToBits(const std::vector<uint8_t>& bytes,
MemoryPool* pool,
 
   std::shared_ptr<Buffer> buffer;
   RETURN_NOT_OK(AllocateBuffer(pool, bit_length, &buffer));
-
-  memset(buffer->mutable_data(), 0, static_cast<size_t>(bit_length));
-  FillBitsFromBytes(bytes, buffer->mutable_data());
+  uint8_t* out_buf = buffer->mutable_data();
+  memset(out_buf, 0, static_cast<size_t>(bit_length));
+  FillBitsFromBytes(bytes, out_buf);
 
   *out = buffer;
   return Status::OK();

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/util/key_value_metadata.cc
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/key_value_metadata.cc b/cpp/src/arrow/util/key_value_metadata.cc
index cf74ddf..4f37953 100644
--- a/cpp/src/arrow/util/key_value_metadata.cc
+++ b/cpp/src/arrow/util/key_value_metadata.cc
@@ -89,12 +89,14 @@ int64_t KeyValueMetadata::size() const {
 
 std::string KeyValueMetadata::key(int64_t i) const {
   DCHECK_GE(i, 0);
-  return keys_[static_cast<size_t>(i)];
+  DCHECK_LT(static_cast<size_t>(i), keys_.size());
+  return keys_[i];
 }
 
 std::string KeyValueMetadata::value(int64_t i) const {
   DCHECK_GE(i, 0);
-  return values_[static_cast<size_t>(i)];
+  DCHECK_LT(static_cast<size_t>(i), values_.size());
+  return values_[i];
 }
 
 std::shared_ptr<KeyValueMetadata> KeyValueMetadata::Copy() const {

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/arrow/util/logging.h
----------------------------------------------------------------------
diff --git a/cpp/src/arrow/util/logging.h b/cpp/src/arrow/util/logging.h
index 40a51cb..39815f3 100644
--- a/cpp/src/arrow/util/logging.h
+++ b/cpp/src/arrow/util/logging.h
@@ -45,7 +45,7 @@ namespace arrow {
 #define ARROW_CHECK(condition)                           \
   (condition) ? 0                                        \
               : ::arrow::internal::FatalLog(ARROW_FATAL) \
-                    << __FILE__ << __LINE__ << " Check failed: " #condition
" "
+                    << __FILE__ << ":" << __LINE__ << " Check failed:
" #condition " "
 
 #ifdef NDEBUG
 #define ARROW_DFATAL ARROW_WARNING

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/plasma/client.cc
----------------------------------------------------------------------
diff --git a/cpp/src/plasma/client.cc b/cpp/src/plasma/client.cc
index 3f99fe0..e57a2a6 100644
--- a/cpp/src/plasma/client.cc
+++ b/cpp/src/plasma/client.cc
@@ -356,6 +356,7 @@ Status PlasmaClient::Contains(const ObjectID& object_id, bool* has_object)
{
     std::vector<uint8_t> buffer;
     RETURN_NOT_OK(PlasmaReceive(store_conn_, MessageType_PlasmaContainsReply, &buffer));
     ObjectID object_id2;
+    DCHECK_GT(buffer.size(), 0);
     RETURN_NOT_OK(
         ReadContainsReply(buffer.data(), buffer.size(), &object_id2, has_object));
   }

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/plasma/fling.cc
----------------------------------------------------------------------
diff --git a/cpp/src/plasma/fling.cc b/cpp/src/plasma/fling.cc
index 14db320..b84648b 100644
--- a/cpp/src/plasma/fling.cc
+++ b/cpp/src/plasma/fling.cc
@@ -37,6 +37,9 @@ int send_fd(int conn, int fd) {
   init_msg(&msg, &iov, buf, sizeof(buf));
 
   struct cmsghdr* header = CMSG_FIRSTHDR(&msg);
+  if (header == nullptr) {
+    return -1;
+  }
   header->cmsg_level = SOL_SOCKET;
   header->cmsg_type = SCM_RIGHTS;
   header->cmsg_len = CMSG_LEN(sizeof(int));

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/plasma/io.cc
----------------------------------------------------------------------
diff --git a/cpp/src/plasma/io.cc b/cpp/src/plasma/io.cc
index fc0010b..afe7053 100644
--- a/cpp/src/plasma/io.cc
+++ b/cpp/src/plasma/io.cc
@@ -224,6 +224,7 @@ uint8_t* read_message_async(int sock) {
   uint8_t* message = reinterpret_cast<uint8_t*>(malloc(size));
   s = ReadBytes(sock, message, size);
   if (!s.ok()) {
+    free(message);
     /* The other side has closed the socket. */
     ARROW_LOG(DEBUG) << "Socket has been closed, or some other error has occurred.";
     close(sock);

http://git-wip-us.apache.org/repos/asf/arrow/blob/811e668e/cpp/src/plasma/store.cc
----------------------------------------------------------------------
diff --git a/cpp/src/plasma/store.cc b/cpp/src/plasma/store.cc
index 72d199b..210cce1 100644
--- a/cpp/src/plasma/store.cc
+++ b/cpp/src/plasma/store.cc
@@ -665,7 +665,7 @@ class PlasmaStoreRunner {
   std::unique_ptr<PlasmaStore> store_;
 };
 
-static PlasmaStoreRunner* g_runner = nullptr;
+static std::unique_ptr<PlasmaStoreRunner> g_runner = nullptr;
 
 void HandleSignal(int signal) {
   if (signal == SIGTERM) {
@@ -683,10 +683,9 @@ void start_server(char* socket_name, int64_t system_memory, std::string
plasma_d
   // to a client that has already died, the store could die.
   signal(SIGPIPE, SIG_IGN);
 
-  PlasmaStoreRunner runner;
-  g_runner = &runner;
+  g_runner.reset(new PlasmaStoreRunner());
   signal(SIGTERM, HandleSignal);
-  runner.Start(socket_name, system_memory, plasma_directory, hugepages_enabled);
+  g_runner->Start(socket_name, system_memory, plasma_directory, hugepages_enabled);
 }
 
 }  // namespace plasma


Mime
View raw message