arrow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Todd Lipcon <t...@cloudera.com>
Subject Re: [C++] How careful do we want to be about exceptions?
Date Tue, 07 Jun 2016 12:03:34 GMT
Hi Micah,

You're right - in Kudu at least we don't make attempts to recover from
out-of-memory errors like std::bad_alloc. So, if 'new' or any of the STL
libraries throw such an error, we'll just let it propagate up the stack and
crash the server. In my experience, trying to recover from such errors is
fraught with difficulty -- often you'd want to allocate in the recovery
path, and short of implementing "emergency allocators" and such, things
just get too hard to reason about and test.

In the context of an embeddable *C* library I do think there's some value
in eventually catching std::bad_alloc and returning error codes like
ENOMEM. However, I'd personally vote that to be given very low priority in
the general queue of features to be implemented, since I imagine that most
of the big data applications that would embed Arrow are likely to follow
the same "crash on alloc failure" anyway.

Another thing to note is that in any cases where Arrow does allocate large
amounts of memory (eg when reading an on-disk file or user data) I do think
some form of memory limiting is advisable. For example, even though Kudu
crashes on OOM, we closely track our memory usage and start to reject user
write requests after crossing certain thresholds. Providing enough hooks
for embedders of Arrow to do similar is a good idea.

-Todd

On Tue, May 31, 2016 at 7:33 AM, Micah Kornfield <emkornfield@gmail.com>
wrote:

> We are following the google style guide which prohibits exceptions [1].
> Several places in our code we use std::vector and std::make_shared
> which can potentially throw std::bad_alloc exceptions that we don't
> currently wrap in try/catch blocks.  These classes/objects are used
> for constructing/manipulating metadata (all errors for allocations of
> data are checked and returned via Status objects).
>
> In general, it seems unlikely that these code paths would get
> triggered, but there is still a possibility.
>
> Does anybody have any thoughts on how thorough we want to be about
> avoiding these types of edge cases?
>
> My thought is to leave the code as is and accept the small chance that
> these get hit.  The main rationale is that if these types of calls do
> raise std::bad_alloc, it is highly likely that the program using the
> Arrow library is likely in an unstable state already (most of the
> allocations should be less then 100 bytes or so).  A second point is
> that a cursory search of Kudu [2] and Impala [3] code bases  (both of
> which also intend to follow the style guide) turn up some similar
> calls that could throw std::bad_alloc (it is possible there is some
> other magic happening here to avoid these).
>
> This isn't completely satisfying though, so I'm happy to be convinced
> otherwise.
>
> Thanks,
> -Micah
>
>
> [1] https://google.github.io/styleguide/cppguide.html#Exceptions
> [2]
> https://github.com/cloudera/kudu/blob/7f3691a826b9d27199319409f8d721ec1d08a8ba/src/kudu/consensus/log_reader.cc#L74
> [3]
> https://github.com/cloudera/Impala/blob/a36dcfc0322e213c06d6cf8d3f330c4b06739523/be/src/common/object-pool.h
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message