drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jacques Nadeau <jacq...@apache.org>
Subject Re: [DISCUSSION] should BufferAllocator.buffer() throw an OutOfMemoryException ?
Date Tue, 30 Jun 2015 22:19:30 GMT
Lots of different comments on this thread.  Here are few of my thoughts:

I think you need to differentiate the different tiers of interfaces a
little bit more.

BufferAllocator.buffer() was meant to be a low level advanced interface.
It should be used rarely and when done so, only with extreme care.  Most
devs shouldn't use it.  Having an advanced interface seems fine here.  This
was how the interface was envisioned.   It looks like there are currently
16 classes that reference this code not including vectors.  (There probably
should be less, it looks like some code duplication contributes to this.)

All the vector classes also access this code.  Vectors have two methods:
allocateNew and allocateNewSafe.  allocateNew has an exception behavior and
allocateNewSafe has a return value behavior.  These are main APIs that are
used extensively throughout the code and the ones we should be most focused
on.  I like giving the developer the choice of behavior (thus the two
options).  It sounds like others are saying to remove this choice (because
code is not managing it correctly).

If there are bugs in the 16 classes on how they use the advanced interface,
this doesn't seem like a big challenge to correct rather than bulk
modifying the API.  If people are using the wrong method on the vector
classes (or the vector classes aren't correctly behaving), that doesn't
seem like something we should address at the allocator API level.

Hakim, can you be more specific about your concern?  It doesn't seem like
we're talking about thousands of places here.

Regarding checked versus unchecked...  I'm strongly in support of using
checked exceptions for most things.  However, I'm against it for an out of
memory condition.  (Note how the JVM also treats this as an unchecked
exception.) It causes early catching when, in most cases, we really want to
bubble way up the stack and fail the query.  I know of very few places
where we can actually locally manage an out of memory condition so why
write huge amounts of code the bubble the exception all the way to where we
can actually handle it.

A quick sidebar to Jason's "Hash agg out of memory" comment
Two thoughts here.  I'm not sure this is a meaningful message.  Where we
run out of memory may have nothing to do with where we are using most
memory.  And if we did to decide this was useful, this can be done just as
easily with unchecked exceptions as checked exception.  What I'm want to
avoid is adding throws OutOfMemoryException in 1000 places.  The reality is
everybody should assume that an OutOfMemory can occur pretty much anytime.

TL;DR

In general, I think this discussion is too abstract and thus has the danger
of trending towards a religious discussion (I already see a bit of that
fervor in some of the responses.).  The number of places we are talking
about are not that large and are easy to find.  Let's solve this by looking
at specifics.  For example, if people think checked would be better,
someone should go through and make an example changeset for everyone to
review.  My guess is, the only that it works will be if very quickly, the
handling code converts the checked exception into an unchecked
UserException.  But I'm more than happy to be proven wrong.

On Tue, Jun 30, 2015 at 2:16 PM, Daniel Barclay <dbarclay@maprtech.com>
wrote:

> Hanifi GUNES wrote:
>
>> - We would end up having to add it to almost everything everywhere
>> Why would one propagate the checked exception for no reason? And why would
>> one not propagate the exception for a good reason like robustness? I agree
>> that one has to avoid propagating the checked exception for no reason
>> however I support propagating it for a good reason.
>>
>> The added benefit of raising a checked exception is reminding as well as
>> enforcing devs to handle it and be more cautious about this particular
>> event. I find this compile-time safety check invaluable for robustness.
>>
>
> +(1 times <some large number>)
>
> Daniel
>
>
>
>>
>> - Or constantly wrapping it with RuntimeException to get around that
>> If it has to be done, I would recommend relying on a helper to do so.
>> There
>> is not much of man-work involved here.
>>
>>
>> 2015-06-30 13:53 GMT-07:00 Abdel Hakim Deneche <adeneche@maprtech.com>:
>>
>>  +1 to Hanifi's
>>>
>>> On Tue, Jun 30, 2015 at 1:38 PM, Jason Altekruse <
>>> altekrusejason@gmail.com
>>>
>>>>
>>>>  wrote:
>>>
>>>  +1 to Hanifi's comments, I think it makes much more sense to have a
>>>>
>>> number
>>>
>>>> of sites where the operators are explicitly catching a checked OOM
>>>> exception and either decide to handle it or produce a message like "Hash
>>>> Aggregate does not support our of memory conditions". This would be
>>>> particularly useful for debugging queries, as the user exception can
>>>> provide context information about the current operation. This way users
>>>>
>>> can
>>>
>>>> have some idea about the part of their query that might be causing an
>>>> excessive strain on system resources. I understand that there are also
>>>> cases where operators competing for memory can make it a toss up to
>>>> which
>>>> will actually fail, but this would at least be a step to give more
>>>>
>>> detailed
>>>
>>>> information to users.
>>>>
>>>> On Tue, Jun 30, 2015 at 1:28 PM, Hanifi GUNES <hg@apache.org> wrote:
>>>>
>>>>  I would propose throwing a checked exception encouraging explicit and
>>>>> consistent handling of this event. Each sub-system has liberty to
>>>>>
>>>> decide
>>>
>>>> if
>>>>
>>>>> an OOM failure is fatal or non-fatal depending on its capabilities.
>>>>>
>>>> Also
>>>
>>>> if
>>>>
>>>>> at some point a sub-system needs to communicate with its callers via
a
>>>>> different mechanism such like using flags (boolean, enum etc) or
>>>>>
>>>> raising
>>>
>>>> an
>>>>
>>>>> unchecked exception that's still fine, just handle the exception. If
>>>>>
>>>> there
>>>>
>>>>> is a need to suppress the checked exception that's fine too, just use
a
>>>>> helper method.
>>>>>
>>>>> Either way, returning *null* sounds problematic in many ways i) it is
>>>>> implicit ii) unsafe iii) its handling logic is repetitive iv) it is
>>>>> semantically unclean to make null mean something - even worse something
>>>>> context specific.
>>>>>
>>>>>
>>>>> -Hanifi
>>>>>
>>>>> 2015-06-30 12:23 GMT-07:00 Abdel Hakim Deneche <adeneche@maprtech.com
>>>>>
>>>> :
>>>>
>>>>>
>>>>>  I guess that would fix the issue too. But we may still run into
>>>>>>
>>>>> situations
>>>>>
>>>>>> where the caller will pass a flag to "mute" the exception and not
>>>>>>
>>>>> handle
>>>>
>>>>> the case anyway.
>>>>>>
>>>>>> If .buffer() unconditionally throws an exception, can't the caller,
>>>>>>
>>>>> who
>>>
>>>> wants to, just catch that and handle it properly ?
>>>>>>
>>>>>> On Tue, Jun 30, 2015 at 12:13 PM, Chris Westin <
>>>>>>
>>>>> chriswestin42@gmail.com>
>>>>
>>>>> wrote:
>>>>>>
>>>>>>  No, but we should do something close to that.
>>>>>>>
>>>>>>> There are cases where the caller can handle the inability to
get
>>>>>>>
>>>>>> more
>>>
>>>> memory, and may be able to go to disk. However, you are correct
>>>>>>>
>>>>>> that
>>>
>>>> there
>>>>>>
>>>>>>> are many that can't handle an OOM, and that fail to check.
>>>>>>>
>>>>>>> Instead of unconditionally throwing OutOfMemoryRuntimeException,
I
>>>>>>>
>>>>>> would
>>>>>
>>>>>> suggest that the buffer() call take a flag that indicates whether
>>>>>>>
>>>>>> or
>>>
>>>> not
>>>>>
>>>>>> it
>>>>>>
>>>>>>> should throw if it is unable to fulfill the request. This way,
the
>>>>>>>
>>>>>> call
>>>>
>>>>> sites that can handle an OOM can pass in the flag to return null,
>>>>>>>
>>>>>> and
>>>
>>>> the
>>>>>
>>>>>> rest can pass in the flag value to throw, and not have to have any
>>>>>>>
>>>>>> checking
>>>>>>
>>>>>>> code.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jun 30, 2015 at 12:06 PM, Abdel Hakim Deneche <
>>>>>>> adeneche@maprtech.com
>>>>>>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>
>>>>>>>  our current implementations of BufferAllocator.buffer(int, int)
>>>>>>>>
>>>>>>> returns
>>>>>
>>>>>> null when it cannot allocate the buffer.
>>>>>>>>
>>>>>>>> But looking through the code, there are many places that
don't
>>>>>>>>
>>>>>>> check
>>>>
>>>>> if
>>>>>
>>>>>> the
>>>>>>>
>>>>>>>> allocated buffer is null before trying to access it which
will
>>>>>>>>
>>>>>>> throw
>>>>
>>>>> a
>>>>>
>>>>>> NullPointerException.
>>>>>>>>
>>>>>>>> ValueVectors' allocateNewSafe() seem to be the only place
that
>>>>>>>>
>>>>>>> handle
>>>>
>>>>> the
>>>>>>
>>>>>>> null in a specific manner.
>>>>>>>>
>>>>>>>> Should we update the allocators' implementation to throw
an
>>>>>>>> OutOfMemoryRuntimeException instead of returning null ? this
has
>>>>>>>>
>>>>>>> the
>>>>
>>>>> added
>>>>>>>
>>>>>>>> benefit of displaying a proper out of memory error message
to the
>>>>>>>>
>>>>>>> user.
>>>>>
>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Abdelhakim Deneche
>>>>>>>>
>>>>>>>> Software Engineer
>>>>>>>>
>>>>>>>>    <http://www.mapr.com/>
>>>>>>>>
>>>>>>>>
>>>>>>>> Now Available - Free Hadoop On-Demand Training
>>>>>>>> <
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
>>>
>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Abdelhakim Deneche
>>>>>>
>>>>>> Software Engineer
>>>>>>
>>>>>>    <http://www.mapr.com/>
>>>>>>
>>>>>>
>>>>>> Now Available - Free Hadoop On-Demand Training
>>>>>> <
>>>>>>
>>>>>>
>>>>>
>>>>
>>> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
>>>
>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Abdelhakim Deneche
>>>
>>> Software Engineer
>>>
>>>    <http://www.mapr.com/>
>>>
>>>
>>> Now Available - Free Hadoop On-Demand Training
>>> <
>>>
>>> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
>>>
>>>>
>>>>
>>>
>>
>
> --
> Daniel Barclay
> MapR Technologies
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message