drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5512) Standardize error handling in ScanBatch
Date Sun, 21 May 2017 04:04:04 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018684#comment-16018684
] 

ASF GitHub Bot commented on DRILL-5512:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/838#discussion_r117622601
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java
---
    @@ -173,9 +174,8 @@ public IterOutcome next() {
     
             currentReader.allocate(mutator.fieldVectorMap());
           } catch (OutOfMemoryException e) {
    -        logger.debug("Caught Out of Memory Exception", e);
             clearFieldVectorMap();
    -        return IterOutcome.OUT_OF_MEMORY;
    +        throw UserException.memoryError(e).build(logger);
    --- End diff --
    
    You raise two good points.
    
    The first is that Drill tends to use unchecked exceptions because they are convenient,
which means Drill must be prepared to handle unchecked exceptions such as NPE, illegal state,
illegal argument and so on.
    
    But, normal Java practice is to declare exceptions so that callers know what they should
handle. DRILL-5386 asks to rationalize exception use; comments in that ticket from project
veterans show some degree of resistance to the idea of checked exceptions. So, yes, we must
expect any unchecked exception from any method. This is why operators should handle all exceptions,
and why we need code to sort out exceptions based on type.
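    As an aside, the type-based sorting mentioned above can be sketched in a few lines. This
is a hypothetical, self-contained illustration (the class and category names are mine, not
Drill's; Drill's own OutOfMemoryException would land in the memory bucket):

```java
// Hypothetical sketch of sorting unchecked exceptions by type, as an
// operator's catch-all handler might. Names are illustrative, not Drill's.
public class ExceptionClassifier {
  public static String classify(Throwable t) {
    if (t instanceof OutOfMemoryError) {
      return "MEMORY";        // memory pressure: fail the query cleanly
    } else if (t instanceof NullPointerException
        || t instanceof IllegalStateException
        || t instanceof IllegalArgumentException) {
      return "INTERNAL";      // coding errors surfaced as unchecked exceptions
    } else {
      return "EXECUTION";     // anything else: generic execution failure
    }
  }
}
```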
    
    The analysis of OOM is correct, but omits context. It is seldom (never?) the case that a sort sits
directly above a scan; most often there is a filter or project between them. If the
scan hits OOM, it is not likely because it has exhausted the memory limit on the scan operator:
each operator defaults to a 10 GB limit. Instead, it is likely that overall system memory is
exhausted. So, what is likely to happen? Each operator between the scan and sort must handle
the OUT_OF_MEMORY status by bubbling it up the call stack. Let's assume that works.
    
    The sort now wants to spill. But spilling is itself an activity that requires memory:
spilling requires a merge phase to combine the records from buffered batches in sort order
so that the spilled run is sorted. That is, the sort must allocate a batch, often many MB
in size. (Spilled runs must be sizable so we can limit the number of spilled runs merged in
the final merge phase.)
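    The arithmetic behind that parenthetical is simple: the larger each spilled run, the
fewer runs the final merge must combine. A hypothetical, self-contained illustration (the
class and numbers are mine, not Drill's):

```java
// Hypothetical sketch: fan-in of the final merge as a function of run size.
public class MergeFanIn {
  // Number of spilled runs the final merge must combine (ceiling division).
  public static long runsToMerge(long totalDataBytes, long runBytes) {
    return (totalDataBytes + runBytes - 1) / runBytes;
  }
}
```

    For example, 10 GB of data spilled in 100 MB runs leaves 100 runs for the final merge;
spill 1 MB runs instead and the final merge faces 10,000 of them.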
    
    So, the sort tries to allocate vectors for the merge batch and... The allocation fails.
Why? Because we are out of system memory -- that's why the scanner triggered an OOM.
    
    I can find no code that sets up this out-of-system-memory condition to verify that the
existing code works. I think we were taking it on faith that this behavior actually works.
    
    Moving forward, we are working towards a workable solution. Assign the scan some amount
of memory, and limit batches to fit within that memory. Give the sort a certain amount of
memory, and have it manage within that memory so that when a spill occurs, the sort has sufficient
memory to create the required merge batches as part of the spill.
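    A back-of-the-envelope sketch of that budgeting idea (the method and numbers are
hypothetical illustrations, not Drill's actual configuration):

```java
// Hypothetical sketch of sizing batches to fit a fixed per-operator budget.
// Value vectors grow by doubling, so we pick a power-of-two row count.
public class BatchBudget {
  public static int rowsPerBatch(long budgetBytes, int estRowWidthBytes) {
    int rows = 1;
    // Stop once the next doubling would exceed the budget; cap at 64K rows,
    // a typical maximum batch row count.
    while ((long) rows * 2 * estRowWidthBytes <= budgetBytes && rows < 65536) {
      rows *= 2;
    }
    return rows;
  }
}
```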
    
    Finally, let's consider the case you outlined: the scan fails with OOM on the initial
allocation. The initial allocation is often small; the scan goes through many vector doublings
to read the full complement of rows. (At least, that's what I've observed empirically; perhaps
the original intent was different.) Let's say we tried to recover from an initial allocation
failure.
    
    Say we have a row with five columns. We allocate three, but fail on the fourth. Say the
fourth is a Varchar, which has two vectors: offsets and data. The current logic releases the partially-allocated
vectors, which is good. OUT_OF_MEMORY is returned, and the vectors are reallocated on retry
if memory could be freed in the meantime. Sounds good.
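    That release-on-partial-failure pattern can be sketched as follows (hypothetical and
self-contained; Allocation stands in for a value vector, and null stands in for the
OUT_OF_MEMORY return):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of releasing partially-allocated vectors on failure.
public class PartialAllocation {
  static class Allocation {
    final int bytes;
    boolean released = false;
    Allocation(int bytes) { this.bytes = bytes; }
    void release() { released = true; }
  }

  // Allocate one "vector" per column; on failure, release what was
  // already allocated and return null (analogous to OUT_OF_MEMORY).
  public static List<Allocation> allocateAll(int[] columnBytes, int budget) {
    List<Allocation> done = new ArrayList<>();
    int used = 0;
    for (int bytes : columnBytes) {
      if (used + bytes > budget) {   // simulated out-of-memory on this column
        for (Allocation a : done) {
          a.release();               // roll back the partial allocations
        }
        return null;
      }
      done.add(new Allocation(bytes));
      used += bytes;
    }
    return done;
  }
}
```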
    
    But, most queries run in multiple threads. If one hits OOM, then the others probably will
as well. The actual system memory is a shared resource, but there is no coordination. A scan
might release its partially-allocated vectors so the sort can, in theory, spill. But, that
memory might immediately be grabbed by some other thread, resulting in a sort spill OOM. In
practice, however, the initial vector allocations are much smaller than the merge batch, so
it is only slightly useful to free up the initial allocation. That initial allocation, plus
luck that some other thread has freed enough memory, might allow us to spill. But it is a
crap-shoot.
    
    In short, even if this logic might work in some scenarios in a single-threaded
query, it is too chaotic to work in general with many threads. And, of course, no tests exist
for either case, so we are just guessing.
    
    All-in-all, the above argues strongly that the path forward is to:
    
    1. Rationalize error handling: OOM errors cause query failure.
    2. Design a memory assignment system so that operators live within a budget.
    3. Design tests to ensure that the system works rather than relying on hope.
    
    This checkin is a step toward goal 1. The external sort revision, hash agg spilling and
other projects are steps toward goal 2. We continue to chip away at our ability to do goal
3.
    
    Given all of this, can you suggest how we could gather evidence that the current OUT_OF_MEMORY
status is actually working in any actual queries? Or do we have the tough case of comparing
concrete changes against an aspiration for how the system might work?
    
    More practically, with the change, OOM will fail the query. Previously, there was some
chance that Drill might recover from an OOM. But we have no tests and no statistics. Is it
worth risking that hypothetical for a concrete step in the right direction? I don't think
we have the answer, and that, itself, is a problem. Thoughts?


> Standardize error handling in ScanBatch
> ---------------------------------------
>
>                 Key: DRILL-5512
>                 URL: https://issues.apache.org/jira/browse/DRILL-5512
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>              Labels: ready-to-commit
>             Fix For: 1.10.0
>
>
> ScanBatch is the Drill operator executor that handles most readers. Like most Drill operators,
it uses an ad-hoc set of error detection and reporting methods that evolved over Drill development.
> This ticket asks to standardize on error handling as outlined in DRILL-5083. This basically
means reporting all errors as a {{UserException}} rather than using the {{IterOutcome.STOP}}
return status or using the {{FragmentContext.fail()}} method.
> This work requires the new error codes introduced in DRILL-5511, and is a step toward
making readers aware of vector size limits.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
