drill-dev mailing list archives

From paul-rogers <...@git.apache.org>
Subject [GitHub] drill issue #958: DRILL-5808: Reduce memory allocator strictness for "manage...
Date Sun, 01 Oct 2017 01:33:23 GMT
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/958
  
    @Ben-Zvi, you are right that, in the worst case, this change will allow operators to exceed
their memory allotment. But that is actually the purpose.
    
    As we know, it is *very* difficult to get memory management just right at present due
to the wildly varying memory layouts for vectors, power-of-two rounding of buffer sizes, unexpected
doubling of vectors, and lack of control over the size of incoming batches. We'd love to fix
these, but doing so will take time.
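
    To make the power-of-two rounding concrete, here is a minimal sketch. The class and method names are hypothetical and this is not Drill's allocator code; it only assumes that a buffer request is rounded up to the next power of two.

```java
/** Hypothetical illustration of power-of-two buffer rounding; not Drill code. */
class BufferRounding {
    /** Returns the smallest power of two that is >= requested (for requested >= 1). */
    static long roundUpToPowerOfTwo(long requested) {
        if (requested <= 1) {
            return 1;
        }
        // highestOneBit(requested - 1) is the largest power of two strictly
        // below `requested` (or half of it when `requested` is itself a power
        // of two); doubling it gives the rounded-up size.
        return Long.highestOneBit(requested - 1) << 1;
    }
}
```

    Under that assumption, an operator that budgets 33 KiB (33,792 bytes) for a vector would actually consume `BufferRounding.roundUpToPowerOfTwo(33 * 1024)` = 65,536 bytes, nearly double its estimate.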
    
    In the meantime, we have the choice of failing queries because the calculations are off by a
bit, or being more flexible and letting queries succeed at the risk of running out of memory.
The change here does log each "excess" allocation so we can find them and fix any remaining
issues. Also, in a test environment, strict limits are enforced to find bugs.
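
    The policy described above can be sketched roughly as follows. This is an illustration only: `LenientBudget`, its fields, and the in-memory log are hypothetical names, not the allocator API touched by this PR, and the sketch does not model any upper bound on how large an excess may be.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-operator memory budget; not Drill's actual allocator. */
class LenientBudget {
    private final long limit;      // operator's memory allotment, in bytes
    private final boolean strict;  // assumed to be true in a test environment
    private long allocated;        // bytes currently reserved
    private final List<String> excessLog = new ArrayList<>();

    LenientBudget(long limit, boolean strict) {
        this.limit = limit;
        this.strict = strict;
    }

    /** Reserves bytes. Going over the limit fails only in strict mode. */
    void allocate(long bytes) {
        long newTotal = allocated + bytes;
        if (newTotal > limit) {
            String msg = "Excess allocation: total " + newTotal + " > limit " + limit;
            if (strict) {
                // Test environment: surface the miscalculation as a failure.
                throw new IllegalStateException(msg);
            }
            // Production: record the excess so it can be found and fixed later.
            excessLog.add(msg);
        }
        allocated = newTotal;
    }

    long allocated() {
        return allocated;
    }

    List<String> excessLog() {
        return excessLog;
    }
}
```

    With `new LenientBudget(100, false)`, allocating 80 and then 40 bytes succeeds and leaves one entry in the log; with `strict` set to `true`, the second call throws instead.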
    
    All of this is set against the backdrop of the exchange operators, hash join, and other
operators that have an unlimited appetite for memory. Until we rein in those operators, it seems
silly to kill user queries because those operators that *do* manage memory make a small mistake
here or there.
    
    Once all operators are under control, and Drill's internal memory allocation is under
better control, we can back out this change and be much more strict about enforcing memory
limits.
    
    Bottom line: should we fail user queries because of remaining rough spots in the "managed"
operators? Or, should we allow user queries to succeed at a very small additional risk of
running out of memory?

