drill-dev mailing list archives

From minji-kim <...@git.apache.org>
Subject [GitHub] drill pull request: DRILL-4411: hash join should limit batch based...
Date Thu, 18 Feb 2016 17:39:55 GMT
GitHub user minji-kim opened a pull request:


    DRILL-4411: hash join should limit batch based on size and number of records

    Right now, hash joins can run out of memory when records are large, because the batch is
    limited only by record count (4000 records). This patch implements a simple heuristic: if
    the output allocator grows beyond 10 MB before 4000 records have been output (say, after
    2000), the batch size limit is set to 2000 for future batches.
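
    The heuristic described above can be sketched roughly as follows. This is only an
    illustration under stated assumptions, not the code in the patch: the class name
    AdaptiveBatchLimit, the method shouldFlush, and the reading of "10 MB" as 10 * 1024 * 1024
    bytes are all hypothetical.

```java
// Hypothetical sketch of the heuristic; names are illustrative and not taken from the Drill patch.
final class AdaptiveBatchLimit {
    // Record cap mentioned in the description.
    static final int DEFAULT_MAX_RECORDS = 4000;
    // "10 MB" from the description, assumed here to mean 10 * 1024 * 1024 bytes.
    static final long MEMORY_THRESHOLD_BYTES = 10L * 1024 * 1024;

    // Current record cap; starts at the default and only ever shrinks.
    private int maxRecords = DEFAULT_MAX_RECORDS;

    /** Record-count limit that applies to the current and future batches. */
    int maxRecords() {
        return maxRecords;
    }

    /**
     * Intended to be called after each record is written to the output batch.
     *
     * @param recordsInBatch records written so far in the current batch
     * @param allocatedBytes memory currently held by the output allocator
     * @return true if the current batch should be closed now
     */
    boolean shouldFlush(int recordsInBatch, long allocatedBytes) {
        if (allocatedBytes > MEMORY_THRESHOLD_BYTES && recordsInBatch < maxRecords) {
            // Memory threshold crossed before the record cap was reached:
            // lower the cap to the current count for this and future batches.
            maxRecords = Math.max(1, recordsInBatch);
        }
        return recordsInBatch >= maxRecords;
    }
}
```

    In this sketch the cap never grows back, so a single run of unusually large records
    keeps later batches small as well.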

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/minji-kim/drill DRILL-4411

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #381

commit 2e3b1c75273e1b87679d79bdc4f3877b72603e3c
Author: Minji Kim <minji@dremio.com>
Date:   2016-02-18T17:05:51Z

    DRILL-4411: hash join should limit batch based on size as well as number of records


If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
