drill-dev mailing list archives

From vdiravka <...@git.apache.org>
Subject [GitHub] drill issue #846: DRILL-5544: Out of heap running CTAS against text delimite...
Date Fri, 09 Jun 2017 20:02:09 GMT
Github user vdiravka commented on the issue:

    @paul-rogers As I mentioned in my previous comment, the page size can't greatly exceed
1 MB (the default value of the page-size option in Drill). I checked this: almost every time, the
page size is much less than 1 MB.
    The buffered data consists of all the pages within one row group. When the buffered data
exceeds the block-size, the row group is written to disk and flushed from the
stream buffer.
    Which is what the current code does.
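    The buffering behaviour described above can be sketched roughly as follows. This is only an
illustration under stated assumptions: `RowGroupBufferSketch`, `addPage` and `flushRowGroup` are
hypothetical names, not the actual Drill or Parquet writer classes, and the 4 MB block size in
`main` is an arbitrary value chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of row-group buffering; not Drill's or Parquet's real writer.
class RowGroupBufferSketch {
    private final long blockSize;                         // row-group threshold in bytes
    private final List<byte[]> pages = new ArrayList<>(); // pages of the current row group
    private long bufferedBytes = 0;
    private int flushedRowGroups = 0;

    RowGroupBufferSketch(long blockSize) {
        this.blockSize = blockSize;
    }

    // Buffer one page; once the buffered data exceeds the block size,
    // flush the whole row group.
    void addPage(byte[] page) {
        pages.add(page);
        bufferedBytes += page.length;
        if (bufferedBytes > blockSize) {
            flushRowGroup();
        }
    }

    // A real writer would write the buffered pages to disk here;
    // the sketch only releases them and counts the flush.
    void flushRowGroup() {
        pages.clear();
        bufferedBytes = 0;
        flushedRowGroups++;
    }

    long bufferedBytes() {
        return bufferedBytes;
    }

    int flushedRowGroups() {
        return flushedRowGroups;
    }

    public static void main(String[] args) {
        long blockSize = 4L * 1024 * 1024;    // assumed block size for the example
        byte[] page = new byte[1024 * 1024];  // one page of about 1 MB
        RowGroupBufferSketch writer = new RowGroupBufferSketch(blockSize);
        for (int i = 0; i < 10; i++) {
            writer.addPage(page);
        }
        System.out.println("flushed row groups: " + writer.flushedRowGroups());
        System.out.println("bytes still buffered: " + writer.bufferedBytes());
    }
}
```

    In this sketch the buffered data never grows beyond the block size plus one page, which is
why a bounded page size keeps the writer's buffer bounded as well.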
    I compared creating large Parquet tables with the current Drill master version and with the
version of Drill that includes my fix, and got the same performance. The Drill tests also take
the same time to pass.
    The branch is rebased onto the latest Drill master version.

