carbondata-issues mailing list archives

From watermen <>
Subject [GitHub] carbondata pull request #978: Cover the case when last page is not be consum...
Date Wed, 31 May 2017 09:40:13 GMT
GitHub user watermen opened a pull request:

    Cover the case when last page is not be consumed at the end

    First, the write step uses a producer-consumer model with n producers (n defaults to 2
and is configurable) and one consumer. The task that generates the last page (less than
32000) is submitted to the thread pool last, but because n tasks run concurrently, it is
not guaranteed to finish last or to be added to BlockletDataHolder last.
    Second, `writeDataToFile` is invoked in two cases: when the size of `DataWriterHolder`
reaches the blocklet size, and when the page is the last page.
    So if the last page is not the last one consumed, we lose any page that is consumed
after it.
    This PR adds a flag named isLastPageWrited to make sure every page is written.
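
    The ordering problem can be sketched with a small single-threaded simulation. This is
only an illustration under assumptions, not the code in this PR: the class `LastPageSketch`,
the method `consume`, the `useFlag` switch and the blocklet size of 3 pages are all
hypothetical, pages are modelled as plain integers, and the way the flag is used here is one
possible reading of the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the consumer side; not CarbonData's actual classes.
class LastPageSketch {
    static final int BLOCKLET_SIZE = 3; // pages per blocklet, chosen only for illustration

    // Returns the pages that reach the "file" when pages arrive in the given order.
    // A page is modelled as an int id; lastPageId marks the short final page.
    static List<Integer> consume(List<Integer> arrivalOrder, int lastPageId, boolean useFlag) {
        List<Integer> holder = new ArrayList<>();  // stands in for DataWriterHolder
        List<Integer> written = new ArrayList<>(); // stands in for the data file
        boolean isLastPageWrited = false;          // flag name taken from the PR text

        for (int page : arrivalOrder) {
            holder.add(page);
            boolean isLast = page == lastPageId;
            // Trigger 1: the holder reached the blocklet size.
            // Trigger 2: this page is the last page.
            // Assumed fix: once the last page was written, flush every later page too.
            if (holder.size() == BLOCKLET_SIZE || isLast || (useFlag && isLastPageWrited)) {
                written.addAll(holder); // stands in for writeDataToFile
                holder.clear();
            }
            if (isLast) {
                isLastPageWrited = true;
            }
        }
        return written; // anything still in holder was never written
    }

    public static void main(String[] args) {
        // Five pages; page 4 is the last page but is consumed before page 3.
        List<Integer> order = Arrays.asList(0, 1, 2, 4, 3);
        System.out.println("without flag: " + consume(order, 4, false));
        System.out.println("with flag:    " + consume(order, 4, true));
    }
}
```

    In this sketch, page 3 arrives after the last page, so without the flag it stays in the
holder and never reaches the file; with the flag it is written as well.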

You can merge this pull request into a Git repository by running:

    $ git pull CARBONDATA-1109

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #978

commit 6e34389cbb011734078d8c2431065d1f04fc891f
Author: Yadong Qi <>
Date:   2017-05-31T09:35:45Z

    Cover the case when last page is not be consumed at the end.


