carbondata-issues mailing list archives

From dhatchayani <...@git.apache.org>
Subject [GitHub] carbondata pull request #1702: [CARBONDATA-1896] Clean files operation impro...
Date Wed, 27 Dec 2017 04:24:21 GMT
GitHub user dhatchayani reopened a pull request:

    https://github.com/apache/carbondata/pull/1702

    [CARBONDATA-1896] Clean files operation improvement

    **Problem:**
    When a session is brought up, the clean operation marks every INSERT_OVERWRITE_IN_PROGRESS
    or INSERT_IN_PROGRESS segment as MARKED_FOR_DELETE in the tablestatus file. This clean
    operation does not take other parallel sessions into account. If another session's data
    load is IN_PROGRESS when a new session is brought up, the running load is also changed to
    MARKED_FOR_DELETE, irrespective of its actual load status. Cleaning stale segments during
    session bring-up also increases the time it takes to bring up a session.
    
    **Solution:**
    A SEGMENT_LOCK should be taken on the new segment while it is being loaded.
    When cleaning segments, both the tablestatus file and the SEGMENT_LOCK should be considered.
    Cleaning stale files while bringing up the session should be removed. It can instead be
    done manually on the tables that need it through the existing CLEAN FILES DDL, or the next
    load on the table will clean them.
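
The lock-aware cleaning rule described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the code in this pull request: the `Segment` class, the `clean_stale_segments` function, and the use of an in-memory `threading.Lock` as a stand-in for the SEGMENT_LOCK are all hypothetical, and the `SUCCESS` status is used purely for illustration.

```python
import threading
from dataclasses import dataclass, field

# Statuses taken from the description above; everything else here is hypothetical.
IN_PROGRESS_STATUSES = {"INSERT_IN_PROGRESS", "INSERT_OVERWRITE_IN_PROGRESS"}


@dataclass
class Segment:
    """Hypothetical in-memory stand-in for one tablestatus entry."""
    segment_id: str
    status: str
    # Assumption: a threading.Lock models the per-segment SEGMENT_LOCK.
    lock: threading.Lock = field(default_factory=threading.Lock)


def clean_stale_segments(table_status):
    """Mark in-progress segments MARKED_FOR_DELETE only if their lock is free."""
    cleaned = []
    for segment in table_status:
        if segment.status not in IN_PROGRESS_STATUSES:
            continue
        # Non-blocking attempt: if it fails, a running load still owns the segment.
        if segment.lock.acquire(blocking=False):
            try:
                segment.status = "MARKED_FOR_DELETE"
                cleaned.append(segment.segment_id)
            finally:
                segment.lock.release()
    return cleaned


live = Segment("0", "INSERT_IN_PROGRESS")
stale = Segment("1", "INSERT_IN_PROGRESS")
done = Segment("2", "SUCCESS")
live.lock.acquire()  # simulate a parallel session that is still loading segment 0
print(clean_stale_segments([live, stale, done]))  # prints ['1']
```

In this sketch the segment whose lock is still held keeps its in-progress status, while the segment with no lock holder is treated as stale, which is the distinction the original session-startup cleaning did not make.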
    
    
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [x] Testing done
            Manual Testing
     - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dhatchayani/incubator-carbondata clean_files

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1702.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1702
    
----
commit 4573f5fbcc7d0414323513e8746f9050f9eb1e78
Author: dhatchayani <dhatcha.official@...>
Date:   2017-12-20T17:05:31Z

    [CARBONDATA-1896] Clean files operation improvement

----


---
