carbondata-issues mailing list archives

From "dhatchayani (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CARBONDATA-1896) Clean files operation improvement
Date Thu, 14 Dec 2017 15:25:00 GMT

     [ https://issues.apache.org/jira/browse/CARBONDATA-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhatchayani updated CARBONDATA-1896:
------------------------------------
    Description: 
+*Problem:*+
When a session is brought up, the clean operation marks every INSERT_OVERWRITE_IN_PROGRESS
or INSERT_IN_PROGRESS segment as MARKED_FOR_DELETE in the tablestatus file. This clean operation
does not consider other parallel sessions. If another session's data load is IN_PROGRESS
when a session is brought up, the running load is also changed to MARKED_FOR_DELETE,
irrespective of its actual load status. Cleaning stale segments during session bring-up
also increases the time it takes to bring up a session.


+*Solution:*+
SEGMENT_LOCK should be taken on the new segment while loading.
While cleaning segments, both the tablestatus file and SEGMENT_LOCK should be considered.
Cleaning stale files while bringing up the session should be removed; instead, this should be
done manually on the needed tables through the already existing CLEAN FILES DDL.
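
The proposed rule can be illustrated with a minimal in-memory sketch. It is not CarbonData code: the class, its methods, the set standing in for SEGMENT_LOCK, and the map standing in for the tablestatus file are all hypothetical, and it assumes that a crashed load no longer holds its segment lock. Under those assumptions, a clean pass marks an in-progress segment for deletion only when no live load holds that segment's lock.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only; none of these names are CarbonData APIs. */
public class LockAwareCleanSketch {

    /** Statuses named in the issue description, plus SUCCESS for contrast. */
    public enum Status {
        INSERT_IN_PROGRESS, INSERT_OVERWRITE_IN_PROGRESS, SUCCESS, MARKED_FOR_DELETE
    }

    /** Hypothetical stand-in for SEGMENT_LOCK: ids of segments whose load holds the lock. */
    private final Set<String> lockedSegments = ConcurrentHashMap.newKeySet();

    /** Hypothetical stand-in for the tablestatus file: segment id -> status. */
    private final Map<String, Status> tableStatus = new LinkedHashMap<>();

    /** A load takes the segment lock first, then records the in-progress entry. */
    public synchronized boolean startLoad(String segmentId) {
        if (!lockedSegments.add(segmentId)) {
            return false; // another load already holds this segment's lock
        }
        tableStatus.put(segmentId, Status.INSERT_IN_PROGRESS);
        return true;
    }

    /** A successful load updates the status and releases the lock. */
    public synchronized void finishLoad(String segmentId) {
        tableStatus.put(segmentId, Status.SUCCESS);
        lockedSegments.remove(segmentId);
    }

    /** Simulates a crashed load (assumption: the lock is gone, the status entry is stale). */
    public synchronized void crashLoad(String segmentId) {
        lockedSegments.remove(segmentId);
    }

    /** Clean pass: consult both the status and the lock before marking a segment. */
    public synchronized int cleanStaleSegments() {
        int marked = 0;
        for (Map.Entry<String, Status> entry : tableStatus.entrySet()) {
            Status status = entry.getValue();
            boolean inProgress = status == Status.INSERT_IN_PROGRESS
                    || status == Status.INSERT_OVERWRITE_IN_PROGRESS;
            // Skip segments whose lock is still held: a parallel session is loading them.
            if (inProgress && !lockedSegments.contains(entry.getKey())) {
                entry.setValue(Status.MARKED_FOR_DELETE);
                marked++;
            }
        }
        return marked;
    }

    public synchronized Status statusOf(String segmentId) {
        return tableStatus.get(segmentId);
    }

    public static void main(String[] args) {
        LockAwareCleanSketch table = new LockAwareCleanSketch();
        table.startLoad("0");   // load that will crash
        table.startLoad("1");   // load still running in a parallel session
        table.crashLoad("0");
        table.cleanStaleSegments();
        System.out.println("segment 0: " + table.statusOf("0"));
        System.out.println("segment 1: " + table.statusOf("1"));
    }
}
```

In this sketch the clean pass is an explicit call rather than a side effect of session start-up, which mirrors the proposal to rely on the existing CLEAN FILES DDL for the tables that need it.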

*Impact analysis on the solution will be updated soon.*

  was:
+*Problem:*+
When bringing up the session, clean operation is handled in a way to mark all the INSERT_OVERWRITE_IN_PROGRESS
or INSERT_IN_PROGRESS segments to MARKED_FOR_DELETE in tablestatus file. This clean operation
is not considering the other parallel sessions. If any other session's data load is IN_PROGRESS
at the time of bringing up one session, then the executing load also will be changed to MARKED_FOR_DELETE
irrespective of the actual load status. Handling stale segments cleaning while session bring
up also increases the time of bringing up a session.


+*Solution:*+
SEGMENT_LOCK should be taken on the new segment while loading.
While cleaning segments tablestatus file and SEGMENT_LOCK should be considered.
Cleaning stale files while bringing up the session should be removed and this should be manually
done on the needed tables through already exists CLEAN FILES DDL.

*Impact analysis on the solution will be updated soon.*

> Clean files operation improvement
> ---------------------------------
>
>                 Key: CARBONDATA-1896
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1896
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: dhatchayani
>            Assignee: dhatchayani
>
> +*Problem:*+
> When a session is brought up, the clean operation marks every INSERT_OVERWRITE_IN_PROGRESS
> or INSERT_IN_PROGRESS segment as MARKED_FOR_DELETE in the tablestatus file. This clean operation
> does not consider other parallel sessions. If another session's data load is IN_PROGRESS
> when a session is brought up, the running load is also changed to MARKED_FOR_DELETE,
> irrespective of its actual load status. Cleaning stale segments during session bring-up
> also increases the time it takes to bring up a session.
> +*Solution:*+
> SEGMENT_LOCK should be taken on the new segment while loading.
> While cleaning segments, both the tablestatus file and SEGMENT_LOCK should be considered.
> Cleaning stale files while bringing up the session should be removed; instead, this should
> be done manually on the needed tables through the already existing CLEAN FILES DDL.
> *Impact analysis on the solution will be updated soon.*



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
