carbondata-issues mailing list archives

From "kumar vishal (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CARBONDATA-1224) Going out of memory if more segments are compacted at once in V3 format
Date Tue, 30 Jan 2018 16:44:00 GMT

     [ https://issues.apache.org/jira/browse/CARBONDATA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kumar vishal resolved CARBONDATA-1224.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0

> Going out of memory if more segments are compacted at once in V3 format
> -----------------------------------------------------------------------
>
>                 Key: CARBONDATA-1224
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1224
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Ravindra Pesala
>            Assignee: Ravindra Pesala
>            Priority: Major
>             Fix For: 1.3.0
>
>          Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> In V3 format we read the whole blocklet into memory at once in order to save IO time. But this turns out to be costlier when many carbondata files are read in parallel.
> For example, to compact 50 segments the compactor needs to open readers on all 50 segments to do a merge sort. If each reader loads a whole blocklet into memory, the memory consumption is too high and there is a high chance of going out of memory.
> Solution:
> In this type of scenario we can introduce new readers for V3 that read the data page by page instead of reading the whole blocklet at once, reducing the memory footprint.
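
The page-by-page idea described above can be sketched as a k-way merge in which each segment reader keeps only its current page resident, so peak buffering is segments × pageSize rows rather than segments × blockletSize. This is a hypothetical illustration, not CarbonData's actual reader API; `PageReader`, `mergeSegments`, and the in-memory `int[]` standing in for on-disk data are all invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class PageWiseMerge {

    // Simulated page-by-page reader over one segment's sorted rows.
    // Only `currentPage` (at most pageSize rows) is held in memory at a time.
    static class PageReader {
        private final int[] rows;   // stands in for sorted on-disk data
        private final int pageSize;
        private int offset = 0;
        private int[] currentPage;  // the single resident page
        private int pos = 0;

        PageReader(int[] rows, int pageSize) {
            this.rows = rows;
            this.pageSize = pageSize;
            loadNextPage();
        }

        private void loadNextPage() {
            int len = Math.min(pageSize, rows.length - offset);
            currentPage = (len > 0) ? Arrays.copyOfRange(rows, offset, offset + len) : null;
            offset += len;
            pos = 0;
        }

        boolean hasNext() { return currentPage != null; }

        int next() {
            int v = currentPage[pos++];
            if (pos == currentPage.length) loadNextPage(); // drop page, fetch next
            return v;
        }
    }

    // K-way merge sort across segment readers, as a compactor would do.
    // Heap entries are {value, readerIndex}; each reader contributes one
    // value at a time, so memory is bounded by one page per segment.
    static List<Integer> mergeSegments(int[][] segments, int pageSize) {
        List<PageReader> readers = new ArrayList<>();
        for (int[] seg : segments) readers.add(new PageReader(seg, pageSize));

        PriorityQueue<int[]> heap = new PriorityQueue<>(Comparator.comparingInt(e -> e[0]));
        for (int i = 0; i < readers.size(); i++)
            if (readers.get(i).hasNext()) heap.add(new int[]{readers.get(i).next(), i});

        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] e = heap.poll();
            out.add(e[0]);
            PageReader r = readers.get(e[1]);
            if (r.hasNext()) heap.add(new int[]{r.next(), e[1]});
        }
        return out;
    }
}
```

With 50 segments and a 32k-row page, this buffers 50 × 32k rows instead of 50 whole blocklets, which is the memory reduction the issue proposes.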



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
