carbondata-issues mailing list archives

From "Manish Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CARBONDATA-1213) Removed rowCountPercentage check and fixed IUD data load issue
Date Thu, 22 Jun 2017 09:25:00 GMT
Manish Gupta created CARBONDATA-1213:
----------------------------------------

             Summary: Removed rowCountPercentage check and fixed IUD data load issue
                 Key: CARBONDATA-1213
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1213
             Project: CarbonData
          Issue Type: Bug
            Reporter: Manish Gupta
            Assignee: Manish Gupta
             Fix For: 1.2.0


Problems:
1. The row count percentage check is not required alongside the high cardinality threshold check.
2. IUD returns incorrect results when an update is performed on a high cardinality column.

Analysis:
1. Even when a column is identified as a high cardinality column, it is not converted
to a no dictionary column because of a check on another parameter, rowCountPercentage. The
default value of rowCountPercentage is 80%. As a result, a column identified as high cardinality
is still treated as a dictionary column if its cardinality is less than 80% of the total number
of rows. This can still lead to executor lost failures due to memory constraints.
2. RLE is not being set correctly on a column. Because of incorrect code design, whether RLE
is applicable to a column is decided in a different part of the code from the one that actually
applies RLE to the column. As a result, the footer is filled with incorrect RLE information
and the query fails.
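
The decision described in point 1 can be sketched as follows. This is an illustration only, not the CarbonData code: the class name, method names, and the threshold value are hypothetical, and the sketch assumes that the 80% figure compares the number of distinct values in the column against the total row count.

```java
public class HighCardinalityCheck {
    // Hypothetical threshold, for illustration only.
    static final long HIGH_CARDINALITY_THRESHOLD = 1_000_000L;
    // Default rowCountPercentage as stated in the issue.
    static final double ROW_COUNT_PERCENTAGE = 80.0;

    // Behaviour before the fix (assumed): both checks must pass
    // before the column is treated as a no dictionary column.
    public static boolean isNoDictionaryBefore(long distinctValues, long totalRows) {
        boolean highCardinality = distinctValues > HIGH_CARDINALITY_THRESHOLD;
        boolean aboveRowShare =
            distinctValues * 100.0 / totalRows >= ROW_COUNT_PERCENTAGE;
        return highCardinality && aboveRowShare;
    }

    // Behaviour after the fix (assumed): the threshold check alone decides.
    public static boolean isNoDictionaryAfter(long distinctValues, long totalRows) {
        return distinctValues > HIGH_CARDINALITY_THRESHOLD;
    }

    public static void main(String[] args) {
        long totalRows = 10_000_000L;
        long distinctValues = 5_000_000L; // 50% of the rows, above the threshold
        System.out.println(isNoDictionaryBefore(distinctValues, totalRows));
        System.out.println(isNoDictionaryAfter(distinctValues, totalRows));
    }
}
```

With these example numbers the column exceeds the threshold but covers only 50% of the rows, so the first method keeps it as a dictionary column while the second converts it to no dictionary.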



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
