carbondata-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CARBONDATA-260) Equal or lesser value of MAXCOLUMNS option than column count in CSV header results into array index of bound exception
Date Tue, 20 Sep 2016 14:33:21 GMT

    [ https://issues.apache.org/jira/browse/CARBONDATA-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506741#comment-15506741 ]

ASF GitHub Bot commented on CARBONDATA-260:
-------------------------------------------

GitHub user manishgupta88 opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/180

    [CARBONDATA-260] Equal or lesser value of MAXCOLUMNS option than column count in CSV header
results into array index of bound exception

    Problem: A MAXCOLUMNS option value equal to or less than the column count in the CSV header
results in an array index out of bounds exception
    
    Analysis: If the column count in the CSV header is greater than or equal to the MAXCOLUMNS option
value, an array index out of bounds exception is thrown by the Univocity CSV parser. This is because,
while parsing a row, the parser adds each parsed value to an array and increments the index, and after
incrementing it performs one more operation using the incremented index value, which leads
to the array index out of bounds exception. The relevant code snippet from the CSV parser is shown below.
    
    public void valueParsed() {
    	this.parsedValues[column++] = appender.getAndReset();
    	this.appender = appenders[column];
    }
    
    e.g. In the above code, if the arrays have 7 elements, the valid indexes are 0-6. When
column is 6, the first line stores the value and increments column to 7, so the second line
throws ArrayIndexOutOfBoundsException when it reads appenders[7].
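
The off-by-one described above can be reproduced in isolation. The following is a minimal sketch only: the class `OffByOneDemo`, its nested `Sink`, and the `overflows` helper are hypothetical names invented for illustration and are not Univocity's actual classes. It assumes both arrays are sized to the column limit, as the analysis implies.

```java
// Hypothetical stand-in for the parser's output buffer; NOT Univocity's real class.
public class OffByOneDemo {
    static final class Sink {
        private final String[] parsedValues;
        private final StringBuilder[] appenders;
        private StringBuilder appender;
        private int column = 0;

        Sink(int maxColumns) {
            // Assumption: both arrays are sized to the configured column limit.
            parsedValues = new String[maxColumns];
            appenders = new StringBuilder[maxColumns];
            for (int i = 0; i < maxColumns; i++) {
                appenders[i] = new StringBuilder();
            }
            appender = appenders[0];
        }

        void append(String s) {
            appender.append(s);
        }

        // Same shape as the quoted snippet: store, increment, then index again.
        void valueParsed() {
            parsedValues[column++] = appender.toString();
            appender.setLength(0);
            appender = appenders[column]; // fails once column == maxColumns
        }
    }

    // Returns true if storing columnCount values overflows a sink sized maxColumns.
    public static boolean overflows(int maxColumns, int columnCount) {
        Sink sink = new Sink(maxColumns);
        try {
            for (int i = 0; i < columnCount; i++) {
                sink.append("v" + i);
                sink.valueParsed();
            }
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows(7, 7)); // limit equal to column count: true
        System.out.println(overflows(8, 7)); // limit one larger: false
    }
}
```

With a limit of 7 and 7 columns, the seventh call stores its value successfully and only fails on the follow-up read of `appenders[7]`, which is why a limit merely equal to the column count is not enough.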
    
    Fix: Whenever the column count in the CSV header is equal to or greater than the MAXCOLUMNS option
value or the default value, increment it by 1.
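
One possible reading of this fix, sketched under assumptions: the limit handed to the parser becomes the header column count plus one whenever the configured limit is too small. The class `MaxColumnsFix` and the method `effectiveMaxColumns` are hypothetical names for illustration; the actual change in the pull request may be structured differently.

```java
// Hypothetical helper illustrating one interpretation of the fix; not the PR's actual code.
public class MaxColumnsFix {
    /**
     * @param headerColumnCount number of columns found in the CSV header
     * @param maxColumns        MAXCOLUMNS option value, or the default when the option is unset
     * @return a limit that leaves one spare slot beyond the last column
     */
    public static int effectiveMaxColumns(int headerColumnCount, int maxColumns) {
        if (headerColumnCount >= maxColumns) {
            // Assumption: one extra slot is enough for the parser's post-increment read.
            return headerColumnCount + 1;
        }
        return maxColumns;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMaxColumns(7, 7));  // 8
        System.out.println(effectiveMaxColumns(7, 5));  // 8
        System.out.println(effectiveMaxColumns(7, 20)); // 20
    }
}
```

A limit that is already larger than the column count is left untouched, so the adjustment only applies in the failing case.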
    
    Impact: Data load flow


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishgupta88/incubator-carbondata maxcolumns_array_indexOfBound

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #180
    
----
commit 3f32424e55615c8e45470d5169b817f9f703dc3e
Author: manishgupta88 <tomanishgupta18@gmail.com>
Date:   2016-09-20T14:21:33Z

    Problem: A MAXCOLUMNS option value equal to or less than the column count in the CSV header
results in an array index out of bounds exception
    
    Analysis: If the column count in the CSV header is greater than or equal to the MAXCOLUMNS option
value, an array index out of bounds exception is thrown by the Univocity CSV parser. This is because,
while parsing a row, the parser adds each parsed value to an array and increments the index, and after
incrementing it performs one more operation using the incremented index value, which leads
to the array index out of bounds exception. The relevant code snippet from the CSV parser is shown below.
    
    public void valueParsed() {
    	this.parsedValues[column++] = appender.getAndReset();
    	this.appender = appenders[column];
    }
    
    Fix: Whenever the column count in the CSV header is equal to or greater than the MAXCOLUMNS option
value or the default value, increment it by 1.
    
    Impact: Data load flow

----


> Equal or lesser value of MAXCOLUMNS option than column count in CSV header results into
array index of bound exception
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-260
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-260
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Manish Gupta
>            Assignee: Manish Gupta
>
> If the column count in the CSV header is greater than or equal to the MAXCOLUMNS option value, an
array index out of bounds exception is thrown by the Univocity CSV parser. This is because, while
parsing a row, the parser adds each parsed value to an array and increments the index, and after
incrementing it performs one more operation using the incremented index value, which leads to the
array index out of bounds exception.
> java.lang.OutOfMemoryError: Java heap space
> at com.univocity.parsers.common.ParserOutput.<init>(ParserOutput.java:86)
> at com.univocity.parsers.common.AbstractParser.<init>(AbstractParser.java:66)
> at com.univocity.parsers.csv.CsvParser.<init>(CsvParser.java:50)
> at org.apache.carbondata.processing.csvreaderstep.UnivocityCsvParser.initialize(UnivocityCsvParser.java:114)
> at org.apache.carbondata.processing.csvreaderstep.CsvInput.doProcessUnivocity(CsvInput.java:427)
> at org.apache.carbondata.processing.csvreaderstep.CsvInput.access$100(CsvInput.java:60)
> at org.apache.carbondata.processing.csvreaderstep.CsvInput$1.call(CsvInput.java:389)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
