carbondata-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CARBONDATA-276) Add trim option
Date Wed, 12 Oct 2016 10:28:20 GMT

    [ https://issues.apache.org/jira/browse/CARBONDATA-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568332#comment-15568332 ]

ASF GitHub Bot commented on CARBONDATA-276:
-------------------------------------------

Github user sujith71955 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/200#discussion_r82977592
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/csvreaderstep/UnivocityCsvParser.java ---
    @@ -102,8 +102,8 @@ public void initialize() throws IOException {
         parserSettings.setMaxColumns(
             getMaxColumnsForParsing(csvParserVo.getNumberOfColumns(), csvParserVo.getMaxColumns()));
         parserSettings.setNullValue("");
    -    parserSettings.setIgnoreLeadingWhitespaces(false);
    -    parserSettings.setIgnoreTrailingWhitespaces(false);
    +    parserSettings.setIgnoreLeadingWhitespaces(csvParserVo.getTrim());
    --- End diff --
    
    A problem with this approach: suppose a user loads dirty data in one load, then realizes
it needs trimming and enables the option for the next load. The dictionary will then hold both
the untrimmed and the trimmed values, which increases the dictionary space and also the
dictionary lookup overhead at query time.
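
To make the concern concrete, here is a minimal sketch of the effect. It is not CarbonData code: `TrimDictionarySketch` and `buildDictionary` are hypothetical names, and a plain `LinkedHashMap` stands in for a column dictionary that assigns one surrogate key per distinct value. It assumes the first load keeps whitespace and only the second load may trim.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TrimDictionarySketch {

    // Hypothetical stand-in for a column dictionary: one surrogate key per distinct value.
    static Map<String, Integer> buildDictionary(List<String> firstLoad,
                                                List<String> secondLoad,
                                                boolean trimSecondLoad) {
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        // First load: values are stored exactly as they appear in the CSV (no trimming).
        for (String value : firstLoad) {
            dictionary.putIfAbsent(value, dictionary.size() + 1);
        }
        // Second load: optionally trim before looking up or adding the value.
        for (String value : secondLoad) {
            String key = trimSecondLoad ? value.trim() : value;
            dictionary.putIfAbsent(key, dictionary.size() + 1);
        }
        return dictionary;
    }

    public static void main(String[] args) {
        // Two of the three values carry leading or trailing whitespace.
        List<String> rawValues = Arrays.asList(" china", "india ", "usa");

        int sameSetting = buildDictionary(rawValues, rawValues, false).size();
        int trimEnabledLater = buildDictionary(rawValues, rawValues, true).size();

        System.out.println("entries, both loads untrimmed: " + sameSetting);
        System.out.println("entries, trim enabled on second load: " + trimEnabledLater);
    }
}
```

In this sketch, reloading the same values with the same setting adds no entries, while enabling trim only for the second load adds a new entry for every value that had surrounding whitespace.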


> Add trim option
> ---------------
>
>                 Key: CARBONDATA-276
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-276
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: Lionx
>            Assignee: Lionx
>            Priority: Minor
>
> Fix a bug and add trim option.
> Bug: When a string contains leading or trailing whitespace, the query result is null. This
is because the dictionary ignores leading and trailing whitespace and the csvInput does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
