drill-dev mailing list archives

From paul-rogers <...@git.apache.org>
Subject [GitHub] drill pull request #593: DRILL-3178 csv reader should allow newlines inside ...
Date Thu, 06 Oct 2016 23:15:55 GMT
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/593#discussion_r82303401
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/TextReader.java ---
    @@ -231,33 +231,34 @@ private void parseQuotedValue(byte prev) throws IOException {
         final TextInput input = this.input;
         final byte quote = this.quote;
     
    -    ch = input.nextChar();
    +    try {
    +      input.setMonitorForNewLine(false);
    --- End diff --
    
    This seems an overly complex way to do the parsing. Is there any reason we want to capture the original newline character rather than the normalized one?
    
    If we do need to capture the original one, then a cleaner way to do that is to keep track of the start and end positions of the current token (character), and provide a method that returns that block as a string. Then scan for the closing quote, reading characters and special-casing any newlines.
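
    A minimal sketch of the start/end-position idea, under stated assumptions: the class QuotedFieldScanner, its fields, and its methods are hypothetical names made up for illustration, and a plain byte array stands in for Drill's TextInput. It does not use the real TextReader or TextInput API, and it leaves out escaped quotes.

```java
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical illustration only: scans a quoted value from a byte buffer,
 * remembering where the token starts and ends so the original bytes
 * (including any raw newlines) can be returned unchanged.
 */
final class QuotedFieldScanner {
  private final byte[] buf;   // stand-in for the reader's input buffer
  private final byte quote;   // quote character, e.g. '"'
  private int pos;            // index of the next byte to read
  private int newlines;       // line breaks seen inside quoted values

  QuotedFieldScanner(byte[] buf, byte quote) {
    this.buf = buf;
    this.quote = quote;
  }

  /** Parses a quoted value starting at the current position. */
  String parseQuotedValue() {
    if (pos >= buf.length || buf[pos] != quote) {
      throw new IllegalStateException("expected opening quote at " + pos);
    }
    pos++;                       // skip the opening quote
    final int tokenStart = pos;  // first byte of the value

    while (pos < buf.length) {
      byte ch = buf[pos];
      if (ch == quote) {
        int tokenEnd = pos;      // exclusive end of the value
        pos++;                   // skip the closing quote
        // Return the original block, newlines untouched.
        return new String(buf, tokenStart, tokenEnd - tokenStart,
            StandardCharsets.UTF_8);
      }
      if (ch == '\r' && pos + 1 < buf.length && buf[pos + 1] == '\n') {
        newlines++;              // CRLF counts as one line break
        pos += 2;
      } else {
        if (ch == '\r' || ch == '\n') {
          newlines++;            // lone CR or LF
        }
        pos++;
      }
    }
    throw new IllegalStateException("unterminated quoted value");
  }

  int newlinesSeen() {
    return newlines;
  }

  int position() {
    return pos;
  }
}
```

    With this shape, the newline handling stays local to the quoted-value loop, so the input layer would not need a switch that turns newline detection on and off.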
    
    If we want to include newlines in quoted strings in some cases but not in others, then the check logic will need to be a bit more complex.
    
    But, the proposed solution of making newlines not be newlines seems a bit odd...


