drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sean Hsuan-Yi Chu" <hsua...@usc.edu>
Subject Re: Review Request 37893: DRILL-3718: TSV reader fails when "" appears
Date Fri, 28 Aug 2015 21:35:19 GMT


> On Aug. 28, 2015, 8:54 p.m., Jacques Nadeau wrote:
> > -1 This code is tab specific.  This should be corrected to be generic.

Not sure what you meant by "generic"

Did you mean the tab can be replaced with other delimiters such as '|', ',' etc?


If that speculation is correct, please see the comment above that if-statement (where I added
additional constraint) 


The logic surrounded in the if-statement is used to ignore whitespaces


And the second condition (i.e., ch <= ' ') would force ignoring '\t'


However, it is not correct because if your delimiter is '\t' (tsv files), '\t' should not
be ignored.

(I found some insufficiency in the code so please read the new patch)


- Sean Hsuan-Yi


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37893/#review96929
-----------------------------------------------------------


On Aug. 28, 2015, 8:47 p.m., Sean Hsuan-Yi Chu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/37893/
> -----------------------------------------------------------
> 
> (Updated Aug. 28, 2015, 8:47 p.m.)
> 
> 
> Review request for drill, Jacques Nadeau and Mehant Baid.
> 
> 
> Bugs: DRILL-3718
>     https://issues.apache.org/jira/browse/DRILL-3718
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> For TSV files, if the TextReader reads a double quote, it would keep scanning until it
gets the second double quote.
> 
> However, even getting the second double quote, the current reader will keep going in
order to trim the space (i.e., ' '). 
> 
> In tsv, there is no need to trim '\t' (tab), which is used to separate fields.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/TextReader.java
3899509 
>   exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java 6b74ecf 
>   exec/java-exec/src/test/resources/store/text/WithQuote.tsv PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/37893/diff/
> 
> 
> Testing
> -------
> 
> All
> 
> 
> Thanks,
> 
> Sean Hsuan-Yi Chu
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message