drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-951) CSV header row should be parsed
Date Tue, 03 Nov 2015 15:57:27 GMT

    [ https://issues.apache.org/jira/browse/DRILL-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987505#comment-14987505 ]

ASF GitHub Bot commented on DRILL-951:
--------------------------------------

Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/232#discussion_r43765441
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/CompliantTextRecordReader.java ---
    @@ -71,15 +79,14 @@ public CompliantTextRecordReader(FileSplit split, DrillFileSystem dfs, FragmentC
       // checks to see if we are querying all columns(star) or individual columns
       @Override
       public boolean isStarQuery() {
    -    if(settings.isUseRepeatedVarChar()) {
    -      return super.isStarQuery() || Iterables.tryFind(getColumns(), new Predicate<SchemaPath>() {
    -        @Override
    -        public boolean apply(@Nullable SchemaPath path) {
    -          return path.equals(RepeatedVarCharOutput.COLUMNS);
    -        }
    -      }).isPresent();
    -    }
    -    return super.isStarQuery();
    +    if (super.isStarQuery()) { return true; }
    --- End diff --
    
    If we're in header extraction mode, requesting the `columns` column shouldn't be treated as a star query. Only a * should.
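
    One way to read this comment, sketched in Java under stated assumptions. The class `StarQuerySketch`, its `isStarQuery` method, and the `extractHeader` flag are hypothetical stand-ins and not Drill's actual API; projected columns are modeled as plain strings rather than `SchemaPath` objects. The sketch also assumes that outside header extraction mode the earlier behavior is kept, where selecting `columns` counts as a star query.

```java
import java.util.List;

// Hypothetical sketch, not the code from the pull request.
class StarQuerySketch {
    // Assumed stand-in for RepeatedVarCharOutput.COLUMNS in the diff.
    static final String COLUMNS = "columns";
    static final String STAR = "*";

    // Decides whether a projection should be treated as a star query.
    static boolean isStarQuery(List<String> projected, boolean extractHeader) {
        // An explicit * is always a star query.
        if (projected.contains(STAR)) {
            return true;
        }
        // In header extraction mode, `columns` is just another column name.
        if (extractHeader) {
            return false;
        }
        // Assumed legacy behavior: selecting `columns` reads every field.
        return projected.contains(COLUMNS);
    }
}
```

    With this reading, a projection of only `columns` is a star query when `extractHeader` is false and an ordinary column request when it is true.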


> CSV header row should be parsed
> -------------------------------
>
>                 Key: DRILL-951
>                 URL: https://issues.apache.org/jira/browse/DRILL-951
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Storage - Text & CSV
>            Reporter: Tomer Shiran
>            Assignee: Abhijit Pol
>             Fix For: Future
>
>
> The CSV reader currently treats the header row like a regular data row. There should be a way (perhaps optional) to treat the header row as the column names.
> I exported this dataset to a CSV: https://data.sfgov.org/Public-Safety/SFPD-Incidents-Previous-Three-Months/tmnf-yvry



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
