tajo-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TAJO-1486) Tajo should be able to skip header and footer rows when creating external table
Date Mon, 20 Jul 2015 09:40:04 GMT

    [ https://issues.apache.org/jira/browse/TAJO-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633226#comment-14633226 ]

ASF GitHub Bot commented on TAJO-1486:
--------------------------------------

Github user jinossy commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/615#discussion_r34979963
  
    --- Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/text/ByteBufLineReader.java ---
    @@ -197,6 +197,6 @@ public ByteBuf readLineBuf(AtomicInteger reads) throws IOException {
           }
         }
         reads.set(readBytes);
    -    return buffer.slice(startIndex, readBytes - newlineLength);
    +    return buffer.slice(startIndex, readBytes - newlineLength).retain();
    --- End diff --
    
    This buffer is shared until the ByteBufLineReader is closed. If you want to keep the sliced buffer, you must copy it to a new buffer.
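
    The point about shared buffers can be illustrated with a minimal sketch. It is not Tajo code: it uses the JDK's java.nio.ByteBuffer as a stand-in for Netty's ByteBuf, and ReusingLineReader, copyOf, text and demo are hypothetical names invented for this example. The assumption is that the reader reuses one internal buffer across reads, so a slice is only a view whose contents change on the next read. Retaining a slice in Netty only increments a reference count and does not copy bytes; the Netty analogue of copyOf here would presumably be ByteBuf.copy().

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SharedSliceDemo {
    // Hypothetical stand-in for a line reader that reuses one internal buffer.
    static final class ReusingLineReader {
        private final ByteBuffer buffer = ByteBuffer.allocate(64);

        // Returns a view over the internal buffer; it shares storage with it,
        // so its contents are only stable until the next call.
        ByteBuffer readLine(String line) {
            buffer.clear();
            buffer.put(line.getBytes(StandardCharsets.UTF_8));
            buffer.flip();
            return buffer.slice();
        }
    }

    // Copies the readable bytes of a view into a new, independent buffer.
    static ByteBuffer copyOf(ByteBuffer view) {
        ByteBuffer copy = ByteBuffer.allocate(view.remaining());
        copy.put(view.duplicate()); // duplicate() leaves the view's position untouched
        copy.flip();
        return copy;
    }

    // Decodes the readable bytes without consuming the buffer.
    static String text(ByteBuffer buf) {
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    // Reads two lines and reports what the kept view and the kept copy contain.
    static String[] demo() {
        ReusingLineReader reader = new ReusingLineReader();
        ByteBuffer view = reader.readLine("header");
        ByteBuffer copy = copyOf(view);
        reader.readLine("row-01"); // overwrites the shared storage
        return new String[] { text(view), text(copy) };
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("view after next read: " + result[0]);
        System.out.println("copy after next read: " + result[1]);
    }
}
```

    In this sketch the kept view reflects the second line once the reader has moved on, while the copy still holds the first line. That is the situation the comment warns about for code that needs to hold on to a line, such as a header row.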


> Tajo should be able to skip header and footer rows when creating external table
> -------------------------------------------------------------------------------
>
>                 Key: TAJO-1486
>                 URL: https://issues.apache.org/jira/browse/TAJO-1486
>             Project: Tajo
>          Issue Type: Improvement
>    Affects Versions: 0.10.0
>            Reporter: Youngkyong Ko
>            Assignee: Jongyoung Park
>            Priority: Minor
>             Fix For: 0.11.0
>
>         Attachments: TAJO-1486-1.patch, TAJO-1486.patch
>
>
> It is quite common to see header/footer lines in real-world data sets, so skipping the first/last N lines in the "create external table" DDL would be a useful feature for Tajo users. This way, users do not need additional processing of data that was generated by another application with a header or footer, and can use the file directly for table operations.
> cf. The same feature was added in Hive 0.13: https://issues.apache.org/jira/browse/HIVE-5795



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
