tajo-issues mailing list archives

From "David Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TAJO-711) Add Avro storage support
Date Mon, 07 Apr 2014 14:28:14 GMT

    [ https://issues.apache.org/jira/browse/TAJO-711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961875#comment-13961875 ]

David Chen commented on TAJO-711:

I have an initial implementation of this done and have posted an RB. There are still a few
changes and validation work I would like to do before this is fully ready:

 * Test the use of {{avro.schema.url}}. Currently, the tests only test {{avro.schema.literal}}.
 * Converting between Avro and Tajo is slightly tricky because data sets would usually use
the Avro schema as the "true" schema. I would like to do more validation to look for
additional corner cases. For this ticket, I'll do a best-effort validation with flat schemas,
though Avro support might not be truly battle-tested until TAJO-710 is done because most of
our data here at LinkedIn have nested schemas.
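
To illustrate the two points above, here is a minimal Python sketch of reading the schema from table properties and checking that it is flat. It is not the code in the patch: the names {{load_schema}} and {{is_flat}} are hypothetical, and the rule that {{avro.schema.literal}} wins over {{avro.schema.url}} when both are set is an assumption of this sketch, not something the ticket specifies.

```python
import json
from urllib.request import urlopen

# Avro complex types that this sketch treats as "nested". Enum and fixed are
# treated as flat, which is an assumption made for illustration.
NESTED_TYPES = {"record", "array", "map"}


def load_schema(props):
    """Return the Avro schema (as a dict) named by the table properties.

    Assumption: avro.schema.literal takes precedence over avro.schema.url.
    """
    if "avro.schema.literal" in props:
        return json.loads(props["avro.schema.literal"])
    if "avro.schema.url" in props:
        with urlopen(props["avro.schema.url"]) as resp:
            return json.loads(resp.read().decode("utf-8"))
    raise ValueError("neither avro.schema.literal nor avro.schema.url is set")


def _type_is_nested(avro_type):
    # An Avro type is a name (str), a union (list), or a complex type (dict).
    if isinstance(avro_type, list):
        return any(_type_is_nested(branch) for branch in avro_type)
    if isinstance(avro_type, dict):
        return avro_type.get("type") in NESTED_TYPES
    # Plain type names are treated as flat; references to named record
    # types defined elsewhere are not resolved by this sketch.
    return False


def is_flat(schema):
    """True if no field of the top-level record is a record, array, or map."""
    return not any(_type_is_nested(f["type"]) for f in schema["fields"])
```

A nullable primitive such as {{["null", "string"]}} still counts as flat here, since none of the union branches is a record, array, or map.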

Something else we would want to look at is schema evolution across partitions. I haven't looked
too closely at TAJO-283 yet, but are we storing table properties into the partitions? For
example, say that partitions i...j are created with Avro schema A, set by either the {{avro.schema.url}}
or {{avro.schema.literal}} property. Now, partitions j+1...k are created with an evolved Avro
schema A'. Does the current implementation of partitions in Tajo support storing such properties
within the partitions? In any event, if this might be an issue, we can create a separate ticket
for this work.
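
To make the concern concrete, here is a small Python sketch of the kind of check that matters when partitions i...j were written with schema A and are later read with the evolved schema A'. It relies on one rule of Avro schema resolution for records: a field that exists in the reader's schema but not in the writer's must declare a default, while writer fields missing from the reader are simply skipped. The function name {{unreadable_fields}} is hypothetical, and the sketch ignores aliases and type promotion.

```python
def unreadable_fields(writer_schema, reader_schema):
    """Names of reader fields that data written with writer_schema cannot fill.

    A reader field absent from the writer's record needs a "default" entry;
    otherwise resolving the two schemas fails for that field.
    """
    written = {f["name"] for f in writer_schema["fields"]}
    return [
        f["name"]
        for f in reader_schema["fields"]
        if f["name"] not in written and "default" not in f
    ]
```

An empty result means that old partitions can be read with A' as far as added fields are concerned. If the table carries only one schema property, such a check would have to hold between every schema used to write a partition and the current table schema, which is why per-partition properties are relevant.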

> Add Avro storage support
> ------------------------
>                 Key: TAJO-711
>                 URL: https://issues.apache.org/jira/browse/TAJO-711
>             Project: Tajo
>          Issue Type: New Feature
>            Reporter: David Chen
>            Assignee: David Chen
>         Attachments: TAJO-711.patch
> Add {{FileScanner}} and {{FileAppender}} for reading from and writing to Avro.

This message was sent by Atlassian JIRA