crunch-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CRUNCH-277) Support Parquet
Date Tue, 08 Oct 2013 11:24:42 GMT

     [ https://issues.apache.org/jira/browse/CRUNCH-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated CRUNCH-277:
-----------------------------

    Attachment: CRUNCH-277.patch

Thanks for reviewing the patch!

I found that the Parquet source is not compatible with CombineFileInputFormat: the ParquetRecordReader
expects a ParquetInputSplit (which encodes Parquet block information) rather than a regular
FileSplit. To fix this, I've disabled the use of combine files for the Parquet source and
added a new test to verify that the source works.
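
To illustrate the incompatibility described above, here is a minimal, self-contained
sketch. It does not use the real Hadoop or Parquet classes: the nested classes are
simplified stand-ins that only borrow their names, and combine, readsCleanly, and
sampleSplits are hypothetical helpers. The cast inside the reader is an assumption about
how the failure surfaces, not a copy of the actual ParquetRecordReader code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CombineSplitSketch {

    /** Stand-in for a regular file split: a byte range within a file. */
    public static class FileSplit {
        final String path;
        final long start;
        final long length;

        FileSplit(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /** Stand-in for a Parquet split: a file split that also carries block information. */
    public static class ParquetInputSplit extends FileSplit {
        final List<Long> blockOffsets;

        ParquetInputSplit(String path, long start, long length, List<Long> blockOffsets) {
            super(path, start, length);
            this.blockOffsets = blockOffsets;
        }
    }

    /** Stand-in reader that needs the block information (assumed here to fail on a cast). */
    public static class ParquetLikeReader {
        int initialize(FileSplit split) {
            // Throws ClassCastException when handed a plain FileSplit.
            ParquetInputSplit parquetSplit = (ParquetInputSplit) split;
            return parquetSplit.blockOffsets.size();
        }
    }

    /** Hypothetical model of combining: the reader is handed plain FileSplits per chunk. */
    public static List<FileSplit> combine(List<ParquetInputSplit> splits) {
        List<FileSplit> plain = new ArrayList<>();
        for (ParquetInputSplit s : splits) {
            // The block offsets are dropped here; only path, start, and length survive.
            plain.add(new FileSplit(s.path, s.start, s.length));
        }
        return plain;
    }

    /** Returns true if the reader can be initialized with every split in the list. */
    public static boolean readsCleanly(List<? extends FileSplit> splits) {
        ParquetLikeReader reader = new ParquetLikeReader();
        try {
            for (FileSplit s : splits) {
                reader.initialize(s);
            }
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    /** Two made-up splits used only for this illustration. */
    public static List<ParquetInputSplit> sampleSplits() {
        return Arrays.asList(
            new ParquetInputSplit("part-0.parquet", 0L, 1024L, Arrays.asList(4L, 512L)),
            new ParquetInputSplit("part-1.parquet", 0L, 2048L, Arrays.asList(4L)));
    }

    public static void main(String[] args) {
        List<ParquetInputSplit> splits = sampleSplits();
        System.out.println("combined: " + readsCleanly(combine(splits))); // combined: false
        System.out.println("native: " + readsCleanly(splits));            // native: true
    }
}
```

In this model, turning combining off simply means passing the original splits to the
reader unchanged, which is the spirit of the fix in the patch.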

> Support Parquet
> ---------------
>
>                 Key: CRUNCH-277
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-277
>             Project: Crunch
>          Issue Type: New Feature
>          Components: IO
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: CRUNCH-277.patch, CRUNCH-277.patch
>
>
> Add a source and target for Parquet files.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
