crunch-dev mailing list archives

From "Micah Whitacre (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CRUNCH-370) Update Parquet dependency in Crunch pom
Date Wed, 26 Mar 2014 03:11:17 GMT

     [ https://issues.apache.org/jira/browse/CRUNCH-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Micah Whitacre updated CRUNCH-370:
----------------------------------

    Attachment: CRUNCH-370.patch

The attached patch upgrades Parquet on master to 1.3.2.  The build passed successfully.  I
haven't dug in yet, though, to see whether Parquet made any backward-incompatible changes
between versions that would make this upgrade undesirable.  [~tomwhite], since you wrote some
of the original Parquet support, do you have any objections or concerns about making this upgrade?

> Update Parquet dependency in Crunch pom
> ---------------------------------------
>
>                 Key: CRUNCH-370
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-370
>             Project: Crunch
>          Issue Type: Improvement
>          Components: IO
>    Affects Versions: 0.9.0
>            Reporter: Anandsagar Kothapalli
>            Assignee: Micah Whitacre
>         Attachments: CRUNCH-370.patch
>
>
> Crunch currently supports Avro-to-Parquet conversion through the AvroParquetFileTarget and
AvroParquetFileSource classes. When I used these classes to convert Avro files to Parquet files,
I got the following exception in some cases: "org.apache.crunch.CrunchRuntimeException: parquet.io.ParquetEncodingException:
empty fields are illegal, the field should be ommited completely instead"
> After further debugging, I found that this issue is related to the AvroWriteSupport class
in Parquet and was fixed as part of milestone 1.2.3: https://github.com/Parquet/parquet-mr/issues/162.
The latest Parquet version is 1.3.2.
> However, Crunch still uses Parquet 1.2.0: https://github.com/apache/crunch/blob/master/pom.xml#L77

> As part of this improvement, the Parquet dependency version in Crunch will be updated, if
not to the latest version, then at least to 1.2.3.
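
For illustration, the change requested above might look roughly like the following pom.xml
fragment. This is only a sketch and not the contents of CRUNCH-370.patch: the property name
parquet.version and the parquet-avro artifact are assumptions about how the Crunch pom declares
the dependency, and com.twitter is the group ID Parquet was published under at the time.

```xml
<!-- Sketch only: the property name and artifact are assumed, not taken from the actual patch. -->
<properties>
  <!-- Previously 1.2.0. The description cites milestone 1.2.3 for the AvroWriteSupport fix. -->
  <parquet.version>1.3.2</parquet.version>
</properties>

<dependency>
  <groupId>com.twitter</groupId>
  <artifactId>parquet-avro</artifactId>
  <version>${parquet.version}</version>
</dependency>
```

Keeping the version in a single property would mean that any module pulling in Parquet picks
up the same release.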



--
This message was sent by Atlassian JIRA
(v6.2#6252)
