From: "Gabriel Reid (JIRA)"
To: crunch-dev@incubator.apache.org
Reply-To: dev@crunch.apache.org
Date: Sun, 6 Jul 2014 19:29:33 +0000 (UTC)
Subject: [jira] [Commented] (CRUNCH-433) Add support for reading specific/reflect data from an Avro MR file
Mailing-List: contact dev-help@crunch.apache.org; run by ezmlm

    [ https://issues.apache.org/jira/browse/CRUNCH-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053202#comment-14053202 ]

Gabriel Reid commented on CRUNCH-433:
-------------------------------------

{quote}+1 for the corrected patch, with one request: that BaseAvroTableType be package-scoped instead of public if at all possible.{quote}

Sounds like a good plan. The reason it's public is to use it specifically in AvroTableFileSource, but I think it's easy enough to get around that.

{quote}do we need to add a classifier line to the avro-mapred dependencies in the POM for this stuff to work properly on MR1 vs. MR2?{quote}

I don't think so, but I'm not sure I'm totally following what you mean. The only new thing being done here from avro-mapred is making use of the org.apache.avro.hadoop.io.AvroKeyValue class (basically only for schema creation), so I don't think there's anything that would change there in terms of needing classifiers (or am I missing something?)

> Add support for reading specific/reflect data from an Avro MR file
> ------------------------------------------------------------------
>
>                 Key: CRUNCH-433
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-433
>             Project: Crunch
>          Issue Type: New Feature
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>         Attachments: CRUNCH-433.patch
>
>
> An Avro Key/Value file written via raw MapReduce contains records that follow the schema generated by the org.apache.avro.hadoop.io.AvroKeyValue class.
> If these files contain specific or reflection-based records, there is currently no easy way to read them in as specific or reflection records. Using the basic public Crunch APIs, they can only be read as generic records (that also contain generic records).
> A method should be added to the Avros class which allows specifying specific PTypes to be used for reading the underlying data types within a raw MR output file.
> Link to related discussion that inspired this ticket on the user list: http://s.apache.org/es



--
This message was sent by Atlassian JIRA
(v6.2#6252)