crunch-dev mailing list archives

From "Josh Wills (JIRA)" <>
Subject [jira] [Created] (CRUNCH-495) Fix case class/SpecificRecord interactions in Scrunch
Date Fri, 30 Jan 2015 00:33:35 GMT
Josh Wills created CRUNCH-495:

             Summary: Fix case class/SpecificRecord interactions in Scrunch
                 Key: CRUNCH-495
             Project: Crunch
          Issue Type: Bug
          Components: Scrunch
    Affects Versions: 0.11.0
            Reporter: Josh Wills
            Assignee: Josh Wills
             Fix For: 0.12.0

So this is a fun one: as part of the work for 0.11, I wrote a way to serialize Scala case classes
as Avro generic records. However, if AvroMode.SPECIFIC is enabled on an MR job (e.g., a join
between one PTable that contains specific record instances and another PTable that contains
instances of a case class), Avro's SpecificData object gets confused when it sees the Avro schema
I generate for the case class. Because the name of that schema is identical to the name of the
case class on the JVM, Avro thinks the record is an actual instance of a SpecificRecord.

The solution I came up with is to slightly modify the name of the generated Avro generic schema
for the case class so that it no longer exactly matches the name of the case class, which keeps
Avro from mistaking the record for a SpecificRecord.
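
To illustrate the collision, here is a minimal sketch that uses only the Scala/Java standard
library and does not call Avro or Scrunch. It assumes, as a simplification, that the specific-record
lookup amounts to asking the class loader for a class whose name equals the schema name. The
Person case class, the SchemaNameSketch object, and the "_scrunch" suffix are all hypothetical;
the issue does not say how the name is actually modified.

```scala
import scala.util.Try

// Hypothetical case class standing in for user data that Scrunch serializes.
case class Person(name: String, age: Int)

object SchemaNameSketch {
  // Hypothetical suffix; the actual modification used by the fix is not stated in the issue.
  val Suffix = "_scrunch"

  // Behaviour described in the issue: the schema name equals the JVM class name.
  def collidingName(c: Class[_]): String = c.getName

  // Proposed fix: a schema name that no longer matches the class name exactly.
  def disambiguatedName(c: Class[_]): String = c.getName + Suffix

  // Simplified stand-in for a name-based class lookup:
  // is there a JVM class with exactly this name?
  def resolvesToClass(schemaName: String): Boolean =
    Try(Class.forName(schemaName, false, getClass.getClassLoader)).isSuccess

  def main(args: Array[String]): Unit = {
    val c = classOf[Person]
    // The unmodified name resolves to the case class, which is the source of the confusion.
    println(s"${collidingName(c)} resolves: ${resolvesToClass(collidingName(c))}")
    // The modified name matches no class, so a lookup like this finds nothing.
    println(s"${disambiguatedName(c)} resolves: ${resolvesToClass(disambiguatedName(c))}")
  }
}
```

Under these assumptions the first lookup succeeds and the second fails, which is the property the
renaming relies on: with no matching class on the JVM, the record can be handled as a generic one.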
