From: Peter Wolf
To: Scott Carey
CC: user@avro.apache.org
Date: Fri, 15 Jul 2011 07:54:17 -0400
Subject: Re: Schema with multiple Record types Java API

Thanks again Scott,

Yes, I am using AVRO to serialize existing Java classes, so tools to generate code will not help me.

Are there tools that go the other way, such as JAXB for XML?  I really want to point to a root Java object, and say "serialize this, and everything it points to, as AVRO".

BTW AVRO Rocks!  My objects contain large amounts of data, and I am *very* impressed with the speed of serialization/deserialization.

Cheers
P





On 7/14/11 10:10 PM, Scott Carey wrote:
Avro IDL can handle imports, but it generates classes.  The Avro APIs for this can be used to generate Schemas without making objects if you wish.

The Avro schema compiler (*.avsc, *.avpr) does not support imports; this is a feature requested by many but not contributed by anyone.

You may be interested in the code-gen capabilities of Avro, which has a Velocity templating engine to create Java classes based on schemas.  This can be customized to generate classes in custom ways.

However, if you are using Avro to serialize objects that have pre-existing classes, the Reflect API or an enhancement of it may be more suitable.
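
As a rough illustration of the Reflect API route mentioned above, the sketch below derives a schema from existing classes and writes a root object, plus everything it references, as Avro binary. The classes Bah and Foo and the helper names schemaFor and serialize are hypothetical stand-ins, not code from this thread. It assumes Avro 1.5 or later on the classpath and that no field is null (the default ReflectData rejects nulls).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.avro.Schema;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumWriter;

// Hypothetical pre-existing application classes (assumption for this sketch).
class Bah {
    int value = 42;
}

class Foo {
    String notes = "hello";
    Bah bah = new Bah();
}

class ReflectSketch {

    // Derive the schema from the class itself instead of writing JSON by hand.
    static Schema schemaFor(Class<?> type) {
        return ReflectData.get().getSchema(type);
    }

    // Serialize a root object, and the objects its fields point to, as Avro binary.
    static byte[] serialize(Object root) {
        Schema schema = schemaFor(root.getClass());
        ReflectDatumWriter<Object> writer = new ReflectDatumWriter<Object>(schema);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        try {
            writer.write(root, encoder);
            encoder.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Calling ReflectSketch.serialize(new Foo()) would return the encoded bytes. Reading them back would need a ReflectDatumReader and classes with no-argument constructors, which this sketch does not cover.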

More information on your use case may help to point you in the right direction.

-Scott


On 7/14/11 6:43 PM, "Peter Wolf" <opus111@gmail.com> wrote:

Many thanks Scott,

I am looking for the equivalent of #include or import.  I want to make a complicated schema with many record types, but manage it in separate strings.

In my application, I am using AVRO to serialize a tree of connected Java objects.  The record types mirror Java classes.  The schema descriptions live in the different Java classes, and reference each other.

My current code looks like this...

    import org.apache.avro.Schema;

    public class Foo {

        // Record definition for Foo.  The Bah and Zot definitions are
        // spliced in as JSON text from their own classes.
        static String schemaDescription =
            "{" +
                    "  \"namespace\": \"foo\", " +
                    "  \"name\": \"Foo\", " +
                    "  \"type\": \"record\", " +
                    "  \"fields\": [ " +
                    "      {\"name\": \"notes\", \"type\": \"string\" }, " +
                    "      {\"name\": \"timestamp\", \"type\": \"string\" }, " +
                    "      {\"name\": \"bah\", \"type\": " + Bah.schemaDescription + " }," +
                    "      {\"name\": \"zot\", \"type\": " + Zot.schemaDescription + " }" +
                    "    ]" +
                    "}";

        static Schema schema = Schema.parse(schemaDescription);

        // ...
    }

 
So, I am referencing by copying the schemaDescriptions.  The top level schemaDescription strings therefore get really big.

Is there already a clean coding pattern for doing this?  I can't be the first.  Is there a document describing best practices?
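
One possible way to avoid copying definitions, sketched here as an assumption rather than a documented best practice, is to parse the separate strings with a single Schema.Parser (available from Avro 1.5). The parser remembers named types it has already seen, so a later string can refer to "Bah" by name. The class name SchemaComposition and the constants below are made up for illustration, and the record is trimmed to two fields.

```java
import org.apache.avro.Schema;

class SchemaComposition {

    // Each class keeps only its own record definition; nothing is copied.
    static final String BAH_JSON =
        "{\"namespace\": \"foo\", \"name\": \"Bah\", \"type\": \"record\", "
        + "\"fields\": [{\"name\": \"value\", \"type\": \"int\"}]}";

    // Foo refers to Bah by name instead of embedding its definition.
    static final String FOO_JSON =
        "{\"namespace\": \"foo\", \"name\": \"Foo\", \"type\": \"record\", "
        + "\"fields\": ["
        + "{\"name\": \"notes\", \"type\": \"string\"},"
        + "{\"name\": \"bah\", \"type\": \"Bah\"}]}";

    static Schema parseFoo() {
        Schema.Parser parser = new Schema.Parser();
        parser.parse(BAH_JSON);        // must be parsed first so "Bah" is known
        return parser.parse(FOO_JSON); // resolves "Bah" to foo.Bah
    }
}
```

The order of the parse calls matters: a type has to be defined before anything that references it.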

Thanks
P





On 7/14/11 7:02 PM, Scott Carey wrote:
The name and namespace are part of any named schema (Type.RECORD, Type.FIXED, Type.ENUM).

We don't currently have an API to search a schema for subschemas that match names.  It would be useful; you might want to create a JIRA ticket explaining your use case.
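
Until such an API exists, a search can be written by hand. The helper below is a hypothetical sketch (SchemaSearch.find is not part of Avro) that walks a parsed schema and returns the first named subschema whose full name matches. It uses only the public Schema accessors.

```java
import java.util.HashSet;
import java.util.Set;

import org.apache.avro.Schema;

class SchemaSearch {

    // Returns the first named subschema whose full name matches, or null.
    static Schema find(Schema root, String fullName) {
        return find(root, fullName, new HashSet<String>());
    }

    private static Schema find(Schema s, String fullName, Set<String> seen) {
        switch (s.getType()) {
        case RECORD:
            if (fullName.equals(s.getFullName())) {
                return s;
            }
            if (!seen.add(s.getFullName())) {
                return null; // already visited; guards against recursive records
            }
            for (Schema.Field field : s.getFields()) {
                Schema hit = find(field.schema(), fullName, seen);
                if (hit != null) {
                    return hit;
                }
            }
            return null;
        case ENUM:
        case FIXED:
            return fullName.equals(s.getFullName()) ? s : null;
        case ARRAY:
            return find(s.getElementType(), fullName, seen);
        case MAP:
            return find(s.getValueType(), fullName, seen);
        case UNION:
            for (Schema branch : s.getTypes()) {
                Schema hit = find(branch, fullName, seen);
                if (hit != null) {
                    return hit;
                }
            }
            return null;
        default:
            return null; // primitive types carry no name to match
        }
    }
}
```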

So it would be a little more complex.

        Schema schema = Schema.parse(schemaDescription);
        Schema.Type type = schema.getType();
        switch (type) {
        case RECORD:
          String name = schema.getName();
          String namespace = schema.getNamespace();
          List<Schema.Field> fields = schema.getFields();
          break;
        // etc. for the other types
        }

In general, I have created SpecificRecord objects from schemas using the specific compiler (and the Ant task or Maven plugin), and within those generated classes there is a static SCHEMA variable to reference.

Avro IDL is also an easier way to define related schemas.  Currently there are only build tools that generate code from these, though there are APIs to extract schemas.

-Scott

On 7/13/11 10:43 AM, "Peter Wolf" <opus111@gmail.com> wrote:

Hello, this is a dumb question, but I cannot find the answer in the docs.

I want to have a complicated schema with lots of Records referencing other Records.

Like this...

{
  "namespace": "com.foobah",
  "name": "Bah",
  "type": "record",
  "fields": [
  {"name": "value", "type": "int"}
  ]
}

{
  "namespace": "com.foobah",
  "name": "Foo",
  "type": "record",
  "fields": [
  {"name": "bah", "type": "Bah"}
  ]
}

Using the Java API, how do I reference types within a schema?  Let's say I want to make a Foo object; I want to do something like this...

        Schema schema = Schema.parse(schemaDescription);
>>> Schema foo = schema.getSchema("com.foobah.Foo"); <<<
        GenericData o = new GenericData( foo );
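
For comparison, here is one way the lookup above might be approximated with calls that do exist. It is a sketch under assumptions: the two record definitions are wrapped in a JSON array so that they parse as a single union schema, and the class name UnionLookup and its helpers are hypothetical. Generic records are created with GenericData.Record rather than GenericData itself.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

class UnionLookup {

    // The two records from the question, wrapped in a JSON array (a union).
    static final String SCHEMA_DESCRIPTION =
        "["
        + "{\"namespace\": \"com.foobah\", \"name\": \"Bah\", \"type\": \"record\", "
        + "\"fields\": [{\"name\": \"value\", \"type\": \"int\"}]},"
        + "{\"namespace\": \"com.foobah\", \"name\": \"Foo\", \"type\": \"record\", "
        + "\"fields\": [{\"name\": \"bah\", \"type\": \"Bah\"}]}"
        + "]";

    // Find a branch of the union by its full name, or return null.
    static Schema lookup(Schema union, String fullName) {
        for (Schema branch : union.getTypes()) {
            if (fullName.equals(branch.getFullName())) {
                return branch;
            }
        }
        return null;
    }

    // Build a generic Foo record whose bah field holds the given value.
    static GenericRecord newFoo(int value) {
        Schema union = new Schema.Parser().parse(SCHEMA_DESCRIPTION);
        Schema fooSchema = lookup(union, "com.foobah.Foo");
        Schema bahSchema = lookup(union, "com.foobah.Bah");

        GenericRecord bah = new GenericData.Record(bahSchema);
        bah.put("value", value);

        GenericRecord foo = new GenericData.Record(fooSchema);
        foo.put("bah", bah);
        return foo;
    }
}
```

Bah is listed before Foo because a name can only be referenced after it has been defined.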

Many thanks in advance
Peter




