Mailing-List: contact user-help@avro.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@avro.apache.org
Received-SPF: pass (athena.apache.org: domain of opus111@gmail.com designates
 209.85.216.43 as permitted sender)
Message-ID: <4E202A69.60309@gmail.com>
Date: Fri, 15 Jul 2011 07:54:17 -0400
From: Peter Wolf <opus111@gmail.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:5.0) Gecko/20110624 Thunderbird/5.0
MIME-Version: 1.0
To: Scott Carey <scott@richrelevance.com>
CC: "user@avro.apache.org" <user@avro.apache.org>
Subject: Re: Schema with multiple Record types Java API
References: <CA44EE3C.45142%scott@richrelevance.com>
In-Reply-To: <CA44EE3C.45142%scott@richrelevance.com>
Content-Type: multipart/alternative;
 boundary="------------060309070808040107020804"

This is a multi-part message in MIME format.
--------------060309070808040107020804
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Thanks again Scott,

Yes, I am using AVRO to serialize existing Java classes, so tools to 
generate code will not help me.

Are there tools that go the other way, such as JAXB for XML?  I really 
want to point to a root Java object, and say "serialize this, and 
everything it points to, as AVRO".
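
To make "serialize this, and everything it points to" concrete, here is
a rough sketch of the idea behind a reflection-based approach: walk a
class's fields and derive a record schema from them.  It uses only
java.lang.reflect and is not Avro's Reflect API; the Foo and Bah
classes, the schemaFor helper, and the use of simple class names with
no namespace are all assumptions made for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashSet;
import java.util.Set;

class ReflectSchemaSketch {

    // Hypothetical stand-ins for pre-existing application classes.
    static class Bah { int value; }
    static class Foo { String notes; long timestamp; Bah bah; Bah spare; }

    /** Returns Avro-style schema JSON for cls, following object references. */
    public static String schemaFor(Class<?> cls) {
        return schemaFor(cls, new HashSet<Class<?>>());
    }

    private static String schemaFor(Class<?> cls, Set<Class<?>> defined) {
        if (cls == int.class || cls == Integer.class) return "\"int\"";
        if (cls == long.class || cls == Long.class) return "\"long\"";
        if (cls == double.class || cls == Double.class) return "\"double\"";
        if (cls == boolean.class || cls == Boolean.class) return "\"boolean\"";
        if (cls == String.class) return "\"string\"";

        // A record that was already emitted is referenced by name only,
        // so its full definition appears exactly once in the output.
        if (!defined.add(cls)) return "\"" + cls.getSimpleName() + "\"";

        StringBuilder sb = new StringBuilder();
        sb.append("{\"type\":\"record\",\"name\":\"")
          .append(cls.getSimpleName())
          .append("\",\"fields\":[");
        boolean first = true;
        for (Field f : cls.getDeclaredFields()) {
            // Skip static members and compiler-generated fields.
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            if (!first) sb.append(",");
            first = false;
            sb.append("{\"name\":\"").append(f.getName())
              .append("\",\"type\":").append(schemaFor(f.getType(), defined))
              .append("}");
        }
        return sb.append("]}").toString();
    }

    public static void main(String[] args) {
        System.out.println(schemaFor(Foo.class));
    }
}
```

In Avro itself, ReflectData.get().getSchema(Foo.class) is the call that
derives a schema from an existing class, so a hand-rolled walker like
this is probably only useful for seeing what the output looks like.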

BTW AVRO Rocks!  My objects contain large amounts of data, and I am 
*very* impressed with the speed of serialization/deserialization.

Cheers
P


On 7/14/11 10:10 PM, Scott Carey wrote:
> AvroIDL can handle imports, but it generates classes.  The Avro APIs 
> for this can be used to generate Schemas without making objects if you 
> wish.
>
> The Avro schema compiler (*.avsc, *.avpr) does not support imports; it 
> is a feature requested by many but not contributed by anyone.
>
> You may be interested in the code-gen capabilities of Avro, which has 
> a Velocity templating engine to create Java classes based on schemas.  
> This can be customized to generate classes in custom ways.
>
> However, if you are using Avro to serialize objects that have 
> pre-existing classes, the Reflect API or an enhancement of it may be 
> more suitable.
>
> More information on your use case may help to point you in the right 
> direction.
>
> -Scott
>
>
> On 7/14/11 6:43 PM, "Peter Wolf" <opus111@gmail.com> wrote:
>
>     Many thanks Scott,
>
>     I am looking for the equivalent of #include or import.  I want to
>     make a complicated schema with many record types, but manage it in
>     separate strings.
>
>     In my application, I am using AVRO to serialize a tree of
>     connected Java objects.  The record types mirror Java classes. 
>     The schema descriptions live in the different Java classes, and
>     reference each other.
>
>     My current code looks like this...
>
>         public class Foo {
>
>             static String schemaDescription =
>                 "{" +
>                 "  \"namespace\": \"foo\", " +
>                 "  \"name\": \"Foo\", " +
>                 "  \"type\": \"record\", " +
>                 "  \"fields\": [ " +
>                 "      {\"name\": \"notes\", \"type\": \"string\" }, " +
>                 "      {\"name\": \"timestamp\", \"type\": \"string\" }, " +
>                 "      {\"name\": \"bah\", \"type\": " + Bah.schemaDescription + " }," +
>                 "      {\"name\": \"zot\", \"type\": " + Zot.schemaDescription + " }" +
>                 "    ]" +
>                 "}";
>
>             static Schema schema = Schema.parse(schemaDescription);
>
>
>     So, I am referencing by copying the schemaDescriptions.  The top
>     level schemaDescription strings therefore get really big.
>
>     Is there already a clean coding pattern for doing this?  I can't
>     be the first.  Is there a document describing best practices?
>
>     Thanks
>     P
>
>
>
>
>
>     On 7/14/11 7:02 PM, Scott Carey wrote:
>>     The name and namespace are part of any named schema (Type.RECORD,
>>     Type.FIXED, Type.ENUM).
>>
>>     We don't currently have an API to search a schema for subschemas
>>     that match names.  It would be useful; you might want to create a
>>     JIRA ticket explaining your use case.
>>
>>     So it would be a little more complex.
>>
>>             Schema schema = Schema.parse(schemaDescription);
>>             Schema.Type type = schema.getType();
>>             switch (type) {
>>             case RECORD:
>>               String name = schema.getName();
>>               String namespace = schema.getNamespace();
>>               List<Field> fields = schema.getFields();
>>             }
>>             etc.
>>
>>     In general, I have created SpecificRecord objects from schemas
>>     using the specific compiler (and the ant task or maven plugin)
>>     and then within those generated classes there is a static SCHEMA
>>     variable to reference.
>>
>>     Avro IDL is also an easier way to define related schemas.
>>     Currently there are only build tools that generate code from
>>     these, though there are APIs to extract schemas.
>>
>>     -Scott
>>
>>     On 7/13/11 10:43 AM, "Peter Wolf" <opus111@gmail.com> wrote:
>>
>>         Hello, this is a dumb question, but I cannot find the answer in
>>         the docs.
>>
>>         I want to have a complicated schema with lots of Records
>>         referencing other Records.
>>
>>         Like this...
>>
>>             {
>>               "namespace": "com.foobah",
>>               "name": "Bah",
>>               "type": "record",
>>               "fields": [
>>               {"name": "value", "type": "int"}
>>               ]
>>             }
>>
>>             {
>>               "namespace": "com.foobah",
>>               "name": "Foo",
>>               "type": "record",
>>               "fields": [
>>               {"name": "bah", "type": "Bah"}
>>               ]
>>             }
>>
>>         Using the Java API, how do I reference types within a
>>         schema?  Say I want to make a Foo object; I would like to do
>>         something like this...
>>
>>                 Schema schema = Schema.parse(schemaDescription);
>>         >>> Schema foo = schema.getSchema("com.foobah.Foo"); <<<
>>                 GenericData o = new GenericData( foo );
>>
>>         Many thanks in advance
>>         Peter
>>
>>
>>
>

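On the original question of referencing one record from another: Avro
lets a named record be defined once and referred to by name afterwards,
and a JSON array of schemas parses as a union.  A minimal sketch of
using that to keep the schema strings separate, assuming each class
declares by hand which other records it uses (SchemaBundle, register
and unionFor are hypothetical names, not Avro API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SchemaBundle {
    private final Map<String, String> json = new LinkedHashMap<String, String>();
    private final Map<String, List<String>> deps = new LinkedHashMap<String, List<String>>();

    /** Registers one record definition and the names of the records it refers to. */
    public SchemaBundle register(String name, String definition, String... uses) {
        json.put(name, definition);
        deps.put(name, Arrays.asList(uses));
        return this;
    }

    /** Builds a JSON union where each record appears once, after the records it uses. */
    public String unionFor(String root) {
        Set<String> ordered = new LinkedHashSet<String>();
        visit(root, ordered, new LinkedHashSet<String>());
        List<String> parts = new ArrayList<String>();
        for (String name : ordered) parts.add(json.get(name));
        return "[" + String.join(", ", parts) + "]";
    }

    // Depth-first walk: dependencies are emitted before the record that needs them.
    private void visit(String name, Set<String> ordered, Set<String> inProgress) {
        if (ordered.contains(name)) return;
        if (!json.containsKey(name)) throw new IllegalArgumentException("unknown record: " + name);
        // This sketch does not try to order mutually dependent records.
        if (!inProgress.add(name)) throw new IllegalStateException("cyclic reference at " + name);
        for (String dep : deps.get(name)) visit(dep, ordered, inProgress);
        inProgress.remove(name);
        ordered.add(name);
    }

    public static void main(String[] args) {
        SchemaBundle bundle = new SchemaBundle()
            .register("Foo", "{\"namespace\": \"com.foobah\", \"name\": \"Foo\", \"type\": \"record\", "
                + "\"fields\": [{\"name\": \"bah\", \"type\": \"Bah\"}]}", "Bah")
            .register("Bah", "{\"namespace\": \"com.foobah\", \"name\": \"Bah\", \"type\": \"record\", "
                + "\"fields\": [{\"name\": \"value\", \"type\": \"int\"}]}");
        // Bah is printed first, so Foo can refer to it as "Bah".
        System.out.println(bundle.unionFor("Foo"));
    }
}
```

The resulting string could then be handed to Schema.parse and the
individual record schemas read back from the union with getTypes();
that part is not exercised here.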

--------------060309070808040107020804--