avro-user mailing list archives

From tl ...@rat.io>
Subject Re: Json.ObjectWriter - "Not the Json schema"
Date Wed, 24 Feb 2016 19:03:38 GMT

> On 24.02.2016, at 19:08, Prajwal Tuladhar <praj@infynyxx.com> wrote:
> 
> Can you paste your avro IDL schema?

I had already crafted a more elaborate version of this question for Stack Overflow and was
just about to post it when you answered my request. Maybe it's still helpful, so I'll post it
here. Thanks for taking the time!



I'm writing a tool to convert data from a homegrown format to Avro and JSON (and later Parquet),
using Avro 1.8.0. The conversion to Avro works fine, but the JSON conversion throws the following
error:

    Exception in thread "main" java.lang.RuntimeException: Not the Json schema:
    {"type":"record","name":"Torperf","namespace":"converTor.torperf",
    "fields":[{"name":"descriptor_type","type":"string"," 
    [... rest of the schema omitted for brevity]

Irritatingly, this is exactly the schema that I passed in and that I want the converter
to use. I have no idea what Avro is complaining about.
This is the relevant snippet of my code:

    //  parse the schema file
    Schema.Parser parser = new Schema.Parser();
    Schema mySchema;
    //  tried two ways to load the schema:
    //  from a file ...
    File schemaFile = new File("myJsonSchema.avsc");
    mySchema = parser.parse(schemaFile);
    //  ... and from the classpath, the same way Json.class loads its schema
    mySchema = parser.parse(Json.class.getResourceAsStream("myJsonSchema.avsc"));

    //  initialize the writer
    Json.ObjectWriter jsonDatumWriter = new Json.ObjectWriter();
    jsonDatumWriter.setSchema(mySchema);
    OutputStream out = new FileOutputStream(new File("output.avro"));
    Encoder encoder = EncoderFactory.get().jsonEncoder(mySchema, out);

    //  append a record created by way of a specific mapping
    jsonDatumWriter.write(specificRecord, encoder);

I tried a few things, like replacing my schema with the one returned in the exception message,
but to no avail (apart from whitespace and line feeds they are identical).
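For comparison, here is a sketch of the plain DatumWriter route that sidesteps Json.ObjectWriter entirely; it uses only GenericDatumWriter and the JSON encoder, which validate against the user's own schema. The inlined one-field schema is a hypothetical cut-down stand-in for my real Torperf schema, and the record is built generically where the real converter would supply a SpecificRecord:

```java
import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;

public class JsonEncoderSketch {
    public static void main(String[] args) throws Exception {
        // minimal stand-in for myJsonSchema.avsc (the real schema is much larger)
        String avsc = "{\"type\":\"record\",\"name\":\"Torperf\","
            + "\"namespace\":\"converTor.torperf\",\"fields\":["
            + "{\"name\":\"descriptor_type\",\"type\":\"string\"}]}";
        Schema schema = new Schema.Parser().parse(avsc);

        // a record conforming to that schema; in the real converter this
        // would be the SpecificRecord produced by the mapping
        GenericData.Record record = new GenericData.Record(schema);
        record.put("descriptor_type", "torperf 1.0");

        // GenericDatumWriter validates against *our* schema, not against
        // Avro's built-in Json.avsc, so there is no "Not the Json schema"
        DatumWriter<GenericData.Record> writer = new GenericDatumWriter<>(schema);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Encoder encoder = EncoderFactory.get().jsonEncoder(schema, out);
        writer.write(record, encoder);
        encoder.flush();

        System.out.println(out.toString("UTF-8"));
    }
}
```

This writes the record in Avro's JSON encoding (here: a JSON object with one `descriptor_type` field) without ever touching Json.ObjectWriter.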
   
   
From staring at org.apache.avro.data.Json it seems to me that Avro is checking my record schema
against its own schema of a JSON record (line 171) for equality (line 222).

    36  public static final Schema SCHEMA;

    171 SCHEMA = Schema.parse(Json.class.getResourceAsStream("/org/apache/avro/data/Json.avsc"));

    221 public void setSchema(Schema schema) {
    222   if(!Json.SCHEMA.equals(schema)) {
    223     throw new RuntimeException("Not the Json schema: " + schema);
    224   }
    225 }

The referenced Json.avsc is short:

    {"type": "record", "name": "Json", "namespace":"org.apache.avro.data",
     "fields": [
         {"name": "value",
          "type": [
              "long",
              "double",
              "string",
              "boolean",
              "null",
              {"type": "array", "items": "Json"},
              {"type": "map", "values": "Json"}
          ]
         }
     ]
    }
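Judging from that schema, Json.ObjectWriter seems meant for free-form JSON values (maps, lists, primitives) rather than records of a user-defined schema, which would explain why setSchema() rejects everything except Json.SCHEMA. A hedged sketch of the usage it apparently expects, under the (unverified) assumption that write() accepts plain Java maps, lists and primitives:

```java
import java.io.ByteArrayOutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.avro.data.Json;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;

public class ObjectWriterSketch {
    public static void main(String[] args) throws Exception {
        // ObjectWriter serializes arbitrary JSON values against Avro's own
        // Json.avsc -- so the only schema setSchema() will take is Json.SCHEMA
        Json.ObjectWriter writer = new Json.ObjectWriter();
        writer.setSchema(Json.SCHEMA);

        // a free-form JSON value: nested maps/lists/primitives, no user schema
        Map<String, Object> value = new LinkedHashMap<>();
        value.put("descriptor_type", "torperf 1.0");
        value.put("circ_id", 42L);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(value, encoder);
        encoder.flush();
    }
}
```

If that reading is right, ObjectWriter is simply the wrong tool for writing records of my Torperf schema.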

`equals` is implemented in org.apache.avro.Schema.class, line 256:

      public boolean equals(Object o) {
        if (o == this) {
          return true;
        } else if (!(o instanceof Schema)) {
          return false;
        } else {
          Schema that = (Schema) o;
          return this.type == that.type
              && this.equalCachedHash(that)
              && this.props.equals(that.props);
        }
      }

I don't fully understand what's going on in the third check (especially equalCachedHash()),
but I recognize only checks for equality in a trivial, structural sense, and obviously my schema
doesn't model the JSON data model itself.
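If my reading is right, a small check (using a hypothetical cut-down stand-in for my schema) would demonstrate that the equality is structural, so any user schema is bound to fail the Json.SCHEMA comparison:

```java
import org.apache.avro.Schema;
import org.apache.avro.data.Json;

public class SchemaEqualsSketch {
    public static void main(String[] args) {
        // hypothetical one-field stand-in for my much larger Torperf schema
        String avsc = "{\"type\":\"record\",\"name\":\"Torperf\","
            + "\"namespace\":\"converTor.torperf\",\"fields\":["
            + "{\"name\":\"descriptor_type\",\"type\":\"string\"}]}";

        // two independent parses of the same definition compare equal ...
        Schema a = new Schema.Parser().parse(avsc);
        Schema b = new Schema.Parser().parse(avsc);
        System.out.println(a.equals(b));           // structural equality

        // ... but a user schema never equals Avro's built-in Json schema
        System.out.println(Json.SCHEMA.equals(a));
    }
}
```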

Also, I can't find any examples of or notes about the usage of Avro's Json.ObjectWriter on the InterWebs.
I wonder if I should go with the deprecated Json.Writer instead. Not that it's much better
documented, but there are at least a few code snippets online to learn and glean from.

The full source is available at https://github.com/tomlurge/converTor


> On Wed, Feb 24, 2016 at 7:46 AM, tl <tl@rat.io> wrote:
> Hi again,
> 
> I still haven’t found a solution to this problem. Does this look like some beginners
> Java mistake (because that may well be…)? Is it okay to ask the same question on stackoverflow
> or would that count as crossposting/spamming?
> 
> Cheers,
> Thomas
> 
> 
> > On 23.02.2016, at 02:22, tl <tl@rat.io> wrote:
> >
> > Hi,
> >
> >
> > I want to convert incoming data to Avro and JSON (and later Parquet). Avro conversion
> > is working okay, but JSON conversion throws the following error that I don’t understand:
> >
> > Exception in thread "main" java.lang.RuntimeException: Not the Json schema: {"type":"record","name":"Torperf","namespace":"converTor.torperf","fields":[{"name":"descriptor_type","type":"string","default":"torperf 1.0"},
> > [ … omitted for brevity …]
> > {"name":"circ_id","type":["null","int"],"doc":"metrics-lib/TorperfResult: int getCircId()"},{"name":"used_by","type":["null","int"],"doc":"metrics-lib/TorperfResult: int getUsedBy()"}],"aliases":["torperfResult"]}
> >       at org.apache.avro.data.Json$ObjectWriter.setSchema(Json.java:117)
> >       at converTor.WriterObject.<init>(WriterObject.java:116)
> >       at converTor.TypeWriter.get(TypeWriter.java:31)
> >       at converTor.ConverTor.main(ConverTor.java:249)
> >       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >       at java.lang.reflect.Method.invoke(Method.java:606)
> >       at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
> >
> > … but that schema is indeed the schema that I want to use.
> >
> >
> > This is a snippet of my code:
> >
> > File schemaFile = new File("schema/jsonSchema.avsc");
> > Schema.Parser parser = new Schema.Parser();
> > Schema mySchema = parser.parse(schemaFile) ;
> >
> > Json.ObjectWriter jsonDatumWriter = new Json.ObjectWriter();
> > jsonDatumWriter.setSchema(mySchema);
> > OutputStream out = new FileOutputStream(outputFile);
> > Encoder encoder = EncoderFactory.get().jsonEncoder(mySchema, out);
> >
> >
> > Can somebody give me a hint?
> >
> >
> > Thanks,
> > Thomas
> 
> 
> 
> 
> 
> 
> 
> --
> Cheers,
> Praj

< he not busy being born is busy dying >