metamodel-dev mailing list archives

From Juan Rodríguez Hortalá <juan.rodriguez.hort...@gmail.com>
Subject Re: Checking compliance to a schema with metamodel
Date Tue, 07 Jun 2016 15:58:41 GMT
It would be something along the lines of:

// Wish code: validSchema(String) is the method I would like to have
DataContext hiveDataContext = DataContextFactory.createJDBCDataContext(...);
Table expectedJsonSchema =
    hiveDataContext.getDefaultSchema().getTableByName("employessTable");
DataContext jsonDataContext = new JsonDataContext(expectedJsonSchema);
for (String jsonRecord : jsonRecords) {
    if (!jsonDataContext.validSchema(jsonRecord)) {
        System.out.println("Invalid record " + jsonRecord);
    }
}

You can approximate DataContext.validSchema for JsonDataContext by creating
an InMemoryResource for the JSON string and then comparing the inferred
schema against the expected schema, but that creates a new Jackson
JsonParser per JSON record, and I think it only infers the schema for the
first level of nesting in JSON, as JsonDocumentSource.readValue uses
_parser.readValueAs(Map.class).
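
To make the first-level limitation concrete, here is a rough sketch that uses only the JDK. It assumes the record has already been parsed into a Map, which is the kind of value readValueAs(Map.class) returns. FirstLevelSchemaCheck and matchesColumns are names made up for this example; they are not part of MetaModel.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only, not a MetaModel class.
final class FirstLevelSchemaCheck {

    // True when the record's top-level keys are exactly the expected column names.
    // Values are not inspected, so nested objects are treated as opaque.
    static boolean matchesColumns(Set<String> expectedColumns, Map<String, ?> record) {
        return record.keySet().equals(expectedColumns);
    }

    public static void main(String[] args) {
        Set<String> expected = new LinkedHashSet<>(Arrays.asList("id", "name", "address"));

        // A nested object whose content the check never looks at.
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("unexpectedField", 42);

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 1);
        record.put("name", "Ada");
        record.put("address", address);

        System.out.println(matchesColumns(expected, record));
    }
}
```

The record is accepted even though the nested address object contains a field nobody asked for, because only the top-level keys are compared.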

The idea is that when you have schema on read you may be interested in
checking which records fit the schema and which don't, and in collecting
the failures. A more useful method would be

class JsonDataContext  .... {
    public Stream<Either<Row, String>> validateRecords(Stream<String> records)
}

where Either is
http://www.functionaljava.org/javadoc/4.4/functionaljava/fj/data/Either.html.
That would split a stream of JSON records into records that fit the schema
and faulty records. It should probably also distinguish parsing failures
from schema compliance failures. It would also have to be generalized to
work for the whole DataContext API: for example, Stream<String> records
makes sense for JsonDataContext, but not for other DataContexts. It would
also be nice to be able to convert from Row to the specific format of each
DataContext, for example from Row to a JSON string for JsonDataContext;
this way you could use metamodel to convert between serialization formats.
This also starts getting into the field of streaming SQL, as in
https://calcite.apache.org/docs/stream.html.
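
A rough JDK-only sketch of the splitting idea, under some assumptions: Either here is a minimal stand-in written for this example (not fj.data.Either), RecordValidation is a made-up class name, and toRow is a hypothetical function that returns the parsed row, or empty when the record does not fit the schema. It does not distinguish parsing failures from schema failures.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Minimal stand-in for an Either type, written for this sketch.
final class Either<L, R> {
    private final L left;
    private final R right;
    private final boolean isLeft;

    private Either(L left, R right, boolean isLeft) {
        this.left = left;
        this.right = right;
        this.isLeft = isLeft;
    }

    static <L, R> Either<L, R> left(L value) { return new Either<>(value, null, true); }
    static <L, R> Either<L, R> right(R value) { return new Either<>(null, value, false); }

    boolean isLeft() { return isLeft; }
    L left() { return left; }
    R right() { return right; }
}

// Hypothetical class for illustration only, not a MetaModel class.
final class RecordValidation {

    // Left holds the row for a record that fits; right holds the raw faulty record.
    static <T> Stream<Either<T, String>> validateRecords(
            Stream<String> records, Function<String, Optional<T>> toRow) {
        return records.map(record -> toRow.apply(record)
                .map(row -> Either.<T, String>left(row))
                .orElseGet(() -> Either.right(record)));
    }

    public static void main(String[] args) {
        // Toy "schema": a record is valid when it parses as an integer.
        Function<String, Optional<Integer>> toRow = s -> {
            try {
                return Optional.of(Integer.parseInt(s));
            } catch (NumberFormatException e) {
                return Optional.empty();
            }
        };
        List<Either<Integer, String>> results =
                validateRecords(Stream.of("1", "oops", "3"), toRow).collect(Collectors.toList());
        for (Either<Integer, String> r : results) {
            System.out.println(r.isLeft() ? "row " + r.left() : "faulty " + r.right());
        }
    }
}
```

Keeping the faulty record as the original string means the failures can be collected and inspected later without losing any input.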

These are just some ideas.

Thanks again for taking the time to answer my questions.

Greetings,

Juan




On Mon, Jun 6, 2016 at 8:20 PM, Kasper Sørensen <
i.am.kasper.sorensen@gmail.com> wrote:

> Or else I'm not really understanding your question at least :) Would
> be interested if it is something we _could_ offer from MetaModel side
> that just isn't there yet. What do you have in mind in terms of
> pseudo/wish code?
>
> 2016-06-06 20:17 GMT-07:00 Juan Rodríguez Hortalá
> <juan.rodriguez.hortala@gmail.com>:
> > Hi Kasper, thanks for your answer. I understand I could use those tools to
> > validate a JSON object against an expected schema expressed as a JSON
> > schema, or as a mapping to a java POJO. You can use metamodel to specify
> > JSON transformation as SQL queries, the idea I had was using metamodel to
> > specify JSON validations as SQL table schemas. But using JSON schema looks
> > like the simplest solution here.
> >
> > Thanks again for your help.
> >
> > Greetings,
> >
> > Juan
> >
> > On Sat, Jun 4, 2016 at 9:10 AM, Kasper Sørensen <
> > i.am.kasper.sorensen@gmail.com> wrote:
> >
> >> I'm not sure MetaModel is the right tool for the job in your case,
> >> Juan. I might be wrong and not seeing the light here.
> >> In cases where I've needed to do JSON validation, I've used (and can
> >> thus far recommend using) Jackson and Hibernate Validator.
> >>
> >> 2016-06-02 12:37 GMT-07:00 Juan Rodríguez Hortalá
> >> <juan.rodriguez.hortala@gmail.com>:
> >> > Hi,
> >> >
> >> > I have to check that some JSON objects have a certain shape. I was
> >> > considering specifying the schema as a Table object, and then using
> >> > JsonDataContext for implementing this check. The idea is defining a method
> >> > that given a Table and a String for a JSON object, returns a boolean saying
> >> > whether the JSON is compliant with the schema or not. Can that be easily
> >> > implemented with metamodel?
> >> >
> >> > Thanks in advance.
> >> >
> >> > Greetings,
> >> >
> >> > Juan
> >>
>
