metamodel-dev mailing list archives

From Kasper Sørensen <i.am.kasper.soren...@gmail.com>
Subject Re: Checking compliance to a schema with metamodel
Date Mon, 13 Jun 2016 04:04:38 GMT
Hi Juan,

I've been reading your mail a few times and going backwards and
forwards in my mind about what I think about it :-) I like the idea
that you describe, but when trying to apply it to MetaModel I am
struggling to find the right way of generalizing the issue so that it
makes sense within the goal of MetaModel: to unify the way you
interact with many different types of datastores. To some extent I
totally see the validity of your idea in MetaModel, because we have a
whole range of supported NoSQL / schemaless datastores such as JSON,
MongoDB, CouchDB, Cassandra, HBase and so on. So being able to ask
MetaModel whether a particular "schemaless record" would fit into
another table (one with either an "inferred" or a constrained schema)
would be a valuable thing to do.

It's certainly not something we can offer today though. In fact we
generalize the schemaless structures such that they appear like
regular constrained records. We would probably need at least a set of
interfaces for schemaless tables and rows, so that we can interrogate
them further when validating whether they fit into another table.
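
To make the idea concrete, here is a minimal sketch in plain Java of what
such a fitness check could look like. Nothing in it is MetaModel API:
SchemalessRow, ConstrainedTable and violations are hypothetical names
invented for illustration, and the schemaless record is assumed to be
already parsed into a flat Map, so nested JSON is not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemalessFitSketch {

    /** Hypothetical: a record whose fields are only known once it is read. */
    interface SchemalessRow {
        Map<String, Object> fields();
    }

    /** Hypothetical: the constrained side, column name -> expected Java type. */
    interface ConstrainedTable {
        Map<String, Class<?>> columns();
    }

    /** Returns one message per problem; an empty list means the row fits. */
    static List<String> violations(SchemalessRow row, ConstrainedTable table) {
        List<String> problems = new ArrayList<>();
        Map<String, Object> fields = row.fields();
        for (Map.Entry<String, Class<?>> column : table.columns().entrySet()) {
            String name = column.getKey();
            if (!fields.containsKey(name)) {
                problems.add("missing column: " + name);
                continue;
            }
            Object value = fields.get(name);
            // Assumption of this sketch: null fits any column type.
            if (value != null && !column.getValue().isInstance(value)) {
                problems.add("wrong type for " + name + ": "
                        + value.getClass().getSimpleName());
            }
        }
        for (String name : fields.keySet()) {
            if (!table.columns().containsKey(name)) {
                problems.add("unexpected field: " + name);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, Class<?>> columns = new LinkedHashMap<>();
        columns.put("id", Integer.class);
        columns.put("name", String.class);
        ConstrainedTable employees = () -> columns;

        Map<String, Object> good = new LinkedHashMap<>();
        good.put("id", 1);
        good.put("name", "Ada");

        Map<String, Object> bad = new LinkedHashMap<>();
        bad.put("id", "one");
        bad.put("dept", "R&D");

        System.out.println(violations(() -> good, employees));
        System.out.println(violations(() -> bad, employees));
    }
}
```

Returning a list of problems rather than a boolean keeps the reason for
each rejection, which matters if the failures are to be collected.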

Kasper

2016-06-07 8:58 GMT-07:00 Juan Rodríguez Hortalá
<juan.rodriguez.hortala@gmail.com>:
> It would be something along the lines of:
>
> // Wish code: validSchema is the method being proposed, not an existing one
> DataContext hiveDataContext = DataContextFactory.createJDBCDataContext(...);
> Table expectedJsonSchema =
>     hiveDataContext.getDefaultSchema().getTableByName("employessTable");
> DataContext jsonDataContext = new JsonDataContext(expectedJsonSchema);
> for (String jsonRecord : jsonRecords) {
>     if (!jsonDataContext.validSchema(jsonRecord)) {
>         System.out.println("Invalid record " + jsonRecord);
>     }
> }
>
> You can approximate DataContext.validSchema for JsonDataContext by creating
> an InMemoryResource for the JSON string and then comparing the inferred
> schema against the expected schema, but that creates a new Jackson
> JsonParser per JSON record, and I think it only infers the schema for the
> first level of nesting in JSON, as JsonDocumentSource.readValue uses
> _parser.readValueAs(Map.class).
>
> The idea is that when you have schema on read, you may be interested
> in checking which records fit the schema and which don't, and collecting
> the failures. A more useful method would be:
>
> class JsonDataContext  .... {
>     public Stream<Either<Row, String>> validateRecords(Stream<String> records);
> }
>
> with Either as in
> http://www.functionaljava.org/javadoc/4.4/functionaljava/fj/data/Either.html,
> that would split a stream of JSON records into records with the suitable
> schema, and faulty records. This should probably be refined to
> distinguish parsing failures from schema compliance failures. It would
> also have to be generalized to work for the whole DataContext API: for
> example Stream<String> records makes sense for JsonDataContext, but not for
> other DataContexts. It would also be nice to be able to convert from Row to
> the specific format of each DataContext, for example from Row to a JSON
> string for JsonDataContext, so that you could use MetaModel to convert
> between serialization formats. This also starts getting into the field of
> streaming SQL, as in https://calcite.apache.org/docs/stream.html.
>
> These are just some ideas.
>
> Thanks again for taking the time to answer my questions.
>
> Greetings,
>
> Juan
>
>
>
>
> On Mon, Jun 6, 2016 at 8:20 PM, Kasper Sørensen <
> i.am.kasper.sorensen@gmail.com> wrote:
>
>> Or else I'm not really understanding your question at least :) Would
>> be interested if it is something we _could_ offer from MetaModel side
>> that just isn't there yet. What do you have in mind in terms of
>> pseudo/wish code?
>>
>> 2016-06-06 20:17 GMT-07:00 Juan Rodríguez Hortalá
>> <juan.rodriguez.hortala@gmail.com>:
>> > Hi Kasper, thanks for your answer. I understand I could use those tools
>> > to validate a JSON object against an expected schema expressed as a JSON
>> > schema, or as a mapping to a Java POJO. You can use MetaModel to specify
>> > JSON transformations as SQL queries; the idea I had was to use MetaModel
>> > to specify JSON validations as SQL table schemas. But using JSON schema
>> > looks like the simplest solution here.
>> >
>> > Thanks again for your help.
>> >
>> > Greetings,
>> >
>> > Juan
>> >
>> > On Sat, Jun 4, 2016 at 9:10 AM, Kasper Sørensen <
>> > i.am.kasper.sorensen@gmail.com> wrote:
>> >
>> >> I'm not sure MetaModel is the right tool for the job in your case,
>> >> Juan. I might be wrong and not seeing the light here.
>> >> In cases where I've needed to do JSON validation, I've used (and can
>> >> thus far recommend using) Jackson and Hibernate Validator.
>> >>
>> >> 2016-06-02 12:37 GMT-07:00 Juan Rodríguez Hortalá
>> >> <juan.rodriguez.hortala@gmail.com>:
>> >> > Hi,
>> >> >
>> >> > I have to check that some JSON objects have a certain shape. I was
>> >> > considering specifying the schema as a Table object, and then using
>> >> > JsonDataContext for implementing this check. The idea is to define a
>> >> > method that, given a Table and a String for a JSON object, returns a
>> >> > boolean saying whether the JSON is compliant with the schema or not.
>> >> > Can that be easily implemented with MetaModel?
>> >> >
>> >> > Thanks in advance.
>> >> >
>> >> > Greetings,
>> >> >
>> >> > Juan
>> >>
>>
