streams-dev mailing list archives

From Ryan Ebanks <>
Subject Re: Source and Resource generation from jsonschemas
Date Mon, 25 Apr 2016 16:29:55 GMT
I think being able to generate case classes from json schema is valuable.
However, there are already projects that attempt to do this. See this Stack
Overflow question/answer.

What will streams do that will be better/different than these projects?

On Thu, Apr 21, 2016 at 12:13 PM, Steve Blackmon <> wrote:

> tl;dr We should build a suite of Maven plugins to generate new categories
> of source and resource artifacts. For starters, we need our own jsonschema
> to Java POJO plugin.
>
> For a while I’ve been working on stories to add the ability to generate
> new types of sources and resources from jsonschemas, including the activity
> streams schemas maintained by the project.
>
>    1. STREAMS-389
>    Support generation of scala source from jsonschemas
>    2. STREAMS-398
>    Support generation of hive table definitions from jsonschema
>
> I've gotten pretty deep into this and believe strongly at this point that
> diversifying the types of artifacts our project can generate off schemas
> will add a powerful and valuable set of use cases. There’s a lot of
> work being done in Spark and Flink to enable, simplify, and optimize
> working with data when quality POJOs and Scala case classes are available
> on the class path.
>
> There is a series of other popular big data technologies where having an
> explicit definition of object structure makes working with data easier
> (Hadoop, Pig, Elasticsearch, Kafka, just to name a few). Making it simple
> to generate those artifacts with CLIs or Maven plugins from in-house
> schemas, mixing in schemas from streams providers and processors or schemas
> linked externally on the web, could be the killer app streams has been
> missing.
>
> To really pursue this, it makes sense that we would build up core utilities
> for resolving and managing the object types defined and referenced across
> groups of schemas and external dependencies. To date we've relied entirely
> on org.jsonschema2pojo:jsonschema2pojo and
> org.jsonschema2pojo:jsonschema2pojo-maven-plugin to handle this conversion
> of schemas to POJOs. I think we need to bring that core capability in-house
> to have full control of its behavior and output.
>
> Questions for the list:
> - Does this challenge resonate with you / your organization?
> - Do you have any concern about shifting project attention toward plugins
>   and tools for data definition?
> - Are you comfortable / uncomfortable with seeing the core streams POJOs
>   used throughout our providers and processors change as part of this
>   effort?
>
> Steve Blackmon
