kafka-jira mailing list archives

From "Edvard Poliakov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KAFKA-6002) Kafka Connect Transform transforming JSON string into actual object
Date Mon, 02 Oct 2017 18:09:00 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edvard Poliakov updated KAFKA-6002:
-----------------------------------
    Description: 
My colleague and I have been working on a new Transform that takes a field containing a JSON
string and parses it into an actual object, like this:

{code} 
{
  "a" : "{\"b\": 23}"
}
{code}
into
{code}
{
  "a" : {
       "b" : 23
  }
}
{code}
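
A minimal sketch of the value-level behavior shown above, written in Python purely for illustration (an actual Connect transform would be a Java class). The function name `expand_json_field` and the policy of leaving non-JSON strings untouched are assumptions, not part of Kafka Connect or of the proposed transform:

```python
import json


def expand_json_field(record, field):
    """Return a copy of `record` with the JSON string in `field` parsed into an object."""
    value = record.get(field)
    if not isinstance(value, str):
        # Field is missing or already structured: nothing to expand.
        return dict(record)
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        # Assumed policy: a string that is not valid JSON is passed through unchanged.
        return dict(record)
    expanded = dict(record)
    expanded[field] = parsed
    return expanded
```

Calling `expand_json_field({"a": "{\"b\": 23}"}, "a")` yields the nested form from the example. This covers only the value; it says nothing about the Schema that Connect would attach to the result.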

There is no robust way to build a Schema from a JSON value alone, because the value can be
something like an empty array or a null, which provides no information about the schema of
the object. So I see two options here.
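
To make the limitation concrete, here is a hedged sketch of naive schema inference from a parsed value. The schema shape (plain dicts) and the `"unknown"` placeholder type are invented for this illustration; the other type names are only loosely modeled on Connect's primitive types:

```python
def infer_schema(value):
    """Guess a schema from a single parsed JSON value (illustrative only)."""
    if value is None:
        # A null says nothing about the intended type.
        return {"type": "unknown", "optional": True}
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "int64"}
    if isinstance(value, float):
        return {"type": "float64"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        # An empty array gives no hint about its element type.
        items = infer_schema(value[0]) if value else {"type": "unknown"}
        return {"type": "array", "items": items}
    if isinstance(value, dict):
        return {"type": "struct",
                "fields": {name: infer_schema(v) for name, v in value.items()}}
    raise TypeError(f"unsupported JSON value: {value!r}")
```

For `{"b": 23}` the inference works, but for `[]` or `None` the result contains `"unknown"`, which is exactly the information gap described above.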

1. Have the transform take the schema as a configuration parameter. The problem I found with
this is that it is not clear which JSON schema specification should be used. I assume it
would be reasonable to use http://json-schema.org/, but Kafka Connect does not seem to
support it currently. Moreover, reading through the JsonConverter class in Kafka Connect, I
cannot tell which specification the JSON schema used in that class follows; one example is
the {{asConnectSchema}} method on {{JsonConverter}}, [see here|https://github.com/apache/kafka/blob/trunk/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L415].

2. Update the schema with each object received. However, I can't see a standard and robust
way of handling edge cases.
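
The two options can be contrasted with a small sketch. Everything here is hypothetical: the dict-based schema shape, the `"unknown"` placeholder, and the function names `conforms` and `merge_schemas` are assumptions made for illustration and do not correspond to any Kafka Connect API or to the JSON schema format used by {{JsonConverter}}:

```python
def conforms(value, schema):
    """Option 1: check a parsed value against a schema supplied up front."""
    if value is None:
        return schema.get("optional", False)
    kind = schema["type"]
    if kind == "struct":
        return isinstance(value, dict) and all(
            conforms(value.get(name), sub) for name, sub in schema["fields"].items())
    if kind == "array":
        return isinstance(value, list) and all(
            conforms(item, schema["items"]) for item in value)
    expected = {"int64": int, "float64": float, "string": str, "boolean": bool}[kind]
    if expected is int and isinstance(value, bool):
        return False  # bool is a subclass of int in Python; do not accept it as int64
    return isinstance(value, expected)


def merge_schemas(old, new):
    """Option 2: combine the schema seen so far with one inferred from a new record."""
    if old["type"] == "unknown":
        return new
    if new["type"] == "unknown":
        return old
    if old["type"] != new["type"]:
        # One of the edge cases with no obvious standard answer.
        raise ValueError(f"conflicting types: {old['type']} vs {new['type']}")
    if old["type"] == "struct":
        fields = dict(old["fields"])
        for name, sub in new["fields"].items():
            fields[name] = merge_schemas(fields[name], sub) if name in fields else sub
        return {"type": "struct", "fields": fields}
    if old["type"] == "array":
        return {"type": "array", "items": merge_schemas(old["items"], new["items"])}
    return old
```

With option 1, an empty array is unproblematic because the element type is already declared. With option 2, a previously unknown element type can be filled in by a later record, but a type conflict between records (here raised as `ValueError`) has to be resolved by some policy, and the schema of records already emitted may no longer match.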

I am happy to create a pull request for this transform if we can agree on an approach here.
:)

  was:
My colleague and I have been working on a new Transform, that takes a JSON string and transforms
it into an actual object, like this:

{code} 
{
  "a" : "{\"b\": 23}"
}
{code}
into
{code}
{
  "a" : {
       "b" : 23
  }
}
{code}

There is no robust way of building a Schema from a JSON object itself, as it can be something
like an empty array or a null, that doesn't provide any info on the schema of the object.
So I see two options here.

1. For a transform to take in schema as a transform parameter. The problem I found with this
is that it is not clear what JSON schema specification should be used for this? I assume it
would be reasonable to use http://json-schema.org/, but it doesn't seem that Kafka Connect
supports it currently, moreover reading through JsonConverter class in Kafka Connect, I am
not able to understand what spec does the Json Schema have that is used in that class, for
example {{asConnectSchema}} method on {{JsonConverter}}, .

2. On each object received, keep updating the schema, but I can't see a standard and robust
way of handling edge cases.

I am happy to create a pull request for this transform, if we can agree on something here.
:)


> Kafka Connect Transform transforming JSON string into actual object
> -------------------------------------------------------------------
>
>                 Key: KAFKA-6002
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6002
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Edvard Poliakov
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
