avro-user mailing list archives

From "Joseph P." <joseph.pac...@gmail.com>
Subject Re: Specify non-empty array, map, etc.
Date Thu, 11 May 2017 15:35:05 GMT
Hi

You can add custom props (extra attributes) to your Avro schema.

Here we have added our own custom props, plus an extra processing step that
runs before generating the Avro binary to make sure these props are respected.

Pros: very flexible (we have added max_length on strings, temporal_format,
and so forth).
Cons: you must make sure your extra processing runs before generating the
Avro binaries, because Avro itself does not enforce custom props.

For example, in your case you could add a prop "nonEmpty" with a default
value of false.
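
As an illustration only, the "info" field from the schema quoted below might
carry the prop like this. The prop name "nonEmpty" and the choice of a string
value ("true" rather than a JSON boolean) are assumptions made for this
sketch; Avro stores such extra attributes as metadata and ignores them when
serializing.

```json
{
  "name": "info",
  "type": {"type": "array", "items": "Information"},
  "nonEmpty": "true"
}
```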

Then, before converting the Avro JSON/POJO to Avro binary, you use your own
datum writer (extending SpecificDatumWriter). In its writeField you check
whether the prop is present and what its value is, and if it is true you
check that the field's value is non-empty.
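
A minimal sketch of the check itself, kept free of Avro dependencies so it can
be run on its own. The class name NonEmptyCheck, the method check, and the
string-valued "nonEmpty" prop are all hypothetical names chosen for this
example, not part of Avro. It assumes an Avro array arrives as a
java.util.Collection and an Avro map as a java.util.Map.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;

/** Hypothetical helper: enforces a custom "nonEmpty" prop on one field value. */
final class NonEmptyCheck {

    /**
     * @param fieldName name of the field, used only in the error message
     * @param propValue raw value of the "nonEmpty" prop, or null if absent
     * @param value     runtime value of the field (Collection for an array,
     *                  Map for a map; anything else is left unchecked)
     */
    public static void check(String fieldName, String propValue, Object value) {
        // An absent prop behaves like the default "false": nothing to enforce.
        if (!Boolean.parseBoolean(propValue)) {
            return;
        }
        boolean empty =
            (value instanceof Collection && ((Collection<?>) value).isEmpty())
                || (value instanceof Map && ((Map<?, ?>) value).isEmpty());
        if (empty) {
            // Fail before any bytes are written for this field.
            throw new IllegalArgumentException(
                "Field '" + fieldName + "' is marked nonEmpty but is empty");
        }
    }

    public static void main(String[] args) {
        check("info", "true", Arrays.asList("a"));      // accepted
        check("info", null, Collections.emptyList());   // accepted: prop absent
        try {
            check("info", "true", Collections.emptyList());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In a writer that extends SpecificDatumWriter, the overridden writeField would
look up the prop on the field (for example with f.getProp("nonEmpty"), which
to my knowledge returns only string-valued props), call a check like the one
above on the field's value, and then delegate to super.writeField. The exact
writeField signature should be verified against the Avro version in use.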

Cheers


On Wed, May 10, 2017 at 10:41 AM, Tianxiang Xiong <
tianxiang.xiong@fundingcircle.com> wrote:

> Thanks Suraj, but that's not what I mean.
>
> For your second schema, it is possible to pass in an empty array `[]`
> containing no elements. I would like to prevent that.
>
> On 8 May 2017 at 19:32, Suraj Acharya <suraj@apache.org> wrote:
>
>> This is what I have done in my application:
>>
>> {"name": "clients", "type": [ {"type": "array", "items": "Client"}, "null" ]}
>>
>> This allows me to pass null. What you can try is something like this:
>>
>> {"name": "info", "type": { "type": "array", "items": "Information" }}
>>
>> In this example, info is something that needs to be passed for every
>> client.
>>
>> Hope that helps.
>>
>>
>> On Fri, May 5, 2017 at 9:51 PM, Tianxiang Xiong <
>> tianxiang.xiong@fundingcircle.com> wrote:
>>
>>> In Avro 1.7.7, is there a way to specify a *non-empty* array, map,
>>> etc.? There doesn't seem to be according to the spec
>>> <https://avro.apache.org/docs/1.7.7/spec.html#Maps>.
>>>
>>> There are applications in which we mandate that a data format has a
>>> non-empty array. It'd be nice if that could be expressed in the schema so
>>> data with empty arrays fail to serialize (and are thus never put on a
>>> Kafka topic). Fail earlier > fail later.
>>>
>>> Thanks,
>>>
>>> TX
>>>
>>
>>
>
>
> --
>
> *Tianxiang Xiong*
>
> *tianxiang.xiong@fundingcircle.com*
>
> 747 Front Street, Floor 4 | San Francisco, CA 94111
>
