beam-commits mailing list archives

From "Etienne Chauchot (JIRA)" <>
Subject [jira] [Commented] (BEAM-2993) AvroIO.write without specifying a schema
Date Mon, 02 Oct 2017 07:49:01 GMT


Etienne Chauchot commented on BEAM-2993:

Hi [~jkff], thanks for the pointer on debugging serialization issues. Indeed, I forgot to declare
{{GenericRecordAvroDestinations}} as static, oops :)
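
For readers hitting the same problem, here is a minimal sketch of why a missing {{static}} breaks serialization. It uses plain Java serialization only and no Beam code. The class names ({{InnerClassSerializationDemo}}, {{InnerDestinations}}, {{StaticDestinations}}) and the {{canSerialize}} helper are hypothetical stand-ins, not the actual classes from the patch. The assumption is that the enclosing test class is not {{Serializable}}, which is the usual situation for a unit test class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for an enclosing test class; deliberately NOT Serializable.
public class InnerClassSerializationDemo {

    String label = "demo";

    // Non-static inner class: each instance keeps a hidden reference to the
    // enclosing InnerClassSerializationDemo instance, so serializing it also
    // tries to serialize the (non-serializable) outer object.
    class InnerDestinations implements Serializable {
        private static final long serialVersionUID = 1L;

        String describe() {
            return label; // uses the enclosing instance
        }
    }

    // Static nested class: no hidden reference to an enclosing instance.
    static class StaticDestinations implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Returns true if Java serialization of the object succeeds,
    // false if it fails with NotSerializableException.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        InnerClassSerializationDemo outer = new InnerClassSerializationDemo();
        System.out.println("inner:  " + canSerialize(outer.new InnerDestinations()));
        System.out.println("static: " + canSerialize(new StaticDestinations()));
    }
}
```

In this sketch the inner-class instance fails to serialize because it drags the enclosing object along, while the static nested one succeeds. Declaring the helper class static removes that hidden dependency.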

The unit test does not illustrate the use case. If it did, it would be very weird to have
the schema and not use it in the write part. The unit test only exercises the schemaless *write*
part, pretending the schema is unknown. The input data is created as simply as possible (with
a known schema). I do not test the schemaless read because it is already tested elsewhere, but in
a concrete use case we would do a schemaless read, transformations, and a schemaless write.

> AvroIO.write without specifying a schema
> ----------------------------------------
>                 Key: BEAM-2993
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Etienne Chauchot
>            Assignee: Etienne Chauchot
> Similarly to, we should be able to write
to Avro files using {{AvroIO}} without specifying a schema at build time. Consider the following
use case: a user has a {{PCollection<GenericRecord>}} but the schema is only known
while running the pipeline. {{AvroIO.writeGenericRecords}} needs the schema, but the schema
is already available in {{GenericRecord}}. We should be able to call {{AvroIO.writeGenericRecords()}}
with no schema.

