flink-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: Tuples serialization
Date Fri, 24 Apr 2015 07:54:32 GMT
I managed to read and write Avro files, but I still have two doubts:

Which size should I use for BLOCK_SIZE_PARAMETER_KEY?
Do I really have to create a sample tuple to extract the TypeInformation
needed to instantiate the TypeSerializerInputFormat?
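
The two questions above can be answered with a sketch. What follows is a minimal, untested example against the Flink Java batch API; constructor signatures for TypeSerializerInputFormat have varied across Flink versions, and the output path is hypothetical. The key point is that the TypeInformation can be built directly with TupleTypeInfo, so no sample tuple is needed, and the block size can be left at its default as long as the same setting is used on the read side.

```java
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.TypeSerializerInputFormat;
import org.apache.flink.api.java.io.TypeSerializerOutputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.typeutils.TupleTypeInfo;
import org.apache.flink.core.fs.Path;

import java.util.List;

public class BinaryRoundTrip {
  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    // Build the TypeInformation by hand -- no sample tuple required.
    TypeInformation<Tuple2<String, byte[]>> typeInfo =
        new TupleTypeInfo<>(
            BasicTypeInfo.STRING_TYPE_INFO,
            PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO);

    DataSet<Tuple2<String, byte[]>> data =
        env.fromElements(new Tuple2<>("key", new byte[] {1, 2, 3}));

    // Write with Flink's own serializer. With parallelism > 1 this produces
    // a directory containing one file per parallel writer. The block size is
    // left at its default here; whatever value is used when writing must
    // match on the read side.
    TypeSerializerOutputFormat<Tuple2<String, byte[]>> out =
        new TypeSerializerOutputFormat<>();
    out.setSerializer(typeInfo.createSerializer(env.getConfig()));
    data.write(out, "/tmp/flink-binary"); // hypothetical path
    env.execute("write binary files");

    // Read back: pointing the input format at the directory picks up all
    // the files inside it in one go.
    TypeSerializerInputFormat<Tuple2<String, byte[]>> in =
        new TypeSerializerInputFormat<>(typeInfo);
    in.setFilePath(new Path("/tmp/flink-binary"));
    List<Tuple2<String, byte[]>> readBack =
        env.createInput(in, typeInfo).collect();
    System.out.println(readBack.size());
  }
}
```

Note that this writes Flink's internal binary format, not Avro: TypeSerializerOutputFormat is an alternative to Avro when the files only need to be read back by Flink itself.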

On Thu, Apr 23, 2015 at 7:04 PM, Flavio Pompermaier <pompermaier@okkam.it> wrote:

> I've searched within Flink for a working example of TypeSerializerOutputFormat
> usage but I didn't find anything usable.
> Could you show me a simple snippet of code?
> Do I have to configure BinaryInputFormat.BLOCK_SIZE_PARAMETER_KEY? Which
> size should I use? Will Flink write a single file or a set of Avro files
> in a directory?
> Is it possible to read all files in a directory at once?
> On Thu, Apr 23, 2015 at 12:16 PM, Fabian Hueske <fhueske@gmail.com> wrote:
>> Have you tried the TypeSerializerOutputFormat?
>> This will serialize data using Flink's own serializers and write it to
>> binary files.
>> The data can be read back using the TypeSerializerInputFormat.
>> Cheers, Fabian
>> 2015-04-23 11:14 GMT+02:00 Flavio Pompermaier <pompermaier@okkam.it>:
>>> Hi to all,
>>> in my use case I'd like to persist batches of Tuple2<String, byte[]>
>>> within a directory.
>>> What is the most efficient way to achieve that in Flink?
>>> I was thinking of using Avro, but I can't find an example of how to do
>>> that.
>>> Once generated, how can I (re)generate a DataSet<Tuple2<String, byte[]>>
>>> from it?
>>> Best,
>>> Flavio
