flink-dev mailing list archives

From Chesnay Schepler <chesnay.schep...@fu-berlin.de>
Subject Re: how to split data-sets efficiently?
Date Sun, 27 Jul 2014 11:37:22 GMT
I think this is what Martin is currently doing:

StringIDs --map-> (StringIDs,LongIDs) --map-> LongIDs

He wants to use both the second and the third set, and is asking for a
way to replace the second map operation, since an extra map just for
that seems unnecessary.

I believe the appropriate way would be to use a projection instead of a
map operation. Something like:

DataSet<Tuple2<String, Long>> mapped = stringIDs.map(...);
DataSet<Tuple1<Long>> longIds = mapped.project(1).types(Long.class);

You would end up with a set of Tuple1<Long> rather than plain Long values, though.
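
To make the idea concrete without depending on a Flink build, here is a rough plain-Java analogy using ordinary lists. It is only a sketch: SplitSketch, assignLongIds and projectLongIds are made-up names, not Flink API, and using the position as the Long ID is an assumption for illustration. The point is that the pairs are computed once and the Long IDs are then picked out of them, so both sets remain usable.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Plain-Java analogy only; none of these names are Flink API.
final class SplitSketch {

    // First map: pair every String ID with a Long ID.
    // Assumption for illustration: the Long ID is just the position in the list.
    static List<Map.Entry<String, Long>> assignLongIds(List<String> stringIds) {
        List<Map.Entry<String, Long>> mapped = new ArrayList<>();
        long next = 0L;
        for (String id : stringIds) {
            mapped.add(new SimpleImmutableEntry<>(id, next++));
        }
        return mapped;
    }

    // "Projection": keep only field 1 (the Long ID) of every pair.
    static List<Long> projectLongIds(List<Map.Entry<String, Long>> mapped) {
        List<Long> longIds = new ArrayList<>();
        for (Map.Entry<String, Long> pair : mapped) {
            longIds.add(pair.getValue());
        }
        return longIds;
    }

    public static void main(String[] args) {
        // The pairs are built once and reused for both results.
        List<Map.Entry<String, Long>> mapped = assignLongIds(Arrays.asList("a", "b", "c"));
        List<Long> longIds = projectLongIds(mapped);
        System.out.println(mapped);  // prints [a=0, b=1, c=2]
        System.out.println(longIds); // prints [0, 1, 2]
    }
}
```

In the Flink version the projected set would hold Tuple1 values instead of bare Longs, which is the one difference this analogy hides.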

On 27.7.2014 13:21, Ufuk Celebi wrote:
> Hey Martin,
> On 27 Jul 2014, at 12:56, Martin Neumann <mneumann@spotify.com> wrote:
>> Is there a way to do an operation that allows for more than one output set
>> (basically split a set into 2 sets)? This would reduce the complexity of
>> the code a lot.
> What exactly do you mean by split?
> I am not sure if this is what you want, but you can just apply two
> transformations on the same input data set.
> DataSet<String> input = ...;
> DataSet<String> firstSet = input.map(...);
> DataSet<String> secondSet = input.map(...);
> Does this help?
