flink-user mailing list archives

From Ufuk Celebi <...@apache.org>
Subject Re: Scala Code Generation
Date Wed, 14 Oct 2015 09:30:14 GMT

> On 13 Oct 2015, at 16:06, schultze@informatik.hu-berlin.de wrote:
> 
> Hello,
> 
> I am currently working on a compilation unit translating AsterixDB's AQL
> into runnable Scala code for Flink's Scala API. During code generation I
> discovered some things that are quite hard to work around. I am still
> working with Flink version 0.8, so some of the problems I have might
> already be fixed in 0.9 and if so please tell me.
> 
> First, whenever a record gets projected down to only a single field (e.g.
> by a map or reduce function) it is no longer considered a record, but a
> variable of the type of that field. If afterwards I want to apply
> additional functions like .sum(0) I get an error message like

A workaround is to return a Tuple1<X> instead of the bare value. You can then run the
aggregation on field 0. I think that the Tuple0 class was added after 0.8, though.

> "Aggregating on field positions is only possible on tuple data types."
> 
> This is the same for all functions (like write or join) as the "record" is
> no longer considered a dataset.

What do you mean? At least in the current versions, the join projections return a Tuple type
as well.

> Second, I found that records longer than 22 fields are not supported.
> Whenever I have a record that is longer than that I receive a build error
> as

Flink’s Java Tuple classes go up to Tuple25, while Scala tuples stop at Tuple22. You can
work around this by using a custom POJO type, e.g.

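// Flink treats a class as a POJO if it is public, has a public no-argument
// constructor, and its fields are public (or have getters and setters).
public class TPCHRecord {
    public int f0;
    ...
    public int f99;

    public TPCHRecord() {}
}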
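
To make the POJO requirements concrete, here is a minimal, self-contained sketch that does not depend on Flink. WideRecord, NotAPojo and looksLikePojo are hypothetical names made up for illustration. The check is a simplified approximation of the rules as I understand them (public class, public no-argument constructor, public fields); it is stricter than Flink, which also accepts private fields with getters and setters.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class PojoSketch {

    // Hypothetical record with three columns. A code generator would emit
    // one public field per column of the wide record.
    // Static nested only to keep the sketch in a single file.
    public static class WideRecord {
        public int f0;
        public int f1;
        public String f2;

        // Public no-argument constructor, needed to instantiate the type reflectively.
        public WideRecord() {}

        public WideRecord(int f0, int f1, String f2) {
            this.f0 = f0;
            this.f1 = f1;
            this.f2 = f2;
        }
    }

    // Deliberately breaks the rules: private field and no no-argument constructor.
    public static class NotAPojo {
        private final int hidden;

        public NotAPojo(int hidden) {
            this.hidden = hidden;
        }
    }

    // Hypothetical helper, NOT part of Flink: a simplified check of the POJO rules.
    // Assumption: getters/setters are ignored, so non-public fields always fail here.
    static boolean looksLikePojo(Class<?> c) {
        if (!Modifier.isPublic(c.getModifiers())) {
            return false;
        }
        try {
            c.getConstructor(); // only finds a public no-argument constructor
        } catch (NoSuchMethodException e) {
            return false;
        }
        for (Field f : c.getDeclaredFields()) {
            int m = f.getModifiers();
            if (Modifier.isStatic(m) || Modifier.isTransient(m) || f.isSynthetic()) {
                continue; // not part of the record's data
            }
            if (!Modifier.isPublic(m)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikePojo(WideRecord.class)); // true
        System.out.println(looksLikePojo(NotAPojo.class));   // false
    }
}
```

A generator for wide records could run a check like this on its output before handing the class to Flink, so that a missing constructor shows up early rather than as a type extraction problem at job submission.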

If possible, I would suggest updating to the latest 0.9 release or the upcoming 0.10
release. A lot has been fixed since 0.8, so I think it will be worth it. If you encounter
any problems while doing this, feel free to ask here. :)

– Ufuk