flink-user mailing list archives

From "LINZ, Arnaud" <AL...@bouyguestelecom.fr>
Subject RE: Join Stream with big ref table
Date Fri, 13 Nov 2015 08:18:03 GMT

I worked around the problem by no longer using the HiveServer2 JDBC driver to read the
reference table. Despite all the right options being passed to the Statement object, the
driver apparently handles RAM poorly: after converting the table to text format and reading
it directly from HDFS, the job runs without any problem and with plenty of free memory.
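
A minimal sketch of that kind of line-by-line read, not the code actually used here. The
class name RefTableLoader, the two-column key/value layout and the Ctrl-A field delimiter
(Hive's default for text tables) are assumptions. The Reader is a stand-in: on a cluster it
could wrap the stream returned by Hadoop's FileSystem.open(Path).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

class RefTableLoader {
    // Assumed delimiter: Ctrl-A, the default field separator of Hive text tables.
    static final char FIELD_DELIMITER = '\u0001';

    // Streams the source one line at a time, so only the resulting map is kept in memory.
    static Map<String, String> load(Reader source, int expectedRows) throws IOException {
        // Pre-size the map so it is not rehashed repeatedly while loading.
        Map<String, String> table = new HashMap<>((int) (expectedRows / 0.75f) + 1);
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int split = line.indexOf(FIELD_DELIMITER);
                if (split < 0) {
                    continue; // skip malformed lines that contain no delimiter
                }
                table.put(line.substring(0, split), line.substring(split + 1));
            }
        }
        return table;
    }

    public static void main(String[] args) throws IOException {
        String sample = "id1" + FIELD_DELIMITER + "valueA\n"
                      + "id2" + FIELD_DELIMITER + "valueB\n";
        Map<String, String> table = load(new StringReader(sample), 2);
        System.out.println(table.get("id2")); // prints valueB
    }
}
```

Reading this way never materializes a full result set on the client side. Only the map
itself grows with the table.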


From: LINZ, Arnaud
Sent: Thursday, 12 November 2015 17:48
To: 'user@flink.apache.org' <user@flink.apache.org>
Subject: Join Stream with big ref table


I have to enrich a stream with a big reference table (11,000,000 rows). I cannot use “join”
because I cannot window the stream, so in the “open()” function of each mapper I read
the content of the table and put it in a HashMap (stored on the heap).
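
The pattern described above can be sketched in plain Java as follows. EnrichingMapper is a
hypothetical stand-in that only mirrors the open()/map() lifecycle of a rich map function. It
does not use the Flink API, and the loader, key type and output format are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class EnrichingMapper {
    private final Supplier<Map<String, String>> loader;
    private Map<String, String> refTable; // one full copy per mapper instance

    EnrichingMapper(Supplier<Map<String, String>> loader) {
        this.loader = loader;
    }

    // Called once before any record is processed: load the whole table on the heap.
    void open() {
        refTable = loader.get();
    }

    // Called per record: append the reference value, or an empty field when no match exists.
    String map(String id) {
        String ref = refTable.get(id);
        return id + "," + (ref == null ? "" : ref);
    }

    public static void main(String[] args) {
        Map<String, String> sample = new HashMap<>();
        sample.put("42", "gold");
        EnrichingMapper mapper = new EnrichingMapper(() -> sample);
        mapper.open();
        System.out.println(mapper.map("42")); // prints 42,gold
        System.out.println(mapper.map("7"));  // prints 7,
    }
}
```

Because every parallel mapper instance runs its own open(), each one holds a separate copy of
the table.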

11M rows is quite big, but it should take less than 100 MB in RAM, so it is supposed to be
easy. However, I systematically run into a Java out-of-memory error, even with huge 64 GB
containers (5 slots per container).
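
As a rough cross-check of that estimate, the sketch below approximates the heap footprint of a
HashMap<String, String> rather than the raw data size. All object sizes are assumptions: a
64-bit HotSpot JVM with compressed references, 8-byte alignment and Java 8 style Strings
backed by char[]. The key and value lengths of 8 characters are hypothetical, since the
actual column types are not given.

```java
class HeapEstimate {
    // Round an object size up to the assumed 8-byte alignment.
    static long align8(long bytes) {
        return (bytes + 7) & ~7L;
    }

    // String object (24 bytes) plus its char[] (16-byte header, 2 bytes per char).
    static long stringBytes(int length) {
        return 24 + align8(16 + 2L * length);
    }

    // One HashMap node (32 bytes) plus the key and value strings it references.
    static long entryBytes(int keyLength, int valueLength) {
        return 32 + stringBytes(keyLength) + stringBytes(valueLength);
    }

    // Bucket array: capacity doubles from 16 until capacity * 0.75 covers the row count.
    static long bucketArrayBytes(long rows) {
        long capacity = 16;
        while (capacity * 0.75 < rows) {
            capacity <<= 1;
        }
        return align8(16 + 4 * capacity);
    }

    static long mapBytes(long rows, int keyLength, int valueLength) {
        return rows * entryBytes(keyLength, valueLength) + bucketArrayBytes(rows);
    }

    public static void main(String[] args) {
        long perCopy = mapBytes(11_000_000L, 8, 8); // hypothetical 8-char keys and values
        long perContainer = 5 * perCopy;            // one copy per slot, 5 slots per container
        System.out.println("per copy (MB):      " + perCopy / (1024 * 1024));
        System.out.println("per container (MB): " + perContainer / (1024 * 1024));
    }
}
```

Under these assumptions one copy costs on the order of 1.5 GB rather than 100 MB, and five
copies several GB per container. That is still far below 64 GB, so the map alone would not
be expected to exhaust the heap.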

Task manager (Path/ID, Data Port, All Slots, Free Slots and CPU Cores columns empty):

Last Heartbeat:        2015-11-12, 17:46:14
Physical Memory:       126.0 GB
Free Memory:           46.0 GB
Flink Managed Memory:  31.5 GB

I don’t clearly understand why this happens and how to fix it. Any clue?


The integrity of this message cannot be guaranteed on the Internet. The company that sent
this message cannot therefore be held liable for its content nor attachments. Any unauthorized
use or dissemination is prohibited. If you are not the intended recipient of this message,
then please delete it and notify the sender.