flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Till Rohrmann <trohrm...@apache.org>
Subject Re: operators
Date Wed, 16 Mar 2016 15:12:42 GMT
Hi Radu,

the API call slotSharingGroup was introduced with version 1.0. In the
version 0.10 there was something similar called startNewResourceGroup, but
it was somewhat broken. Therefore, I would recommend you upgrading to
version 1.0. You can find the description of the new method here [1]. The
slot sharing group basically allows you to define which operators can share
a slot. This means that a subtask O_i of operator O and another subtask P_j
of operator P will be executed in the same slot. However, you don’t have
control over which subtask will be assigned to which slot. If this
guarantee is strong enough for your use case, then you can simply upgrade
and use slotSharingGroup to define your sharing groups. Btw: Per default
all operators will be placed in the same slot sharing group.

If your requirement is that O_i will be executed in the same slot as P_i,
then you have to add the corresponding JobVertices to a CoLocationGroup. At
the moment this is not really exposed but you could try to get the JobGraph
from the StreamGraph.getJobGraph and then use JobGraph.getVertices to get
the JobVertices. Then you have to find out which JobVertices accommodate
your operators. Once this is done, you can colocate them via the
JobVertex.setStrictlyCoLocatedWith method. This might solve your problem,
but I haven’t tested it myself.

[1]
https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/index.html#task-chaining-and-resource-groups

Cheers,
Till
​

On Thu, Mar 10, 2016 at 2:52 PM, Radu Tudoran <radu.tudoran@huawei.com>
wrote:

> Hi,
>
>
>
> It would not be feasible actually to use kafka queues or the DFS. Could
> you point me at which level of API I could access the CoLocationConstraint?
> Is it accessible from the  DataSourceStream or from the operator directly?
>
>
>
> I have also dig  through the documentation and API and I was curious to
> understand a bit what can the “slotSharingGroup” and “
> startNewResouceGroup()” can do.
>
>
>
> I did not find though a good example..only this link
> https://issues.apache.org/jira/browse/FLINK-3315
>
>
>
> Also, for the “slotSharingGroup” it doesn’t seem to be available (I am
> currently using flink 0.10) – so if it is something that came newer than I
> guess this is the explanation why I cannot find it in any of datastream api
> or source function
>
>
>
> Thanks for the info.
>
>
>
>
>
> *From:* ewenstephan@gmail.com [mailto:ewenstephan@gmail.com] *On Behalf
> Of *Stephan Ewen
> *Sent:* Wednesday, March 09, 2016 6:30 PM
> *To:* user@flink.apache.org
> *Subject:* Re: operators
>
>
>
> Hi!
>
>
>
> You cannot specify that on the higher API levels. The lower API levels
> have something called "CoLocationConstraint". At this point it is not
> exposed, because we thought that would lead to not very scalable and robust
> designs in many cases
>
> .
>
> The best thing usually is location transparency and local affinity (as a
> performance optimization).
>
> Is the file large, i.e., would it hurt to do it on a DFS? Or actually use
> a Kafka Queue between the operators?
>
>
>
> Stephan
>
>
>
>
>
> On Wed, Mar 9, 2016 at 5:38 PM, Radu Tudoran <radu.tudoran@huawei.com>
> wrote:
>
> Hi,
>
>
>
> Is there any way in which you can ensure that 2 distinct operators will be
> executed on the same machine?
>
> More precisely what I am trying to do is to have a window that computes
> some metrics and will dump this locally (from the operator not from an
> output sink) and I would like to create independent of this (or event
> within the operator) a stream source to emit this data. I cannot
>
>
>
> The schema would be something as below:
>
>
>
> Stream ->  operator   -> output
>
>                     |
>
>                   Local file
>
>                       |
>
>                     Stream source -> new stream
>
>
>
> .=> the red items should go on the same machine
>
>
>
> Dr. Radu Tudoran
>
> Research Engineer - Big Data Expert
>
> IT R&D Division
>
>
>
> [image: cid:image007.jpg@01CD52EB.AD060EE0]
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
>
> European Research Center
>
> Riesstrasse 25, 80992 München
>
>
>
> E-mail: *radu.tudoran@huawei.com <radu.tudoran@huawei.com>*
>
> Mobile: +49 15209084330
>
> Telephone: +49 891588344173
>
>
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com
> Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
> Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
> Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN
>
> This e-mail and its attachments contain confidential information from
> HUAWEI, which is intended only for the person or entity whose address is
> listed above. Any use of the information contained herein in any way
> (including, but not limited to, total or partial disclosure, reproduction,
> or dissemination) by persons other than the intended recipient(s) is
> prohibited. If you receive this e-mail in error, please notify the sender
> by phone or email immediately and delete it!
>
>
>
>
>

Mime
View raw message