From user-return-28206-archive-asf-public=cust-asf.ponee.io@flink.apache.org Mon Jun 24 09:12:43 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by mx-eu-01.ponee.io (Postfix) with SMTP id 37AAC180671 for ; Mon, 24 Jun 2019 11:12:43 +0200 (CEST) Received: (qmail 72189 invoked by uid 500); 24 Jun 2019 09:12:40 -0000 Mailing-List: contact user-help@flink.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list user@flink.apache.org Received: (qmail 72179 invoked by uid 99); 24 Jun 2019 09:12:40 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 24 Jun 2019 09:12:40 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 9FEB7C228A for ; Mon, 24 Jun 2019 09:12:39 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.901 X-Spam-Level: * X-Spam-Status: No, score=1.901 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001, URI_HEX=0.1] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id ATxK1of4120F for ; Mon, 24 Jun 2019 09:12:35 +0000 (UTC) Received: from mail-oi1-f180.google.com (mail-oi1-f180.google.com [209.85.167.180]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id 5D9D65FD5E for ; Mon, 24 Jun 2019 09:12:35 +0000 (UTC) Received: by mail-oi1-f180.google.com with SMTP id a128so9262793oib.1 for ; Mon, 24 Jun 2019 02:12:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=SZhnn3EgQM9vRM16XR9rRsX8wVYCSGhqixKJDYVZ138=; b=QrGSKsAD3sLFyPEhV15z3joBy6IdBFbWSC34YAuc1BvdxBl+usS3aZX18hDoS4bOjB nP0/vpRsMGIC9g8Zt02rZjuPcSNNLGAYSqTBDaaQwEZNLKEwIR7cR/s5kacJTaPAtkxd wjwUwO5Xmj8xC3eWdVH8k3aMrewtOd+zoQIzj/ZaXQoIY/Nmwofs/YGOGpDsPwwqid8G N8Bz/bQi570RkxxhY+9iXJ1KSx3qTCwjec35uGB6gG+SRwuCNmPZrc2chqtzQZpQHpV/ uPrbOLxhBii2FztGj2Ov+b68qTURVOkAyUOSWH9swKLCgj3vXSi9iX0VwooURBoqv2lr KyuA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=SZhnn3EgQM9vRM16XR9rRsX8wVYCSGhqixKJDYVZ138=; b=i52/JWXf+GS0baYNo6i/7MnAXL4r8Dik93cEXikFOF53MypG+o4RD1auBrBMcYwqyw AEheJFoARkOBNyGCOtzMYW/AazfNEg/KbOKcxSsNJNPfdLhULROqufB5yC9JL9MBLEk6 EHiYWgYtq138dCHd5nT6i7AKktNKpvkfkDd/PKRf+qosLdyP7b91fwsvpdgbK3BIlC5K N/kQgIxluarQ4daxaosupAiX89IFbfseiEhF1ECo9sFoMcjq6HY9rS49/HUxJgeYup4F TA3JrAw4HfKksqdzOPRIc1VnA83aul4B/0C5sKDsO/xwaj05NapwnUkqYPdDsobhzT9/ 7Mfw== X-Gm-Message-State: APjAAAXxerHdQLnpFFdJV58fa/G1LtDjRrgFTzVCtr6evmpQQMazVtgf Ns//Er/KMlZCwQbgbDreFXapG18ReIBxOA40GrM= X-Google-Smtp-Source: APXvYqxJf5NtHYAE2tZzBl/xw+JVyA7Nn8tDM+SqFie4iMT90Z5/kbS/U4y9rj+CDYqg7noxyD3GC8SyPzOvcgE1r1k= X-Received: by 2002:aca:40d5:: with SMTP id n204mr9720268oia.94.1561367548862; Mon, 24 Jun 2019 02:12:28 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Fabian Hueske Date: Mon, 24 Jun 2019 11:11:52 +0200 Message-ID: Subject: Re: Flink Kafka consumer with low latency requirement To: wang xuchen Cc: user Content-Type: multipart/alternative; boundary="0000000000001d25b0058c0e3633" --0000000000001d25b0058c0e3633 Content-Type: text/plain; charset="UTF-8" Hi Ben, Flink's Kafka consumers track their progress independent of any worker. They keep track of the reading offset for themselves (committing progress to Kafka is optional and only necessary to have progress monitoring in Kafka's metrics). As soon as a consumer reads and forwards an event, it is considered to be read. This means, the progress of the downstream worker does not influence the progress tracking at all. In case of a topic with a single partition, you can use a consumer with parallelism 1 and connect a worker task with a higher parallelism to it. The single consumer task will send the read events round-robin to the worker tasks. Best, Fabian Am Fr., 21. Juni 2019 um 05:48 Uhr schrieb wang xuchen : > > Dear Flink experts, > > I am experimenting Flink for a use case where there is a tight latency > requirements. > > A stackoverflow article suggests that I can use setParallism(n) to process > a Kafka partition in a multi-threaded way. My understanding is there is > still one kafka consumer per partition, but by using setParallelism, I can > spin up multiple worker threads to process the messages read from the > consumer. > > And according to Fabian`s comments in this link: > > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Does-Flink-Kafka-connector-has-max-pending-offsets-concept-td28119.html > Flink is able to manage the offset correctly (commit in the right order). > > Here is my questions, let`s say there is a Kafka topic with only one > partition, and I setup a consumer with setParallism(2). Hypothetically, > worker threads call out to a REST service which may get slow or stuck > periodically. If I want to make sure that the consumer overall is making > progress even in face of a 'slow woker'. In other words, I`d like to have > multiple pending but uncommitted offsets by the fast worker even when the > other worker is stuck. Is there such a knob to tune in Flink? > > From my own experiment, I use Kafka consume group tool to to monitor the > offset lag, soon as one worker thread is stuck, the other cannot make any > progress either. I really want the fast worker still progress to certain > extend. For this use case, exactly once processing is not required. > > Thanks for helping. > Ben > > > --0000000000001d25b0058c0e3633 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Ben,

Flink's Kafka co= nsumers track their progress independent of any worker.
They keep= track of the reading offset for themselves (committing progress to Kafka i= s optional and only necessary to have progress monitoring in Kafka's me= trics).
As soon as a consumer reads and forwards an event, it is = considered to be read. This means, the progress of the downstream worker do= es not influence the progress tracking at all.

In = case of a topic with a single partition, you can use a consumer with parall= elism 1 and connect a worker task with a higher parallelism to it.
The single consumer task will send the read events round-robin to th= e worker tasks.

Best, Fabian

Am Fr., 21. = Juni 2019 um 05:48=C2=A0Uhr schrieb wang xuchen <ben.wxc@gmail.com>:

Dear Flink ex= perts,

I am experimenting Flink for a use case where there is a tigh= t latency requirements.

A stackoverflow article suggests that I can= use setParallism(n) to process a Kafka partition in a multi-threaded way. = My understanding is there is still one kafka consumer per partition, but by= using setParallelism, I can spin up multiple worker threads to process the= messages read from the consumer.=C2=A0

And according to Fabian`s co= mments in this link:
=C2=A0http://apache-flink-user-= mailing-list-archive.2336050.n4.nabble.com/Does-Flink-Kafka-connector-has-m= ax-pending-offsets-concept-td28119.html
Flink is able to manage the = offset correctly (commit in the right order).

Here is my questions, = let`s say there is a Kafka topic with only one partition, and I setup a con= sumer with setParallism(2). Hypothetically,=C2=A0 worker threads call out t= o a REST service which may get slow or stuck periodically. If I want to mak= e sure that the consumer overall is making progress even in face of a '= slow woker'. In other words, I`d like to have=C2=A0 multiple pending bu= t uncommitted offsets by the fast worker even when the other worker is stuc= k. Is there such a knob=C2=A0 to tune in Flink?=C2=A0

From my own ex= periment, I use Kafka consume group tool to to monitor the offset lag,=C2= =A0 soon as one worker thread is stuck, the other cannot make any progress = either. I really want the fast worker still progress to certain extend. For= this use case, exactly once processing is not required.=C2=A0

Thank= s for helping.
Ben

=C2=A0=C2=A0
--0000000000001d25b0058c0e3633--