From: vino yang
Date: Thu, 26 Jul 2018 14:34:41 +0800
Subject: Re: Checkpointing not happening in Standalone HA mode
To: Chesnay Schepler
Cc: Vinay Patil, user
In-Reply-To: <3a1e3179-d3a5-9ea0-ac8e-55126973bcbb@apache.org>

Hi Vinay:

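Did you call the checkpoint configuration API described in this documentation [1]?

Can you share your job program and the JM log? Also, does the JM log contain any message matching the pattern "Triggering checkpoint {} @ {} for job {}."?

[1]: https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/state/checkpointing.html#enabling-and-configuring-checkpointing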
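
As a rough way to check the JM log for that pattern, here is a minimal sketch in plain Java with no Flink dependency. It assumes the three placeholders are filled with a numeric checkpoint id, a numeric timestamp, and a job id without whitespace; the class name CheckpointLogScan and the method triggeredCheckpoints are made up for illustration and are not part of Flink.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CheckpointLogScan {
    // Regex form of the "Triggering checkpoint {} @ {} for job {}." pattern.
    // Assumption: checkpoint id and timestamp are numeric, and the job id
    // contains no whitespace.
    private static final Pattern TRIGGER = Pattern.compile(
            "Triggering checkpoint (\\d+) @ (\\d+) for job (\\S+)\\.");

    /** Returns the checkpoint ids found in the given log lines, in order. */
    static List<Long> triggeredCheckpoints(List<String> logLines) {
        List<Long> ids = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = TRIGGER.matcher(line);
            if (m.find()) {
                ids.add(Long.parseLong(m.group(1)));
            }
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java CheckpointLogScan <path-to-jobmanager-log>");
            return;
        }
        // args[0] is the JobManager log path supplied by the caller.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        List<Long> ids = triggeredCheckpoints(lines);
        System.out.println("checkpoint trigger messages found: " + ids.size());
    }
}
```

Run it with the path to the JM log as the only argument. If it reports zero matches, that would suggest the JobManager never triggered a checkpoint for the job, which points at the job's checkpoint configuration rather than at the TaskManagers.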


Thanks, vino.

2018-07-25 19:43 GMT+08:00 Chesnay Schepler <chesnay@apache.org>:

Can you provide us with the job code?

I assume that checkpointing runs properly if you submit the same job to a normal cluster?


On 25.07.2018 13:15, Vinay Patil wrote:
No error in the logs. That is why I am not able to understand why checkpoints are not getting triggered.

Regards,
Vinay Patil


On Wed, Jul 25, 2018 at 4:44 PM Vinay Patil <vinay18.patil@gmail.com> wrote:
Hi Chesnay,

No error in the logs. That is why I am not able to understand why checkpoints are not getting triggered.

Regards,
Vinay Patil


On Wed, Jul 25, 2018 at 4:36 PM Chesnay Schepler <chesnay@apache.org> wrote:
Please check the job- and taskmanager logs for anything suspicious.

On 25.07.2018 12:33, Vinay Patil wrote:
Hi,

I am starting the cluster using a bootstrap application in which I call the JobManager and TaskManager main classes to form the cluster. The HA cluster is formed correctly and I am able to submit jobs to this cluster using RemoteExecutionEnvironment, but when I enable checkpointing in code I do not see any checkpoints triggered in the Flink UI.

Am I missing any configuration that needs to be set on the RemoteExecutionEnvironment for checkpointing to work?


Regards,
Vinay Patil



