From: Stefano Bortoli
Date: Thu, 28 Apr 2016 09:37:09 +0200
Subject: Re: Requesting the next InputSplit failed
To: user@flink.apache.org
Reply-To: user@flink.apache.org
References: <5720DCEB.1080101@apache.org>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a114b44ee304ec00531869923

--001a114b44ee304ec00531869923
Content-Type: text/plain;
charset=UTF-8

Digging through the logs, we found this:

WARN Remoting - Tried to associate with unreachable remote address [akka.tcp://flink@127.0.0.1:34984]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /127.0.0.1:34984

However, it is not clear why it should refuse a connection to itself after 40 minutes of running. We'll try to figure out possible environment issues. It's a fresh installation, so we may have left out some configuration.

Regards,
Stefano

2016-04-28 9:22 GMT+02:00 Stefano Bortoli:

> I had this type of exception when trying to build and test Flink on a
> "small machine". I worked around it in the test by increasing the Akka
> timeout.
>
> https://github.com/stefanobortoli/flink/blob/FLINK-1827/flink-tests/src/test/java/org/apache/flink/test/checkpointing/EventTimeAllWindowCheckpointingITCase.java
>
> It happened only on my machine (a VirtualBox VM I use for development), but
> not on Flavio's. Is it possible that under load the JobManager slows down
> a bit too much?
>
> Regards,
> Stefano
>
> 2016-04-27 17:50 GMT+02:00 Flavio Pompermaier:
>
>> A precursor of the modified connector (since we started a long time ago).
>> However, the idea is the same: I compute the inputSplits and then I get the
>> data split by split (similar to what happens in FLINK-3750 -
>> https://github.com/apache/flink/pull/1941 )
>>
>> Best,
>> Flavio
>>
>> On Wed, Apr 27, 2016 at 5:38 PM, Chesnay Schepler wrote:
>>
>>> Are you using your modified connector or the currently available one?
>>>
>>> On 27.04.2016 17:35, Flavio Pompermaier wrote:
>>>
>>> Hi to all,
>>> I'm running a Flink job on a JDBC data source and I get the following
>>> exception:
>>>
>>> java.lang.RuntimeException: Requesting the next InputSplit failed.
>>> at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:91)
>>> at org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:342)
>>> at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:137)
>>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>>> at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
>>> at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>>> at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>>> at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>>> at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>>> at scala.concurrent.Await$.result(package.scala:107)
>>> at scala.concurrent.Await.result(package.scala)
>>> at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:71)
>>> ... 4 more
>>>
>>> What can be the cause? Is it because reading the whole DataSource
>>> cannot take more than 10000 milliseconds?
>>>
>>> Best,
>>> Flavio

--001a114b44ee304ec00531869923
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Digging through the logs, we found this:

WARN  Remoting - Tried to associate with unreachable remote address [akka.tcp://flink@127.0.0.1:34984]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: /127.0.0.1:34984

However, it is not clear why it should refuse a connection to itself after 40 minutes of running. We'll try to figure out possible environment issues. It's a fresh installation, so we may have left out some configuration.
Regards,
Stefano
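
One way to check a "Connection refused" symptom like the one in the log is to probe the port directly from the same machine. The following is a minimal sketch using only the JDK. The class name `PortProbe` and the method `canConnect` are hypothetical helpers, not part of Flink, and the default port is simply the one from the log line (the process may bind a different port on each start, so treat it as a placeholder).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper for probing a TCP port; not part of Flink. */
class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers java.net.ConnectException ("Connection refused") and connect timeouts
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are taken from the log line above; pass host and port to override
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 34984;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 1000));
    }
}
```

A `false` result only says that nothing accepted the connection at that moment; it does not say why the listener went away.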

2016-04-28 9:22 GMT+02:00 Stefano Bortoli <s.bortoli@gmail.com>:
I had this type of exception when trying to build and test Flink on a "small machine". I worked around it in the test by increasing the Akka timeout.

https://github.com/stefanobortoli/flink/blob/FLINK-1827/flink-tests/src/test/java/org/apache/flink/test/checkpointing/EventTimeAllWindowCheckpointingITCase.java

It happened only on my machine (a VirtualBox VM I use for development), but not on Flavio's. Is it possible that under load the JobManager slows down a bit too much?

Regards,
Stefano

2016-04-27 17:50 GMT+02:00 Flavio Pompermaier <pompermaier@okkam.it>:
A precursor of the modified connector (since we started a long time ago). However, the idea is the same: I compute the inputSplits and then I get the data split by split (similar to what happens in FLINK-3750 - https://github.com/apache/flink/pull/1941 )
Best,
Flavio

On Wed, Apr 27, 2016 at 5:38 PM, Chesnay Schepler <chesnay@apache.org> wrote:
Are you using your modified connector or the currently available one?


On 27.04.2016 17:35, Flavio Pompermaier wrote:
Hi to all,
I'm running a Flink job on a JDBC data source and I get the following exception:

java.lang.RuntimeException: Requesting the next InputSplit failed.
at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:91)
at org.apache.flink.runtime.operators.DataSourceTask$1.hasNext(DataSourceTask.java:342)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:137)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at scala.concurrent.Await.result(package.scala)
at org.apache.flink.runtime.taskmanager.TaskInputSplitProvider.getNextInputSplit(TaskInputSplitProvider.java:71)
... 4 more

What can be the cause? Is it because reading the whole DataSource cannot take more than 10000 milliseconds?

Best,
Flavio
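
The trace shows the `TimeoutException` being raised from `scala.concurrent.Await.result` inside `TaskInputSplitProvider.getNextInputSplit`, which suggests that the 10000 ms limit applies to a single wait for the next split and not to reading the whole data source. The sketch below is a toy model of that pattern using only the JDK, not Flink code; `SplitRequestTimeoutDemo` and `awaitReply` are hypothetical names. In Flink releases of this period the limit is, as far as I know, controlled by the `akka.ask.timeout` setting (default 10 s), but that is an assumption to check against the documentation for the version in use.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model of a blocking request with a fixed timeout; not Flink code. */
class SplitRequestTimeoutDemo {

    /**
     * Waits up to timeoutMillis for one reply.
     * Returns the reply, or null if the wait timed out.
     */
    static String awaitReply(CompletableFuture<String> reply, long timeoutMillis) {
        try {
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Same exception type as the "Caused by" entry in the trace
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A reply that is already available is returned immediately
        CompletableFuture<String> answered = CompletableFuture.completedFuture("split-0");
        System.out.println(awaitReply(answered, 100));

        // A reply that never arrives makes the wait give up after the limit
        CompletableFuture<String> unanswered = new CompletableFuture<>();
        System.out.println(awaitReply(unanswered, 100));
    }
}
```

In this model each request gets its own time budget, so a long-running read made of many quick requests would not hit the limit, while one request to an unresponsive peer would.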





--001a114b44ee304ec00531869923--