flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Federico D'Ambrosio" <fedex...@gmail.com>
Subject Re: BlobCache and its functioning
Date Thu, 31 Aug 2017 09:51:27 GMT
Hi,

1) I'm using Flink 1.3.2

2) Th JobManager log is pretty much the same concerning those lines:

2017-08-30 14:16:53,343 INFO
org.apache.zookeeper.ClientCnxn                               - Opening
socket connection to server master-1.localdomain/10.0.0.55:2181
2017-08-30 14:16:53,344 INFO
org.apache.zookeeper.ClientCnxn                               - Socket
connection established to master-1.localdomain/10.0.0.55:2181, initiating
session
2017-08-30 14:16:53,348 INFO
org.apache.zookeeper.ClientCnxn                               - Session
establishment complete on server master-1.localdomain/10.0.0.55:2181,
sessionid = 0x15e326a8fe6000e, negotiated timeout = 40000
2017-08-30 14:16:53,467 INFO
org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher
- Subtask 0 has no active shards to read on startup; marking the subtask as
temporarily idle ...
2017-08-30 14:16:53,469 INFO
org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer  -
Subtask 1 will be seeded with initial shard
StreamShardHandle{streamName='fdt', shard='{ShardId:
shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey:
340282366920938463463374607431768211455},SequenceNumberRange:
{StartingSequenceNumber:
49576318619583361934524589342704326364484033599642796034,}}'}, starting
state set as sequence number LATEST_SEQUENCE_NUM
2017-08-30 14:16:53,470 INFO
org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher
- Subtask 1 will start consuming seeded shard
StreamShardHandle{streamName='fdt', shard='{ShardId:
shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey:
340282366920938463463374607431768211455},SequenceNumberRange:
{StartingSequenceNumber:
49576318619583361934524589342704326364484033599642796034,}}'} from sequence
number LATEST_SEQUENCE_NUM with ShardConsumer 0
2017-08-30 14:16:53,608 INFO
lab.vardata.HBaseBatchFormat                                  - Task 0:
Opening connection to currentDay to execute 1 tasks on Single Put job

2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Created BLOB cache storage directory
/tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f
2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:59937
2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:17:21,324 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
3ff486dff4c4eaafdab42b30a877326e62bfca82
from localhost/127.0.0.1:43268
2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:59976
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:18:13,710 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
2f5283326aab77faa047b705cd1d6470035b3b7d
from localhost/127.0.0.1:43268
2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978
2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:60022
2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:19:29,814 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
f91fd7ecec6f90809f52ee189cb48aa1e30b04f6
from localhost/127.0.0.1:43268
2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:60110
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:21:42,858 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
8638bdf78b0e540786de6c291f710a8db447a2b4
from localhost/127.0.0.1:43268
2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112
2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:60295
2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:26:11,247 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
6d30c88539d511bb9acc13b53bb2a128614f5621
from localhost/127.0.0.1:43268
2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297
2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:60410
2017-08-30 14:29:20,943 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:29:20,945 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
345ae4be7a2a8077ecefd6a155d7e975b38efd48
from localhost/127.0.0.1:43268
2017-08-30 14:29:20,946 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
345ae4be7a2a8077ecefd6a155d7e975b38efd48 from /127.0.0.1:60412
2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:60525
2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:32:10,006 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
709e2f64a484ff39634fe53d50dc760ceefd1f8e
from localhost/127.0.0.1:43268
2017-08-30 14:32:10,006 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
709e2f64a484ff39634fe53d50dc760ceefd1f8e from /127.0.0.1:60527
2017-08-30 14:37:51,282 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:60734
2017-08-30 14:37:51,283 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:37:51,285 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
98fce553f71d4cc5a11c4345482e6cc3a37a91e3
from localhost/127.0.0.1:43268
2017-08-30 14:37:51,286 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
98fce553f71d4cc5a11c4345482e6cc3a37a91e3 from /127.0.0.1:60736
2017-08-30 14:37:54,222 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:60740
2017-08-30 14:37:54,223 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:37:54,225 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
96387068e0bf71c5ff2019dbb1a43b74fb53067a
from localhost/127.0.0.1:43268
2017-08-30 14:37:54,225 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
96387068e0bf71c5ff2019dbb1a43b74fb53067a from /127.0.0.1:60742
2017-08-30 14:40:44,005 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:60852
2017-08-30 14:40:44,006 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:40:44,007 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
430066a8a3febb68ada10d78663b0972b92e66e5
from localhost/127.0.0.1:43268
2017-08-30 14:40:44,008 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
430066a8a3febb68ada10d78663b0972b92e66e5 from /127.0.0.1:60854
2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:32820
2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 14:45:52,672 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
ff0c1f371621c2f4ffb7ed06f448841a8a2458ac
from localhost/127.0.0.1:43268
2017-08-30 14:45:52,673 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
ff0c1f371621c2f4ffb7ed06f448841a8a2458ac from /127.0.0.1:32822
2017-08-30 15:15:24,635 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:33955
2017-08-30 15:15:24,636 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 15:15:24,638 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
291f5aeb9306cb94098255237e01ab6735cf42ea
from localhost/127.0.0.1:43268
2017-08-30 15:15:24,638 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
291f5aeb9306cb94098255237e01ab6735cf42ea from /127.0.0.1:33957
2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:34170
2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 15:21:04,191 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
e46b51d3bd0476b6a8a656469e7546cd933fa478
from localhost/127.0.0.1:43268
2017-08-30 15:21:04,191 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
e46b51d3bd0476b6a8a656469e7546cd933fa478 from /127.0.0.1:34172
2017-08-30 15:22:10,141 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:34223
2017-08-30 15:22:10,142 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 15:22:10,144 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
cf9700032c229b39e634eeda73284e116314f7bb
from localhost/127.0.0.1:43268
2017-08-30 15:22:10,144 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
cf9700032c229b39e634eeda73284e116314f7bb from /127.0.0.1:34225
2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:34544
2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 15:30:10,135 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
3b1d4b4d4820c2b74fd4a45d5741f2df940f5189
from localhost/127.0.0.1:43268
2017-08-30 15:30:10,136 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
3b1d4b4d4820c2b74fd4a45d5741f2df940f5189 from /127.0.0.1:34546
2017-08-30 16:01:58,159 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - PUT content addressable BLOB stream to /
127.0.0.1:35741
2017-08-30 16:01:58,160 DEBUG org.apache.flink.runtime.blob.
BlobServerConnection            - Received PUT request for content
addressable BLOB
2017-08-30 16:01:58,162 INFO  org.apache.flink.runtime.blob.
BlobCache                       - Downloading
6bde2f7a709181065c6710c2252a5846f361ad68
from localhost/127.0.0.1:43268
2017-08-30 16:01:58,162 DEBUG org.apache.flink.runtime.blob.
BlobClient                      - GET content addressable BLOB
6bde2f7a709181065c6710c2252a5846f361ad68 from /127.0.0.1:35743

3) There actually was CPU load, but I thought Flink was stuck in a loop or
something, because as you can see from the timestamp on the logs these
lines went on for a couple hours with no sign of actual writes on HBase
(before crashing for unrelated reasons, Kinesis stream was deleted). Since
I was writing on HBase, which I was accessing via Zookeeper, I was
expecting to see the Logs I put inside the writeRecord method of the
OutputFormat.
As for the jstack trace, I'm currently unable to provide one (because I
can't access the kinesis stream right now), but I'll try to emulate the
stream and provide that stack as soon as I can.

4) Yes, those were log lines from the TM log in the web dashboard

Thank you very much for your help


2017-08-31 10:29 GMT+02:00 Nico Kruber <nico@data-artisans.com>:

> Hi Federico,
> 1) Which version of Flink are you using?
> 2) Can you also share the JobManager log?
> 3) Why do you think, Flink is stuck at the BlobCache? Is it really
> blocked, or
> do you still have CPU load? Can you post stack traces of the TaskManager
> (TM)
> and JobManager processes when you think they are stuck (using jstack)?
> 4) These PUT requests in the TM logs are strange, unless you showed the TM
> logs in the web interface - did you?
>
>
> Nico
>
>
> On Thursday, 31 August 2017 09:45:59 CEST Fabian Hueske wrote:
> > Hi Federico,
> >
> > Not sure what's going on there but Nico (in CC) is more familiar with the
> > blob cache and might be able to help.
> >
> > Best, Fabian
> >
> > 2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <fedexist@gmail.com>:
> > > Hi,
> > >
> > > I have a rather simple Flink job which has a KinesisConsumer as a
> source
> > > and an HBase table as sink, in which I write using writeOutputFormat.
> I'm
> > > running it on a local machine with a single taskmanager (2 slots, 2G).
> The
> > > KinesisConsumer works fine and the connection to the HBase table gets
> > > opened fine (i.e. the open method of the class implementing
> OutputFormat
> > > gets actually called).
> > >
> > > I'm running the job at a parallelism of 2, while the sink has a
> > > parallelism of 1. The
> > >
> > > Still, looking at the log I see that after opening the connection, the
> job
> > > gets stuck at lines like this one:
> > >
> > > INFO  org.apache.flink.runtime.blob.BlobCache                       -
> > > Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > localhost/127.0.0.1:43268
> > >
> > > Each following one another, like this:
> > >
> > > 2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.BlobCache
> > >                  - Created BLOB cache storage directory
> > > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f 2017-08-30
> > > 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient
> > >        - PUT content addressable BLOB stream to /127.0.0.1:59937
> > > 2017-08-30 14:17:21,323 DEBUG
> > > org.apache.flink.runtime.blob.BlobServerConnection            -
> Received
> > > PUT request for content addressable BLOB 2017-08-30 14:17:21,324 INFO
> > > org.apache.flink.runtime.blob.BlobCache                       -
> > > Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > > localhost/127.0.0.1:43268 2017-08-30 14:17:21,324 DEBUG
> > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > > /127.0.0.1:59938 2017-08-30 14:18:13,708 DEBUG
> > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > content addressable BLOB stream to /127.0.0.1:59976 2017-08-30
> > > 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > 14:18:13,710 INFO  org.apache.flink.runtime.blob.BlobCache
> > >        - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > > localhost/127.0.0.1:43268 2017-08-30 14:18:13,710 DEBUG
> > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > > /127.0.0.1:59978 2017-08-30 14:19:29,811 DEBUG
> > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > content addressable BLOB stream to /127.0.0.1:60022 2017-08-30
> > > 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > 14:19:29,814 INFO  org.apache.flink.runtime.blob.BlobCache
> > >        - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > > localhost/127.0.0.1:43268 2017-08-30 14:19:29,814 DEBUG
> > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > > /127.0.0.1:60024 2017-08-30 14:21:42,856 DEBUG
> > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > content addressable BLOB stream to /127.0.0.1:60110 2017-08-30
> > > 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > 14:21:42,858 INFO  org.apache.flink.runtime.blob.BlobCache
> > >        - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > localhost/127.0.0.1:43268 2017-08-30 14:21:42,859 DEBUG
> > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > /127.0.0.1:60112 2017-08-30 14:26:11,242 DEBUG
> > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > content addressable BLOB stream to /127.0.0.1:60295 2017-08-30
> > > 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > 14:26:11,247 INFO  org.apache.flink.runtime.blob.BlobCache
> > >        - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > > localhost/127.0.0.1:43268 2017-08-30 14:26:11,247 DEBUG
> > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > > /127.0.0.1:60297 2017-08-30 14:29:20,942 DEBUG
> > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > content addressable BLOB stream to /127.0.0.1:60410
> > >
> > >
> > > My questions are: what is the jobmanager doing here? Why is he taking
> ages
> > > to do this? How do i speed up this behaviour?
> > >
> > > Thank you very much for your attention,
> > >
> > > Federico D'Ambrosio
>
>


-- 
Federico D'Ambrosio

Mime
View raw message