flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nico Kruber <n...@data-artisans.com>
Subject Re: BlobCache and its functioning
Date Thu, 31 Aug 2017 10:24:46 GMT
to sum up: the lines you were seeing seem to be the down- and upload of the 
TaskManager logs from the web interface which go through the BlobServer and 
its components.


Nico

On Thursday, 31 August 2017 11:51:27 CEST Federico D'Ambrosio wrote:
> Hi,
> 
> 1) I'm using Flink 1.3.2
> 
> 2) Th JobManager log is pretty much the same concerning those lines:
> 
> 2017-08-30 14:16:53,343 INFO
> org.apache.zookeeper.ClientCnxn                               - Opening
> socket connection to server master-1.localdomain/10.0.0.55:2181
> 2017-08-30 14:16:53,344 INFO
> org.apache.zookeeper.ClientCnxn                               - Socket
> connection established to master-1.localdomain/10.0.0.55:2181, initiating
> session
> 2017-08-30 14:16:53,348 INFO
> org.apache.zookeeper.ClientCnxn                               - Session
> establishment complete on server master-1.localdomain/10.0.0.55:2181,
> sessionid = 0x15e326a8fe6000e, negotiated timeout = 40000
> 2017-08-30 14:16:53,467 INFO
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher
> - Subtask 0 has no active shards to read on startup; marking the subtask as
> temporarily idle ...
> 2017-08-30 14:16:53,469 INFO
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer  -
> Subtask 1 will be seeded with initial shard
> StreamShardHandle{streamName='fdt', shard='{ShardId:
> shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey:
> 340282366920938463463374607431768211455},SequenceNumberRange:
> {StartingSequenceNumber:
> 49576318619583361934524589342704326364484033599642796034,}}'}, starting
> state set as sequence number LATEST_SEQUENCE_NUM
> 2017-08-30 14:16:53,470 INFO
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher
> - Subtask 1 will start consuming seeded shard
> StreamShardHandle{streamName='fdt', shard='{ShardId:
> shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey:
> 340282366920938463463374607431768211455},SequenceNumberRange:
> {StartingSequenceNumber:
> 49576318619583361934524589342704326364484033599642796034,}}'} from sequence
> number LATEST_SEQUENCE_NUM with ShardConsumer 0
> 2017-08-30 14:16:53,608 INFO
> lab.vardata.HBaseBatchFormat                                  - Task 0:
> Opening connection to currentDay to execute 1 tasks on Single Put job
> 
> 2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Created BLOB cache storage directory
> /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f
> 2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:59937
> 2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:17:21,324 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 3ff486dff4c4eaafdab42b30a877326e62bfca82
> from localhost/127.0.0.1:43268
> 2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938
> 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:59976
> 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:18:13,710 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 2f5283326aab77faa047b705cd1d6470035b3b7d
> from localhost/127.0.0.1:43268
> 2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978
> 2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60022
> 2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:19:29,814 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> f91fd7ecec6f90809f52ee189cb48aa1e30b04f6
> from localhost/127.0.0.1:43268
> 2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024
> 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60110
> 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:21:42,858 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 8638bdf78b0e540786de6c291f710a8db447a2b4
> from localhost/127.0.0.1:43268
> 2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112
> 2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60295
> 2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:26:11,247 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 6d30c88539d511bb9acc13b53bb2a128614f5621
> from localhost/127.0.0.1:43268
> 2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297
> 2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60410
> 2017-08-30 14:29:20,943 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:29:20,945 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 345ae4be7a2a8077ecefd6a155d7e975b38efd48
> from localhost/127.0.0.1:43268
> 2017-08-30 14:29:20,946 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 345ae4be7a2a8077ecefd6a155d7e975b38efd48 from /127.0.0.1:60412
> 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60525
> 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:32:10,006 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 709e2f64a484ff39634fe53d50dc760ceefd1f8e
> from localhost/127.0.0.1:43268
> 2017-08-30 14:32:10,006 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 709e2f64a484ff39634fe53d50dc760ceefd1f8e from /127.0.0.1:60527
> 2017-08-30 14:37:51,282 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60734
> 2017-08-30 14:37:51,283 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:37:51,285 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 98fce553f71d4cc5a11c4345482e6cc3a37a91e3
> from localhost/127.0.0.1:43268
> 2017-08-30 14:37:51,286 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 98fce553f71d4cc5a11c4345482e6cc3a37a91e3 from /127.0.0.1:60736
> 2017-08-30 14:37:54,222 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60740
> 2017-08-30 14:37:54,223 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:37:54,225 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 96387068e0bf71c5ff2019dbb1a43b74fb53067a
> from localhost/127.0.0.1:43268
> 2017-08-30 14:37:54,225 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 96387068e0bf71c5ff2019dbb1a43b74fb53067a from /127.0.0.1:60742
> 2017-08-30 14:40:44,005 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:60852
> 2017-08-30 14:40:44,006 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:40:44,007 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 430066a8a3febb68ada10d78663b0972b92e66e5
> from localhost/127.0.0.1:43268
> 2017-08-30 14:40:44,008 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 430066a8a3febb68ada10d78663b0972b92e66e5 from /127.0.0.1:60854
> 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:32820
> 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 14:45:52,672 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> ff0c1f371621c2f4ffb7ed06f448841a8a2458ac
> from localhost/127.0.0.1:43268
> 2017-08-30 14:45:52,673 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> ff0c1f371621c2f4ffb7ed06f448841a8a2458ac from /127.0.0.1:32822
> 2017-08-30 15:15:24,635 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:33955
> 2017-08-30 15:15:24,636 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:15:24,638 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 291f5aeb9306cb94098255237e01ab6735cf42ea
> from localhost/127.0.0.1:43268
> 2017-08-30 15:15:24,638 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 291f5aeb9306cb94098255237e01ab6735cf42ea from /127.0.0.1:33957
> 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:34170
> 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:21:04,191 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> e46b51d3bd0476b6a8a656469e7546cd933fa478
> from localhost/127.0.0.1:43268
> 2017-08-30 15:21:04,191 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> e46b51d3bd0476b6a8a656469e7546cd933fa478 from /127.0.0.1:34172
> 2017-08-30 15:22:10,141 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:34223
> 2017-08-30 15:22:10,142 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:22:10,144 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> cf9700032c229b39e634eeda73284e116314f7bb
> from localhost/127.0.0.1:43268
> 2017-08-30 15:22:10,144 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> cf9700032c229b39e634eeda73284e116314f7bb from /127.0.0.1:34225
> 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:34544
> 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 15:30:10,135 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189
> from localhost/127.0.0.1:43268
> 2017-08-30 15:30:10,136 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189 from /127.0.0.1:34546
> 2017-08-30 16:01:58,159 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - PUT content addressable BLOB stream to /
> 127.0.0.1:35741
> 2017-08-30 16:01:58,160 DEBUG org.apache.flink.runtime.blob.
> BlobServerConnection            - Received PUT request for content
> addressable BLOB
> 2017-08-30 16:01:58,162 INFO  org.apache.flink.runtime.blob.
> BlobCache                       - Downloading
> 6bde2f7a709181065c6710c2252a5846f361ad68
> from localhost/127.0.0.1:43268
> 2017-08-30 16:01:58,162 DEBUG org.apache.flink.runtime.blob.
> BlobClient                      - GET content addressable BLOB
> 6bde2f7a709181065c6710c2252a5846f361ad68 from /127.0.0.1:35743
> 
> 3) There actually was CPU load, but I thought Flink was stuck in a loop or
> something, because as you can see from the timestamp on the logs these
> lines went on for a couple hours with no sign of actual writes on HBase
> (before crashing for unrelated reasons, Kinesis stream was deleted). Since
> I was writing on HBase, which I was accessing via Zookeeper, I was
> expecting to see the Logs I put inside the writeRecord method of the
> OutputFormat.
> As for the jstack trace, I'm currently unable to provide one (because I
> can't access the kinesis stream right now), but I'll try to emulate the
> stream and provide that stack as soon as I can.
> 
> 4) Yes, those were log lines from the TM log in the web dashboard
> 
> Thank you very much for your help
> 
> 2017-08-31 10:29 GMT+02:00 Nico Kruber <nico@data-artisans.com>:
> > Hi Federico,
> > 1) Which version of Flink are you using?
> > 2) Can you also share the JobManager log?
> > 3) Why do you think, Flink is stuck at the BlobCache? Is it really
> > blocked, or
> > do you still have CPU load? Can you post stack traces of the TaskManager
> > (TM)
> > and JobManager processes when you think they are stuck (using jstack)?
> > 4) These PUT requests in the TM logs are strange, unless you showed the TM
> > logs in the web interface - did you?
> > 
> > 
> > Nico
> > 
> > On Thursday, 31 August 2017 09:45:59 CEST Fabian Hueske wrote:
> > > Hi Federico,
> > > 
> > > Not sure what's going on there but Nico (in CC) is more familiar with
> > > the
> > > blob cache and might be able to help.
> > > 
> > > Best, Fabian
> > > 
> > > 2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <fedexist@gmail.com>:
> > > > Hi,
> > > > 
> > > > I have a rather simple Flink job which has a KinesisConsumer as a
> > 
> > source
> > 
> > > > and an HBase table as sink, in which I write using writeOutputFormat.
> > 
> > I'm
> > 
> > > > running it on a local machine with a single taskmanager (2 slots, 2G).
> > 
> > The
> > 
> > > > KinesisConsumer works fine and the connection to the HBase table gets
> > > > opened fine (i.e. the open method of the class implementing
> > 
> > OutputFormat
> > 
> > > > gets actually called).
> > > > 
> > > > I'm running the job at a parallelism of 2, while the sink has a
> > > > parallelism of 1. The
> > > > 
> > > > Still, looking at the log I see that after opening the connection, the
> > 
> > job
> > 
> > > > gets stuck at lines like this one:
> > > > 
> > > > INFO  org.apache.flink.runtime.blob.BlobCache                       -
> > > > Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > > localhost/127.0.0.1:43268
> > > > 
> > > > Each following one another, like this:
> > > > 
> > > > 2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.BlobCache
> > > > 
> > > >                  - Created BLOB cache storage directory
> > > > 
> > > > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f 2017-08-30
> > > > 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient
> > > > 
> > > >        - PUT content addressable BLOB stream to /127.0.0.1:59937
> > > > 
> > > > 2017-08-30 14:17:21,323 DEBUG
> > > > org.apache.flink.runtime.blob.BlobServerConnection            -
> > 
> > Received
> > 
> > > > PUT request for content addressable BLOB 2017-08-30 14:17:21,324 INFO
> > > > org.apache.flink.runtime.blob.BlobCache                       -
> > > > Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > > > localhost/127.0.0.1:43268 2017-08-30 14:17:21,324 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from
> > > > /127.0.0.1:59938 2017-08-30 14:18:13,708 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:59976 2017-08-30
> > > > 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > > 
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > > 
> > > > 14:18:13,710 INFO  org.apache.flink.runtime.blob.BlobCache
> > > > 
> > > >        - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > > > 
> > > > localhost/127.0.0.1:43268 2017-08-30 14:18:13,710 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from
> > > > /127.0.0.1:59978 2017-08-30 14:19:29,811 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60022 2017-08-30
> > > > 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > > 
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > > 
> > > > 14:19:29,814 INFO  org.apache.flink.runtime.blob.BlobCache
> > > > 
> > > >        - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > > > 
> > > > localhost/127.0.0.1:43268 2017-08-30 14:19:29,814 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from
> > > > /127.0.0.1:60024 2017-08-30 14:21:42,856 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60110 2017-08-30
> > > > 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > > 
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > > 
> > > > 14:21:42,858 INFO  org.apache.flink.runtime.blob.BlobCache
> > > > 
> > > >        - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > > 
> > > > localhost/127.0.0.1:43268 2017-08-30 14:21:42,859 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from
> > > > /127.0.0.1:60112 2017-08-30 14:26:11,242 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60295 2017-08-30
> > > > 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection
> > > > 
> > > >        - Received PUT request for content addressable BLOB 2017-08-30
> > > > 
> > > > 14:26:11,247 INFO  org.apache.flink.runtime.blob.BlobCache
> > > > 
> > > >        - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > > > 
> > > > localhost/127.0.0.1:43268 2017-08-30 14:26:11,247 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - GET
> > > > content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from
> > > > /127.0.0.1:60297 2017-08-30 14:29:20,942 DEBUG
> > > > org.apache.flink.runtime.blob.BlobClient                      - PUT
> > > > content addressable BLOB stream to /127.0.0.1:60410
> > > > 
> > > > 
> > > > My questions are: what is the jobmanager doing here? Why is he taking
> > 
> > ages
> > 
> > > > to do this? How do i speed up this behaviour?
> > > > 
> > > > Thank you very much for your attention,
> > > > 
> > > > Federico D'Ambrosio


Mime
View raw message