flink-user mailing list archives

From Ufuk Celebi <...@apache.org>
Subject Re: Job stuck at "Assigning split to host..."
Date Thu, 23 Jul 2015 12:41:21 GMT
Hey Lydia,

it looks like the HBase client is losing its connection to HBase. Before that, everything
seems to be working just fine (X rows are read etc.).

Do you mind setting the log level to DEBUG and then posting the logs again?
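
For reference, a minimal sketch of how the log level could be raised, assuming the cluster uses the log4j 1.x properties file shipped in Flink's conf/ directory and that its appender is named "file" (both are assumptions about this setup, so check your own conf/log4j.properties). The two logger lines are optional and only narrow DEBUG output to the HBase and ZooKeeper clients:

```properties
# conf/log4j.properties on the JobManager and every TaskManager node
# (assumed location; the appender name "file" is an assumption as well)

# Option A: DEBUG for everything
log4j.rootLogger=DEBUG, file

# Option B: keep the root logger at INFO and raise only the client libraries
# log4j.rootLogger=INFO, file
# log4j.logger.org.apache.hadoop.hbase=DEBUG
# log4j.logger.org.apache.zookeeper=DEBUG
```

The change most likely only takes effect after the JobManager and TaskManagers are restarted.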

– Ufuk

On 23 Jul 2015, at 14:12, Lydia Ickler <icklerly@googlemail.com> wrote:

> Hi,
> 
> I am trying to read data from an HBase table via HBaseReadExample.java.
> Unfortunately, my run always gets stuck at the same position.
> Do you guys have any suggestions?
> 
> In the master node it says:
> 14:05:04,239 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Received
job bb9560efb8117ce7e840bea2c4b967c1 (Flink Java Job at Thu Jul 23 14:04:57 CEST 2015).
> 14:05:04,268 INFO  org.apache.flink.addons.hbase.TableInputFormat                - Initializing
HBaseConfiguration
> 14:05:04,346 INFO  org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper        - Process
identifier=hconnection-0x6704d3b4 connecting to ZooKeeper ensemble=localhost:2181
> 14:05:04,347 INFO  org.apache.zookeeper.ZooKeeper                                - Initiating
client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x6704d3b40x0,
quorum=localhost:2181, baseZNode=/hbase
> 14:05:04,352 INFO  org.apache.zookeeper.ClientCnxn                               - Opening
socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using
SASL (unknown error)
> 14:05:04,353 INFO  org.apache.zookeeper.ClientCnxn                               - Socket
connection established to 127.0.0.1/127.0.0.1:2181, initiating session
> 14:05:04,376 INFO  org.apache.zookeeper.ClientCnxn                               - Session
establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x24ebaaf7d06000a,
negotiated timeout = 40000
> 14:05:04,637 INFO  org.apache.flink.addons.hbase.TableInputFormat                - Created
4 splits
> 14:05:04,637 INFO  org.apache.flink.addons.hbase.TableInputFormat                - created
split [0|[grips5:16020]|-|LUAD+5781]
> 14:05:04,637 INFO  org.apache.flink.addons.hbase.TableInputFormat                - created
split [1|[grips1:16020]|LUAD+5781|LUAD+7539]
> 14:05:04,637 INFO  org.apache.flink.addons.hbase.TableInputFormat                - created
split [2|[grips1:16020]|LUAD+7539|LUAD+8552]
> 14:05:04,637 INFO  org.apache.flink.addons.hbase.TableInputFormat                - created
split [3|[grips1:16020]|LUAD+8552|-]
> 14:05:04,641 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Scheduling
job Flink Java Job at Thu Jul 23 14:04:57 CEST 2015.
> 14:05:04,642 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - CHAIN
DataSource (at createInput(ExecutionEnvironment.java:502) (org.apache.flink.addons.hbase.HBaseReadExample$1))
-> FlatMap (collect()) (1/1) (94de8700cefe1651558e25c98829a156) switched from CREATED to
SCHEDULED
> 14:05:04,643 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - CHAIN
DataSource (at createInput(ExecutionEnvironment.java:502) (org.apache.flink.addons.hbase.HBaseReadExample$1))
-> FlatMap (collect()) (1/1) (94de8700cefe1651558e25c98829a156) switched from SCHEDULED
to DEPLOYING
> 14:05:04,643 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Deploying
CHAIN DataSource (at createInput(ExecutionEnvironment.java:502) (org.apache.flink.addons.hbase.HBaseReadExample$1))
-> FlatMap (collect()) (1/1) (attempt #0) to grips4
> 14:05:04,647 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Status
of job bb9560efb8117ce7e840bea2c4b967c1 (Flink Java Job at Thu Jul 23 14:04:57 CEST 2015)
changed to RUNNING.
> 14:05:07,537 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - CHAIN
DataSource (at createInput(ExecutionEnvironment.java:502) (org.apache.flink.addons.hbase.HBaseReadExample$1))
-> FlatMap (collect()) (1/1) (94de8700cefe1651558e25c98829a156) switched from DEPLOYING
to RUNNING
> 14:05:07,545 INFO  org.apache.flink.api.common.io.LocatableInputSplitAssigner    - Assigning
remote split to host grips4
> 14:05:08,338 INFO  org.apache.flink.api.common.io.LocatableInputSplitAssigner    - Assigning
remote split to host grips4
> 
> 
> And in node "grips4":
> 07,273 INFO  org.apache.zookeeper.ZooKeeper                                - Initiating
client connection, connectString=localhost:2181 sessionTi$
> 14:05:07,296 INFO  org.apache.zookeeper.ClientCnxn                               - Opening
socket connection to server 127.0.0.1/127.0.0.1:2181. Will n$
> 14:05:07,300 INFO  org.apache.zookeeper.ClientCnxn                               - Socket
connection established to 127.0.0.1/127.0.0.1:2181, initiatin$
> 14:05:07,332 INFO  org.apache.zookeeper.ClientCnxn                               - Session
establishment complete on server 127.0.0.1/127.0.0.1:2181, s$
> 14:05:07,531 INFO  org.apache.flink.runtime.taskmanager.Task                     - CHAIN
DataSource (at createInput(ExecutionEnvironment.java:502) (org$
> 14:05:07,970 INFO  org.apache.flink.addons.hbase.TableInputFormat                - opening
split [3|[grips1:16020]|LUAD+8552|-]
> 14:05:08,223 INFO  org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation
 - Closing zookeeper sessionid=0x44ebaaf7d35000e
> 14:05:08,235 INFO  org.apache.zookeeper.ZooKeeper                                - Session:
0x44ebaaf7d35000e closed
> 14:05:08,235 INFO  org.apache.zookeeper.ClientCnxn                               - EventThread
shut down
> 14:05:08,337 INFO  org.apache.flink.addons.hbase.TableInputFormat                - Closing
split (scanned 129 rows)
> 14:05:08,343 INFO  org.apache.flink.addons.hbase.TableInputFormat                - opening
split [2|[grips1:16020]|LUAD+7539|LUAD+8552]
> 14:06:58,826 INFO  org.apache.hadoop.hbase.client.RpcRetryingCaller              - Call
exception, tries=10, retries=35, retryTime=110483ms, msg=row 'L$
> 14:07:18,927 INFO  org.apache.hadoop.hbase.client.RpcRetryingCaller              - Call
exception, tries=11, retries=35, retryTime=130584ms, msg=row 'L$
> 14:07:39,079 INFO  org.apache.hadoop.hbase.client.RpcRetryingCaller              - Call
exception, tries=12, retries=35, retryTime=150735ms, msg=row 'L$
> 14:07:59,232 INFO  org.apache.hadoop.hbase.client.RpcRetryingCaller              - Call
exception, tries=13, retries=35, retryTime=170889ms, msg=row 'L$
> 
> 
> 

