flink-user-zh mailing list archives

From Wei Zhong <weizhong0...@gmail.com>
Subject Re: Error consuming Kafka data through the Python API in yarn-session mode
Date Tue, 10 Dec 2019 02:23:09 GMT
Hi 改改,

Judging from the current error, the Kafka version probably does not match. The Kafka connector
you put in the lib directory needs to be the 0.11 version, i.e. flink-sql-connector-kafka-0.11_2.11-1.9.1.jar
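
A quick way to check which Kafka connector jars are in the lib directory is sketched below. It is only an illustration: the file-name pattern is inferred from the jar name quoted above, the helper names are hypothetical, and the path /opt/flink-1.9.1/lib is assumed from the command in the quoted message. A jar without a Kafka version segment is treated as the "universal" connector, which would not match connector.version=0.11.

```python
import re
from pathlib import Path

# Assumed naming scheme, inferred from the jar name quoted in this thread:
#   flink-sql-connector-kafka-<kafka version>_<scala version>-<flink version>.jar
# A jar without the <kafka version> segment is treated as the universal connector.
JAR_PATTERN = re.compile(
    r"^flink-(?:sql-)?connector-kafka(?:-(?P<kafka>\d+\.\d+))?"
    r"_(?P<scala>\d+\.\d+)-(?P<flink>[\d.]+)\.jar$"
)


def kafka_connector_versions(jar_names):
    """Map each Kafka connector jar name to the connector.version it likely serves."""
    found = {}
    for name in jar_names:
        match = JAR_PATTERN.match(name)
        if match:
            found[name] = match.group("kafka") or "universal"
    return found


def matching_jars(lib_dir, wanted_version):
    """Return the jar names in lib_dir whose Kafka version equals wanted_version."""
    names = sorted(p.name for p in Path(lib_dir).glob("*.jar"))
    versions = kafka_connector_versions(names)
    return [name for name, version in versions.items() if version == wanted_version]


if __name__ == "__main__":
    # Hypothetical location; adjust to the actual Flink installation.
    print(matching_jars("/opt/flink-1.9.1/lib", "0.11"))
```

If the printed list is empty while the descriptor requests connector.version=0.11, the jar in lib is likely for a different Kafka version.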

> On 10 Dec 2019, at 10:06, 改改 <vfvwww@dingtalk.com> wrote:
> 
> Hi Wei Zhong,
>          Thank you for your reply. The Kafka connector jar is already in Flink's lib directory. The contents of my flink/lib directory are as follows:
>          
>          <5600791664319709.png>
> 
> In addition, my cluster environment is as follows:
>      java :1.8.0_231
>      flink: 1.9.1
>      Python 3.6.9
>      Hadoop 3.1.1.3.1.4.0-315
> 
> Yesterday I tried running it with Python 3.6 and it still fails, with the following error:
> 
> [root@hdp02 data_team_workspace]# /opt/flink-1.9.1/bin/flink run -py tumble_window.py
> Starting execution of program
> Traceback (most recent call last):
>   File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/pyflink.zip/pyflink/util/exceptions.py", line 147, in deco
>   File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/py4j-0.10.8.1-src.zip/py4j/protocol.py", line 328, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling o42.registerTableSource.
> : org.apache.flink.table.api.TableException: findAndCreateTableSource failed.
>  at org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSource(TableFactoryUtil.java:67)
>  at org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSource(TableFactoryUtil.java:54)
>  at org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.java:69)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
>  at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>  at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
>  at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>  at org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
>  at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSourceFactory' in the classpath.
> Reason: No context matches.
> The following properties are requested:
> connector.properties.0.key=zookeeper.connect
> connector.properties.0.value=hdp03:2181
> connector.properties.1.key=bootstrap.servers
> connector.properties.1.value=hdp02:6667
> connector.property-version=1
> connector.startup-mode=earliest-offset
> connector.topic=user_01
> connector.type=kafka
> connector.version=0.11
> format.fail-on-missing-field=true
> format.json-schema={  type: 'object',  properties: {    col1: {      type: 'string'    },    col2: {      type: 'string'    },    col3: {      type: 'string'    },    time: {      type: 'string',      format: 'date-time'    }  }}
> format.property-version=1
> format.type=json
> schema.0.name=rowtime
> schema.0.rowtime.timestamps.from=time
> schema.0.rowtime.timestamps.type=from-field
> schema.0.rowtime.watermarks.delay=60000
> schema.0.rowtime.watermarks.type=periodic-bounded
> schema.0.type=TIMESTAMP
> schema.1.name=col1
> schema.1.type=VARCHAR
> schema.2.name=col2
> schema.2.type=VARCHAR
> schema.3.name=col3
> schema.3.type=VARCHAR
> update-mode=append
> The following factories have been considered:
> org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory
> org.apache.flink.formats.csv.CsvRowFormatFactory
> org.apache.flink.addons.hbase.HBaseTableFactory
> org.apache.flink.api.java.io.jdbc.JDBCTableSourceSinkFactory
> org.apache.flink.formats.json.JsonRowFormatFactory
> org.apache.flink.table.catalog.GenericInMemoryCatalogFactory
> org.apache.flink.table.sources.CsvBatchTableSourceFactory
> org.apache.flink.table.sources.CsvAppendTableSourceFactory
> org.apache.flink.table.sinks.CsvBatchTableSinkFactory
> org.apache.flink.table.sinks.CsvAppendTableSinkFactory
> org.apache.flink.table.planner.StreamPlannerFactory
> org.apache.flink.table.executor.StreamExecutorFactory
> org.apache.flink.table.planner.delegation.BlinkPlannerFactory
> org.apache.flink.table.planner.delegation.BlinkExecutorFactory
>  at org.apache.flink.table.factories.TableFactoryService.filterByContext(TableFactoryService.java:283)
>  at org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryService.java:191)
>  at org.apache.flink.table.factories.TableFactoryService.findSingleInternal(TableFactoryService.java:144)
>  at org.apache.flink.table.factories.TableFactoryService.find(TableFactoryService.java:97)
>  at org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSource(TableFactoryUtil.java:64)
>  ... 13 more
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/python3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
>     "__main__", mod_spec)
>   File "/usr/python3.6/lib/python3.6/runpy.py", line 85, in _run_code
>     exec(code, run_globals)
>   File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/tumble_window.py", line 62, in <module>
>     .register_table_source("source")
>   File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/pyflink.zip/pyflink/table/descriptors.py", line 1293, in register_table_source
>   File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
>   File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/pyflink.zip/pyflink/util/exceptions.py", line 154, in deco
> pyflink.util.exceptions.TableException: 'findAndCreateTableSource failed.'
> org.apache.flink.client.program.OptimizerPlanEnvironment$ProgramAbortException
>  at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:83)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:576)
>  at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:438)
>  at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:274)
>  at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:746)
>  at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:273)
>  at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
>  at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010)
>  at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>  at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083)
> 
> 
> 
> 
> ------------------------------------------------------------------
> From: Wei Zhong <weizhong0618@gmail.com>
> Sent: Tuesday, 10 December 2019, 09:56
> To: user-zh <user-zh@flink.apache.org>; 改改 <vfvwww@dingtalk.com>
> Subject: Re: Error consuming Kafka data through the Python API in yarn-session mode
> 
> Hi 改改,
> 
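> This error message alone carries too little information to be certain, but a likely cause is
> that the Kafka connector jar has not been placed in the lib directory. Could you check whether
> the Kafka connector jar exists in your Flink lib directory?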
> 
> > On 6 Dec 2019, at 14:36, 改改 <vfvwww@dingtalk.com.INVALID> wrote:
> > 
> > 
> > [root@hdp02 bin]# ./flink run -yid application_1575352295616_0014 -py /opt/tumble_window.py
> > 2019-12-06 14:15:48,262 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /tmp/.yarn-properties-root.
> > 2019-12-06 14:15:48,262 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /tmp/.yarn-properties-root.
> > 2019-12-06 14:15:48,816 INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hdp02.wuagecluster/10.2.19.32:8050
> > 2019-12-06 14:15:48,964 INFO  org.apache.hadoop.yarn.client.AHSProxy - Connecting to Application History server at hdp03.wuagecluster/10.2.19.33:10200
> > 2019-12-06 14:15:48,973 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> > 2019-12-06 14:15:48,973 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> > 2019-12-06 14:15:49,101 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor - Found application JobManager host name 'hdp07.wuagecluster' and port '46376' from supplied application id 'application_1575352295616_0014'
> > Starting execution of program
> > Traceback (most recent call last):
> >  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
> >    "__main__", fname, loader, pkg_name)
> >  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
> >    exec code in run_globals
> >  File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/tumble_window.py", line 62, in <module>
> >    .register_table_source("source")
> >  File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/pyflink.zip/pyflink/table/descriptors.py", line 1293, in register_table_source
> >  File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
> >  File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/pyflink.zip/pyflink/util/exceptions.py", line 154, in deco
> > pyflink.util.exceptions.TableException: u'findAndCreateTableSource failed.'
> > org.apache.flink.client.program.OptimizerPlanEnvironment$ProgramAbortException
> > at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:83)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:498)
> > at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:576)
> > at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:438)
> > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:274)
> > at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:746)
> > at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:273)
> > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
> > at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010)
> > at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at javax.security.auth.Subject.doAs(Subject.java:422)
> > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> > at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
> > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083)
> 

