flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "P. Ramanjaneya Reddy" <ramanji...@gmail.com>
Subject Re: How to run wordcount program on Multi-node Cluster.
Date Mon, 07 Aug 2017 14:03:05 GMT
Hi Timo,
Problem is resolved after copy input file to all tasks managers.

and where should generate outputfile? Is it in jobmanager or task manager?

Where can i see the execution logs to understand how word count done each
task manager?


By the way any option to overwride...?

08/07/2017 19:27:00 Keyed Aggregation -> Sink: Unnamed(1/1) switched to
FAILED
java.io.IOException: File or directory already exists. Existing files and
directories are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to
overwrite existing files and directories.
at
org.apache.flink.core.fs.FileSystem.initOutPathLocalFS(FileSystem.java:763)
at
org.apache.flink.core.fs.SafetyNetWrapperFileSystem.initOutPathLocalFS(SafetyNetWrapperFileSystem.java:135)
at
org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:231)
at
org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:78)
at
org.apache.flink.streaming.api.functions.sink.OutputFormatSinkFunction.open(OutputFormatSinkFunction.java:61)
at
org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
at
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:111)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:376)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
at java.lang.Thread.run(Thread.java:745)


On Mon, Aug 7, 2017 at 6:49 PM, Timo Walther <twalthr@apache.org> wrote:

> Make sure that the file exists and is accessible from all Flink tasks
> managers.
>
>
> Am 07.08.17 um 14:35 schrieb P. Ramanjaneya Reddy:
>
>> Thank you Timo.
>>
>>
>> root1@root1-HP-EliteBook-840-G2:~/NAI/Tools/BEAM/Flink_Clust
>> er/rama/flink$
>> *./bin/flink
>> run ./examples/streaming/WordCount.jar --input
>> file:///home/root1/hamlet.txt --output file:///home/root1/wordcount_out*
>>
>>
>>
>> Execution of worcountjar gives error...
>>
>> 08/07/2017 18:03:16 Source: Custom File Source(1/1) switched to FAILED
>> java.io.FileNotFoundException: The provided file path
>> file:/home/root1/hamlet.txt does not exist.
>> at
>> org.apache.flink.streaming.api.functions.source.ContinuousFi
>> leMonitoringFunction.run(ContinuousFileMonitoringFunction.java:192)
>> at
>> org.apache.flink.streaming.api.operators.StreamSource.run(
>> StreamSource.java:87)
>> at
>> org.apache.flink.streaming.api.operators.StreamSource.run(
>> StreamSource.java:55)
>> at
>> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.
>> run(SourceStreamTask.java:95)
>> at
>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(
>> StreamTask.java:263)
>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>> at java.lang.Thread.run(Thread.java:748)
>>
>>
>> On Mon, Aug 7, 2017 at 5:56 PM, Timo Walther <twalthr@apache.org> wrote:
>>
>> Hi Ramanji,
>>>
>>> you can find the source code of the examples here:
>>> https://github.com/apache/flink/blob/master/flink-examples/
>>> flink-examples-streaming/src/main/java/org/apache/flink/
>>> streaming/examples/wordcount/WordCount.java
>>>
>>> A general introduction how the cluster execution works can be found here:
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.4/
>>> concepts/programming-model.html#programs-and-dataflows
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.4/
>>> concepts/runtime.html
>>>
>>> It might also be helpful to have a look at the web interface which can
>>> show you a nice graph of the job.
>>>
>>> I hope this helps. Feel free to ask further questions.
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>> Am 07.08.17 um 14:00 schrieb P. Ramanjaneya Reddy:
>>>
>>> Hello Everyone,
>>>
>>>> I have followed the steps specified below link to Install & Run Apache
>>>> Flink on Multi-node Cluster.
>>>>
>>>> http://data-flair.training/blogs/install-run-deploy-flink-
>>>> multi-node-cluster/
>>>>    used flink-1.3.2-bin-hadoop27-scala_2.10.tgz for install
>>>>
>>>> using the command
>>>>    " bin/flink run
>>>> /home/root1/NAI/Tools/BEAM/Flink_Cluster/rama/flink/examples
>>>> /streaming/WordCount.jar"
>>>> able to run wordcount, but where can i see which input consider and
>>>> output
>>>> generated?
>>>>
>>>> and how can i specify the input and output paths?
>>>>
>>>> I'm trying to understand how the wordcount will work using Multi-node
>>>> Cluster.?
>>>>
>>>> any suggestions will help me further understanding?
>>>>
>>>> Thanks & Regards,
>>>> Ramanji.
>>>>
>>>>
>>>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message