flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fabian Hueske <fhue...@apache.org>
Subject Re: Looking for instructions & source for flink-java-examples-0.7.0-incubating-WebLogAnalysis.jar
Date Wed, 05 Nov 2014 19:11:08 GMT
Hi Anirvan,

just checked the data.
The data you use and the WebLogAnalysis example program do not work well
together and do not give you any results.
All tuples are removed by filters or joins.

Best, Fabian

2014-11-05 19:42 GMT+01:00 Fabian Hueske <fhueske@apache.org>:

> Hi Anirvan,
>
> actually, the execution logs look good. It is possible, that the provided
> data just does not "match" the code of the WebLogAnalysis example program.
> Maybe some filters are too selective. I will check that and let you know
> the result.
> Have you tried to run any other job such as WordCount?
>
> To answer your questions:
> - with ./log/, I refered to the log directory of your Flink setup. This
> directory contains .out files into which the stdout of the JobManager and
> the TaskManager processes is redirected.
> - It is possible to change the files into which the stdout is redirected.
> However, you have to manually adapt the bash start scripts for that.
>
> 2014-11-05 15:26 GMT+01:00 Anirvan BASU <anirvan.basu@inria.fr>:
>
>> Hello Fabien and everyone,
>>
>> In my previous post, I missed some of your questions from your last email.
>>
>> Here are my replies:
>> Have you checked the local file systems on all workers for output?
>> Yes, I did (in the case of using "file:///address/to/local/file" using
>> NFS). They were the same empty files.
>>
>> Did the job process any data at all? The jobs finishes within 1 second
>> (which is still possible for very small input data).
>> The data that was used was provided to me by robert Metzger. Please see
>> the link here:
>> https://github.com/rmetzger/scratch/tree/weblogdataexample/weblog
>> Actually, the first lines in the "rank" file had some problem with the
>> separators '|' It may be due to difference in coding between Linux machines
>> ... the programme would end up with some error always.
>> So I deleted the top few lines and then the programme finished with code
>> FINISHED but empty files :-((
>>
>> You can change the example program to write its output to the stdout by
>> replacing the writeAsCSV() by print(). The stdout of all workers is
>> redirected to the ./log/*.out files.
>> Question to you: What is the location of this stdtout ./log/*  ? I could
>> not find it anywhere - neither in my local directories nor in the system
>> root.
>> Question to you: Is it possible to change the location of the stdout by
>> changing the conf file flink-conf.yaml ? Which exact parameter should I
>> change ?
>>
>> Thanks in advance for all your help,
>> Anirvan
>>
>>
>> ------------------------------
>>
>> *From: *"Fabian Hueske" <fhueske@apache.org>
>> *To: *user@flink.incubator.apache.org
>> *Sent: *Tuesday, November 4, 2014 4:28:40 PM
>> *Subject: *Re: Looking for instructions & source for
>> flink-java-examples-0.7.0-incubating-WebLogAnalysis.jar
>>
>> Hi Anirvan,
>>
>> you specify input and output as files in the local file system
>> (file:///). Each worker needs access to the all input files, which means
>> that each worker needs (a copy of) these files in its local file system.
>> The common setup to use Flink in a distributed cluster is to use a
>> distributed data store such as HDFS (or a data store that can be accessed
>> by each node).
>> Using a shared file system (like NFS) that is mounted into each worker
>> would work, but remember, that all nodes will concurrently read and write
>> to the shared system.
>>
>> Have you checked the local file systems on all workers for output?
>> Did the job process any data at all? The jobs finishes within 1 second
>> (which is still possible for very small input data).
>>
>> You can change the example program to write its output to the stdout by
>> replacing the writeAsCSV() by print(). The stdout of all workers is
>> redirected to the ./log/*.out files.
>>
>> Best, Fabian
>>
>> 2014-11-04 16:08 GMT+01:00 Anirvan BASU <anirvan.basu@inria.fr>:
>>
>>> Hello Robert, Stephan et al,
>>>
>>> Hope you are doing fine in Berlin.
>>>
>>> I am getting back to you on my previous problem on the WebLogAnalysis
>>> example, after a long time.
>>>
>>> We are currently using Flink 0.7.0 over a 10-node cluster in
>>> Manager-Worker configuration.
>>>
>>> We ran the following command:
>>> $ ./flink/bin/flink run
>>> flink/examples/flink-java-examples-0.7.0-incubating-WebLogAnalysis.jar
>>> file:///home/abasu/examples/Weblogs/documents
>>> file:///home/abasu/examples/Weblogs/ranks
>>> file:///home/abasu/examples/Weblogs/visits
>>> file:///home/abasu/examples/Weblogs/result
>>>
>>> For the documents, rank and visits files, we used the data generated by
>>> you from this link:
>>> https://github.com/rmetzger/scratch/tree/weblogdataexample/weblog
>>>
>>> The program executed with the following output:
>>> 11/04/2014 14:58:12:    Job execution switched to status RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (1/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (1/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (2/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (2/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (3/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (3/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (4/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (4/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (5/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (5/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (6/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (6/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (7/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (7/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (8/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (8/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (9/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (9/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (1/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (1/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (2/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (2/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (3/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (3/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (4/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (4/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (5/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (5/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (6/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (6/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (7/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (7/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (8/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (8/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (9/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (9/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (1/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (1/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (2/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (2/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (3/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (3/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (4/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (4/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (5/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (5/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (6/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (1/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (6/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (7/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (7/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (8/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (8/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (9/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (9/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (2/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (5/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (6/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (7/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (8/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (2/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (9/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (5/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (6/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (8/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (7/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (3/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (9/9) switched to RUNNING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (1/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (1/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (2/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (2/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (8/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (2/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (3/9) switched to RUNNING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (2/9) switched to RUNNING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (3/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (3/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (1/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (1/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (1/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (3/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (1/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (4/9) switched to RUNNING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (1/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (4/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (5/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (6/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (4/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (7/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (9/9) switched to RUNNING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (3/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (1/9) switched to RUNNING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (4/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (4/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (6/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (6/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (5/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (5/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (6/9) switched to RUNNING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (5/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (2/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (2/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (4/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (2/9) switched to RUNNING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (7/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (7/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (8/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (8/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (7/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (3/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (3/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (8/9) switched to RUNNING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (9/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (9/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (3/9) switched to RUNNING
>>> 11/04/2014 14:58:12:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (9/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (4/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (4/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (4/9) switched to RUNNING
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (5/9) switched to SCHEDULED
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (5/9) switched to DEPLOYING
>>> 11/04/2014 14:58:12:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (5/9) switched to RUNNING
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (6/9) switched to SCHEDULED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (6/9) switched to DEPLOYING
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (6/9) switched to RUNNING
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (7/9) switched to SCHEDULED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (7/9) switched to DEPLOYING
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (7/9) switched to RUNNING
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (8/9) switched to SCHEDULED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (8/9) switched to DEPLOYING
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (8/9) switched to RUNNING
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (9/9) switched to SCHEDULED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (9/9) switched to DEPLOYING
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (9/9) switched to RUNNING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (7/9) switched to
>>> SCHEDULED
>>> 11/04/2014 14:58:13:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (7/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (7/9) switched to
>>> DEPLOYING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (8/9) switched to
>>> SCHEDULED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (8/9) switched to
>>> DEPLOYING
>>> 11/04/2014 14:58:13:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (8/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (5/9) switched to
>>> SCHEDULED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (5/9) switched to
>>> DEPLOYING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (6/9) switched to
>>> SCHEDULED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (6/9) switched to
>>> DEPLOYING
>>> 11/04/2014 14:58:13:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (5/9) switched to FINISHED
>>> 11/04/2014 14:58:13:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (4/9) switched to FINISHED
>>> 11/04/2014 14:58:13:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (6/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (4/9) switched to
>>> SCHEDULED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (7/9) switched to
>>> RUNNING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (4/9) switched to
>>> DEPLOYING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (3/9) switched to
>>> SCHEDULED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (3/9) switched to
>>> DEPLOYING
>>> 11/04/2014 14:58:13:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (3/9) switched to FINISHED
>>> 11/04/2014 14:58:13:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (2/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (2/9) switched to
>>> SCHEDULED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (2/9) switched to
>>> DEPLOYING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (1/9) switched to
>>> SCHEDULED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (1/9) switched to
>>> DEPLOYING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (5/9) switched to
>>> RUNNING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (4/9) switched to
>>> RUNNING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (3/9) switched to
>>> RUNNING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (2/9) switched to
>>> RUNNING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (8/9) switched to
>>> RUNNING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (6/9) switched to
>>> RUNNING
>>> 11/04/2014 14:58:13:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (1/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (1/9) switched to
>>> RUNNING
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (4/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (2/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (5/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (9/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (4/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (8/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (6/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (2/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (5/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (7/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (9/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (6/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (7/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (8/9) switched to FINISHED
>>> 11/04/2014 14:58:13:
>>> Join(org.apache.flink.api.java.operators.JoinOperator$ProjectFlatJoinFunction)
>>> (9/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (9/9) switched to
>>> SCHEDULED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (9/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (9/9) switched to
>>> DEPLOYING
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (4/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (2/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (7/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (6/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (5/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (8/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (1/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/documents) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterDocByKeyWords)
>>> -> Map (Projection [0]) (3/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (1/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (1/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/visits) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterVisitsByDate)
>>> -> Map (Projection [0]) (3/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CHAIN DataSource (CSV Input (|)
>>> file:/home/abasu/examples/Weblogs/ranks) -> Filter
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$FilterByRank)
>>> (3/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (9/9) switched to
>>> RUNNING
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (7/9) switched to
>>> FINISHED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (7/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (8/9) switched to
>>> FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (5/9) switched to
>>> FINISHED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (8/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (6/9) switched to
>>> FINISHED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (5/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (6/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (4/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (4/9) switched to
>>> FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (1/9) switched to
>>> FINISHED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (1/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (3/9) switched to
>>> FINISHED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (3/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (2/9) switched to
>>> FINISHED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (2/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    DataSink(CsvOutputFormat (path:
>>> file:/home/abasu/examples/Weblogs/result, delimiter: |)) (9/9) switched to
>>> FINISHED
>>> 11/04/2014 14:58:13:    CoGroup
>>> (org.apache.flink.examples.java.relational.WebLogAnalysis$AntiJoinVisits)
>>> (9/9) switched to FINISHED
>>> 11/04/2014 14:58:13:    Job execution switched to status FINISHED
>>>
>>> The following directory was created:
>>> /home/abasu/examples/Weblogs/result
>>> with 9 files (named 1 to 9)
>>> All these files are empty!
>>>
>>> Hence my naive question: Is this the expected output ? Or what should be
>>> the expected output for an error-free run ?
>>>
>>> Please let me know where we are going wrong?
>>> If possible do you have other data generated to try the WebLogAnalysis
>>> example ?
>>>
>>> Thanks in advance for your advice and help,
>>> Anirvan
>>>
>>>
>>>
>>>>>
>>>>>
>>>>>>>> Le 23/09/2014 17:22, rmetzger0 [via Apache Flink (Incubator) User
>>>>>>>> Mailing List archive.] a écrit :
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> you have to use the "WebLogDataGenerator" found here:
>>>>>>>> https://github.com/apache/incubator-flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/relational/util/WebLogDataGenerator.java
>>>>>>>>
>>>>>>>> It accepts two arguments, the number of documents and visits.
>>>>>>>> The generated files are located in /tmp/documents /tmp/ranks and
>>>>>>>> /tmp/visits.
>>>>>>>> I've generated some sample data for you, located here:
>>>>>>>> https://github.com/rmetzger/scratch/tree/weblogdataexample/weblog
>>>>>>>>
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Robert
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Sep 23, 2014 at 4:05 PM, nirvanesque [via Apache Flink
>>>>>>>> (Incubator) User Mailing List archive.] <[hidden email]
>>>>>>>> <http://user/SendEmail.jtp?type=node&node=100&i=0>> wrote:
>>>>>>>>
>>>>>>>>> Hello Robert,
>>>>>>>>>
>>>>>>>>> Thanks as usual for all your help with the information.
>>>>>>>>>
>>>>>>>>> I'm trying in vain to create the different input files from the
>>>>>>>>> program source code but running into difficulties.
>>>>>>>>>
>>>>>>>>> Could you (or anyone else) please post here samples of the 4
>>>>>>>>> inputs that are required to run this program ?
>>>>>>>>>
>>>>>>>>> Thanks in advance,
>>>>>>>>> Anirvan
>>>>>>>>>
>>>>>>>>> Le 09/09/2014 23:54, rmetzger0 [via Apache Flink (Incubator) User
>>>>>>>>> Mailing List archive.] a écrit :
>>>>>>>>>
>>>>>>>>> Hi Anirvan,
>>>>>>>>>
>>>>>>>>> sorry for the late response. You've posted the question to Nabble,
>>>>>>>>> which is only a mirror of our actual mailing list at [hidden
>>>>>>>>> email] <http://user/SendEmail.jtp?type=node&node=99&i=0>. Sadly,
>>>>>>>>> the message is not automatically posted to the apache list because the
>>>>>>>>> apache server is rejecting the mails from nabble.
>>>>>>>>>  I've already asked and there is no way to change this behavior.
>>>>>>>>> So I actually saw the two messages you posted here by accident.
>>>>>>>>>
>>>>>>>>> Regarding your actual question:
>>>>>>>>> - The command line arguments for the WebLogAnalysis example are:
>>>>>>>>>    "WebLogAnalysis <documents path> <ranks path> <visits
>>>>>>>>> path> <result path>"
>>>>>>>>>
>>>>>>>>> - Regarding the "info -d" command. I think its an artifact from
>>>>>>>>> our old java API. I've filed an issue in JIRA:
>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-1095 Lets see how we
>>>>>>>>> resolve it. Thanks for reporting this!
>>>>>>>>>
>>>>>>>>> You can find the source code of all of our examples in the source
>>>>>>>>> release of Flink (in the flink-examples/flink-java-examples project. You
>>>>>>>>> can also access the source (and hence the examples) through GitHub:
>>>>>>>>> https://github.com/apache/incubator-flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/example/java/relational/WebLogAnalysis.java.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> To build the examples, you can run: "mvn clean package
>>>>>>>>> -DskipTests" in the "flink-examples/flink-java-examples" directory. This
>>>>>>>>> will re-build them.
>>>>>>>>>
>>>>>>>>> If you don't want to import the whole Flink project just for
>>>>>>>>> playing around with the examples, you can also create an empty maven
>>>>>>>>> project. This script:
>>>>>>>>> curl
>>>>>>>>> https://raw.githubusercontent.com/apache/incubator-flink/master/flink-quickstart/quickstart.sh |
>>>>>>>>> bash
>>>>>>>>>
>>>>>>>>> will automatically set everything up for you. Just import the
>>>>>>>>> "quickstart" project into Eclipse or IntelliJ. It will download all
>>>>>>>>> dependencies and package everything correctly. If you want to use an
>>>>>>>>> example there, just copy the Java file into the "quickstart" project.
>>>>>>>>>
>>>>>>>>> The examples are indeed a very good way to learn how to write
>>>>>>>>> Flink jobs.
>>>>>>>>>
>>>>>>>>> Please continue asking if you have further questions!
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Robert
>>>>>>>>>
>>>>>>>>>
>>>
>>>
>>
>>
>

Mime
View raw message