flink-user mailing list archives

From Mihail Vieru <vi...@informatik.hu-berlin.de>
Subject Re: writeAsCsv not writing anything on HDFS when WriteMode set to OVERWRITE
Date Wed, 01 Jul 2015 18:30:55 GMT
Hi Max,

thank you for your reply. I wanted to double-check and rule out all other 
factors before writing back. I've attached my code and sample input 
data.

I run the APSPNaiveJob using the following arguments:

0 100 hdfs://path/to/vertices-test-100 hdfs://path/to/edges-test-100 
hdfs://path/to/tempgraph 10 0.5 hdfs://path/to/output-apsp 9

I was wrong: I originally thought that the first writeAsCsv call (line 
50) didn't work. It does; without WriteMode.OVERWRITE, an exception is 
thrown when the file already exists.

The problem actually lies with the second call (line 74), which tries to 
write to the same path on HDFS.
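
The write-mode behaviour described above (an exception when the target
exists and no overwrite is requested) can be illustrated without a cluster.
The following is only a local-filesystem analogy in plain Java, not Flink or
HDFS code and not the attached job: the class WriteModeSketch, its write
helper and the file name are made up for illustration, and java.nio open
options stand in for the two write modes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class WriteModeSketch {

    // Hypothetical helper (not a Flink API): writes text to target.
    // overwrite == false: fails if the file already exists.
    // overwrite == true: replaces any existing content.
    static void write(Path target, String text, boolean overwrite) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        if (overwrite) {
            Files.write(target, bytes,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING,
                    StandardOpenOption.WRITE);
        } else {
            // CREATE_NEW throws FileAlreadyExistsException if target exists.
            Files.write(target, bytes,
                    StandardOpenOption.CREATE_NEW,
                    StandardOpenOption.WRITE);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("write-mode-sketch");
        Path target = dir.resolve("vertices.csv");

        // First write: the file does not exist yet, so this succeeds.
        write(target, "1;0.0\n", false);

        // Second write to the same path without overwrite is rejected.
        try {
            write(target, "1;1.0\n", false);
        } catch (FileAlreadyExistsException e) {
            System.out.println("no-overwrite write rejected: file exists");
        }

        // With overwrite, the previous content is replaced.
        write(target, "1;2.0\n", true);
        System.out.print(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

Run as is, the second write without overwrite is rejected, and the final
overwrite leaves only the last row in the file. It does not reproduce the
empty-output problem on HDFS; it only pins down what each mode is expected
to do.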

This issue is blocking me, because I need to persist the vertices 
dataset between iterations.

Cheers,
Mihail

P.S.: I'm using the latest 0.10-SNAPSHOT and HDFS 1.2.1.


On 30.06.2015 16:51, Maximilian Michels wrote:
> Hi Mihail,
>
> Thank you for your question. Do you have a short example that 
> reproduces the problem? It is hard to find the cause without an error 
> message or some example code.
>
> I wonder how your loop works without WriteMode.OVERWRITE because it 
> should throw an exception in this case. Or do you change the file 
> names on every write?
>
> Cheers,
> Max
>
> On Tue, Jun 30, 2015 at 3:47 PM, Mihail Vieru 
> <vieru@informatik.hu-berlin.de> wrote:
>
>     I think my problem is related to a loop in my job.
>
>     Before the loop, the writeAsCsv method works fine, even in
>     overwrite mode.
>
>     In the loop, in the first iteration, it writes an empty folder
>     containing empty files to HDFS, even though the DataSet it is
>     supposed to write contains elements.
>
>     Needless to say, this doesn't occur in a local execution
>     environment, when writing to the local file system.
>
>
>     I would appreciate any input on this.
>
>     Best,
>     Mihail
>
>
>
>     On 30.06.2015 12:10, Mihail Vieru wrote:
>>     Hi Till,
>>
>>     thank you for your reply.
>>
>>     I have the following code snippet:
>>
>>     intermediateGraph.getVertices().writeAsCsv(tempGraphOutputPath,
>>     "\n", ";", WriteMode.OVERWRITE);
>>
>>     When I remove the WriteMode parameter, it works, so I can conclude
>>     that the DataSet contains data elements.
>>
>>     Cheers,
>>     Mihail
>>
>>
>>     On 30.06.2015 12:06, Till Rohrmann wrote:
>>>
>>>     Hi Mihail,
>>>
>>>     have you checked that the |DataSet| you want to write to HDFS
>>>     actually contains data elements? You can try calling |collect|
>>>     which retrieves the data to your client to see what’s in there.
>>>
>>>     Cheers,
>>>     Till
>>>
>>>
>>>     On Tue, Jun 30, 2015 at 12:01 PM, Mihail Vieru
>>>     <vieru@informatik.hu-berlin.de> wrote:
>>>
>>>         Hi,
>>>
>>>         the writeAsCsv method is not writing anything to HDFS
>>>         (version 1.2.1) when the WriteMode is set to OVERWRITE.
>>>         A file is created, but it is empty, and there is no trace of
>>>         errors in the Flink or Hadoop logs on any node in the cluster.
>>>
>>>         What could cause this issue? I really need this feature.
>>>
>>>         Best,
>>>         Mihail
>>>
>>>
>>
>
>

