flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: Checking for existence of output directory/files before running a batch job
Date Fri, 19 Aug 2016 13:22:12 GMT
Oops. Looks like Google Mail / Apache / the internet needs 13 minutes to
deliver an email.
Sorry for the double answer.

On Fri, Aug 19, 2016 at 3:07 PM, Maximilian Michels <mxm@apache.org> wrote:

> Hi Niels,
>
> Have you tried specifying the fully-qualified path? The default is the
> local file system.
>
> For example, hdfs:///path/to/foo
>
> If that doesn't work, do you have the same Hadoop configuration on the
> machine where you test?
>
> Cheers,
> Max
>
> On Thu, Aug 18, 2016 at 2:02 PM, Niels Basjes <Niels@basjes.nl> wrote:
> > Hi,
> >
> > I have a batch job that I run on yarn that creates files in HDFS.
> > I want to avoid running this job at all if the output already exists.
> >
> > So in my code (before submitting the job into yarn-session) I do this:
> >
> >     String directoryName = "foo";
> >
> >     Path directory = new Path(directoryName);
> >     FileSystem fs = directory.getFileSystem();
> >
> >     if (!fs.exists(directory)) {
> >
> >         // run the job
> >
> >     }
> >
> > What I found is that this code apparently checks the 'wrong' file
> > system. (I always get 'false' even if it exists in hdfs)
> >
> > I checked the API of the execution environment yet I was unable to get
> > the 'correct' filesystem from there.
> >
> > What is the proper way to check this?
> >
> >
> > --
> > Best regards / Met vriendelijke groeten,
> >
> > Niels Basjes
>
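
To illustrate Max's point about fully-qualified paths, here is a minimal sketch using only java.net.URI from the JDK, not Flink's own Path or FileSystem classes. The class name PathSchemeCheck and the helper schemeOf are hypothetical, and the fallback to "file" is an assumption that mirrors the "default is the local file system" behaviour described above. It only shows why "foo" and "hdfs:///path/to/foo" can end up on different file systems.

```java
import java.net.URI;

public class PathSchemeCheck {

    // Hypothetical helper: returns the scheme of a path string.
    // Assumption: a path without a scheme falls back to the local
    // file system ("file"), as described in the reply above.
    static String schemeOf(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null ? "file" : scheme;
    }

    public static void main(String[] args) {
        // Relative name without a scheme: treated as a local path here.
        System.out.println(schemeOf("foo"));                  // prints "file"
        // Fully-qualified path: the scheme selects HDFS explicitly.
        System.out.println(schemeOf("hdfs:///path/to/foo"));  // prints "hdfs"
    }
}
```

So if the existence check is meant to look at HDFS, passing the fully-qualified form to the Path constructor should avoid any dependence on the default.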
