flink-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: HDFS directory rename
Date Fri, 17 Jul 2015 09:17:18 GMT
Of course I move the folder before the job starts or after it ends :)
My job does some transformation on the row data and puts the results in
another folder.
The next time the job is executed, it checks whether the output folder exists
and, if so, moves that folder to an archive dir.
I wanted to use the Flink client because it is FS-independent, so I can choose
which FS to use at runtime.
At the moment what I do is:

Path dataSourceArchivePath = new Path(rowChunksArchiveBaseDir, dataSourceId);
dataSourceArchivePath.getFileSystem().mkdirs(dataSourceArchivePath.getParent());
boolean moved = dataSourceArchivePath.getFileSystem().rename(
    dataSourceDirPath, dataSourceArchivePath.getParent());
LOG.info("Archiving {} to {} {}", dataSourceDirPath, dataSourceArchivePath,
    moved ? "successful" : "failed");
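
For comparison, the same "create the parent first, then move" pattern can be sketched on a local filesystem with java.nio.file. This is only an illustration of the idea, not the Flink FileSystem API; the class name ArchiveMove and the method moveCreatingParents are hypothetical. Unlike rename(), Files.move reports failure with an exception instead of returning false.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: local-filesystem analog of mkdirs(parent) + rename().
final class ArchiveMove {

    /** Moves source to target, creating target's missing parent directories first. */
    static void moveCreatingParents(Path source, Path target) throws IOException {
        Path parent = target.getParent();
        if (parent != null) {
            // Does nothing if the directory chain already exists.
            Files.createDirectories(parent);
        }
        // Throws an IOException on failure rather than returning false.
        Files.move(source, target);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("archive-demo");
        Files.createDirectories(root.resolve("X/a/aa"));
        Files.createFile(root.resolve("X/a/aa/someFile1"));

        // "Y/archive" does not exist yet; the helper creates it before moving.
        Path target = root.resolve("Y/archive/a");
        moveCreatingParents(root.resolve("X/a"), target);
        System.out.println(Files.exists(target.resolve("aa/someFile1")));
    }
}
```

The move here stays within one filesystem, where a non-empty directory can be moved as a plain rename; moving across filesystems would need a recursive copy instead.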

Moreover, I still have to delete the empty subpaths of
the dataSourceArchivePath after the move, but I can't do that because
there's no listChildren() on the Path object :(
I was looking for a simpler way to do this. Does one exist?
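
To make the cleanup step concrete, here is a minimal sketch of pruning empty directories, again written against java.nio.file on a local filesystem rather than the Flink API. The names EmptyDirPruner and pruneEmptyDirs are hypothetical. With Flink, the directory listing would presumably come from the FileSystem object rather than from Path (for example a listStatus(Path) call), which is an assumption to check against the Flink version in use.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

// Hypothetical helper: removes directories that contain no files at any depth.
final class EmptyDirPruner {

    /**
     * Recursively deletes empty directories under dir (depth-first).
     * Returns true if dir itself ended up empty and was deleted.
     */
    static boolean pruneEmptyDirs(Path dir) throws IOException {
        boolean empty = true;
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                // A child directory that could be pruned no longer counts as content.
                if (Files.isDirectory(child, LinkOption.NOFOLLOW_LINKS)
                        && pruneEmptyDirs(child)) {
                    continue;
                }
                // Regular files and non-empty directories keep dir alive.
                empty = false;
            }
        }
        if (empty) {
            Files.delete(dir);
        }
        return empty;
    }
}
```

Calling pruneEmptyDirs on the old source directory after the move removes the chain of now-empty folders and leaves any directory that still holds a file untouched.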

On Fri, Jul 17, 2015 at 10:08 AM, <fhueske@gmail.com> wrote:

>  Do you want to move the folder within a running job? This might cause a
> lot of problems, because you cannot (easily) control when a move
> command would be executed.
>
> Wouldn’t it be a better idea to do that after a job is finished and use
> the regular HDFS client?
>
> *From:* Flavio Pompermaier <pompermaier@okkam.it>
> *Sent:* Friday, 17. July, 2015 10:02
> *To:* user@flink.apache.org
>
> Hi to all,
>
> in my Flink job I wanted to move a folder (containing other folders and
> files) to another location.
> For example, I wanted to move folder A to folder Y, where my HDFS looks
> like:
>
> myRootDir/X/a/aa/aaa/someFile1
> myRootDir/X/b/bb/bbb/someFile2
> myRootDir/Y
>
> I tried to use rename but it silently fails (rename just returns false) if
> the parent directory doesn't exist.
> Is there an easy way to do that with the Flink FS APIs?
> If rename() is intended to work that way, wouldn't a move() API be
> useful?
>
> Best,
> Flavio
>
>
