flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: HDFS directory rename
Date Wed, 22 Jul 2015 11:17:58 GMT
listStatus() should return an empty array for an empty directory.
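
A minimal sketch of that check applied recursively, for the cleanup of empty directories asked about in this thread. It is an illustration under assumptions, not Flink code: java.nio.file stands in for Flink's FileSystem so the example runs with the JDK alone, and the names EmptyDirCleaner and pruneEmptyDirs are made up for this example. With Flink's API the same shape would presumably use FileSystem.listStatus(Path) for the listing, FileStatus.isDir() for the directory test, and a non-recursive FileSystem.delete(Path, false) for the removal.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; java.nio.file is a stand-in for Flink's FileSystem. */
final class EmptyDirCleaner {

    /**
     * Deletes every empty directory below {@code dir}, bottom-up.
     * Returns true if {@code dir} itself has no entries left afterwards.
     * {@code dir} is never deleted here; the caller decides what to do with it.
     */
    static boolean pruneEmptyDirs(Path dir) throws IOException {
        // Collect the children first so the listing is closed before deleting.
        List<Path> children = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path child : stream) {
                children.add(child);
            }
        }

        boolean empty = true;
        for (Path child : children) {
            boolean isDir = Files.isDirectory(child, LinkOption.NOFOLLOW_LINKS);
            if (isDir && pruneEmptyDirs(child)) {
                // The subdirectory has nothing left in it, so remove it.
                Files.delete(child);
            } else {
                // A file, or a directory that still holds something.
                empty = false;
            }
        }
        return empty;
    }
}
```

The starting directory is left in place; the return value tells the caller whether it is now empty and could be removed as well.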
On Jul 22, 2015 13:11, "Flavio Pompermaier" <pompermaier@okkam.it> wrote:

> I can detect if it's a dir but how can I detect if it's empty?
>
> On Wed, Jul 22, 2015 at 12:49 PM, Fabian Hueske <fhueske@gmail.com> wrote:
>
>> How about FileStatus[] FileSystem.listStatus()?
>> FileStatus gives the length of a file, the path, whether it's a dir, etc.
>>
>> 2015-07-22 11:04 GMT+02:00 Flavio Pompermaier <pompermaier@okkam.it>:
>>
>>> Ok. What I'm still not able to do is recursively remove empty dirs from
>>> the source dir because there's no API for getChildrenCount() or
>>> getChildren() for a given Path.
>>> How can I do that?
>>>
>>> On Tue, Jul 21, 2015 at 3:13 PM, Stephan Ewen <sewen@apache.org> wrote:
>>>
>>>> I don't think there is a simpler way to do this.
>>>>
>>>> Flink follows the semantics of Hadoop's HDFS file system there (which
>>>> behaves that way), and of the Java File class.
>>>>
>>>> But it seems your solution is working, even if it needs a few extra
>>>> lines of code.
>>>>
>>>> On Fri, Jul 17, 2015 at 11:17 AM, Flavio Pompermaier <pompermaier@okkam.it> wrote:
>>>>
>>>>> Of course I move the folder before the job starts or ends :)
>>>>> My job does some transformation on the row data and puts the results in
>>>>> another folder.
>>>>> The next time the job is executed, it checks whether the output folder
>>>>> exists and, if so, moves that folder to an archive dir.
>>>>> I wanted to use the Flink client because it is FS independent, so I can
>>>>> choose which FS to use at runtime.
>>>>> At the moment what I do is:
>>>>>
>>>>> Path dataSourceArchivePath = new Path(rowChunksArchiveBaseDir, dataSourceId);
>>>>>
>>>>> dataSourceArchivePath.getFileSystem().mkdirs(dataSourceArchivePath.getParent());
>>>>> boolean moved = dataSourceArchivePath.getFileSystem()
>>>>>     .rename(dataSourceDirPath, dataSourceArchivePath.getParent());
>>>>> LOG.info("Archiving {} to {} {}",
>>>>>     dataSourceDirPath, dataSourceArchivePath, moved ? "successful" : "failed");
>>>>>
>>>>> Moreover I still have to delete the empty subPaths of
>>>>> the dataSourceArchivePath after the move but I can't do that because
>>>>> there's no listChildren() on the Path object :(
>>>>> I was looking for a simpler way to do this. Does one exist?
>>>>>
>>>>> On Fri, Jul 17, 2015 at 10:08 AM, <fhueske@gmail.com> wrote:
>>>>>
>>>>>> Do you want to move the folder within a running job? This might
>>>>>> cause a lot of problems, because you cannot (easily) control when a
>>>>>> move command would be executed.
>>>>>>
>>>>>> Wouldn’t it be a better idea to do that after a job is finished and
>>>>>> use the regular HDFS client?
>>>>>>
>>>>>> *From:* Flavio Pompermaier <pompermaier@okkam.it>
>>>>>> *Sent:* Friday, 17 July 2015 10:02
>>>>>> *To:* user@flink.apache.org
>>>>>>
>>>>>> Hi to all,
>>>>>>
>>>>>> in my Flink job I wanted to move a folder (containing other folders
>>>>>> and files) to another location.
>>>>>> For example, I wanted to move folder A to folder Y, where my HDFS
>>>>>> looks like:
>>>>>>
>>>>>> myRootDir/X/a/aa/aaa/someFile1
>>>>>> myRootDir/X/b/bb/bbb/someFile2
>>>>>> myRootDir/Y
>>>>>>
>>>>>> I tried to use rename but it silently fails (rename just returns
>>>>>> false) if the parent directory doesn't exist.
>>>>>> Is there an easy way to do that with the Flink FS APIs?
>>>>>> If rename() is intended to work that way, wouldn't a move() API be
>>>>>> useful?
>>>>>>
>>>>>> Best,
>>>>>> Flavio
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>
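
For the original question (rename() returning false when the target's parent directory is missing), the workaround in the thread amounts to "create the parent first, then rename". A rough sketch of that pattern follows, with the same caveats as any illustration: java.nio.file is used as a stand-in so it runs with the JDK alone, and DirArchiver and moveWithParents are hypothetical names. Unlike the rename() described in the thread, Files.move signals failure with an exception rather than a false return; with Flink's FileSystem the corresponding steps would presumably be mkdirs(target.getParent()) followed by rename(source, target), checking the boolean result.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; java.nio.file is a stand-in for Flink's FileSystem. */
final class DirArchiver {

    /**
     * Moves {@code source} (a file or a whole directory tree) to {@code target},
     * creating any missing parent directories of {@code target} first.
     * Returns false if {@code source} does not exist, true after a successful move.
     * Assumes both paths are on the same file store; moving a non-empty
     * directory across file stores is not supported by Files.move.
     */
    static boolean moveWithParents(Path source, Path target) throws IOException {
        if (!Files.exists(source)) {
            return false; // nothing to archive
        }
        Path parent = target.getParent();
        if (parent != null) {
            // No-op if the parent chain already exists.
            Files.createDirectories(parent);
        }
        // Throws FileAlreadyExistsException if target is already there.
        Files.move(source, target);
        return true;
    }
}
```

With the layout from the first message, moveWithParents(myRootDir/X, myRootDir/Y/X) would leave someFile1 at myRootDir/Y/X/a/aa/aaa/someFile1.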
