flink-user mailing list archives

From Maximilian Michels <...@apache.org>
Subject Re: Wrong owner of HDFS output folder
Date Mon, 26 Oct 2015 14:35:29 GMT
The problem is that non-root processes may not be able to read root-owned
files/folders. Therefore, as a non-root user, we cannot really check
whether root-owned clusters have been started. It's better not to run Flink
with root permissions.

You're welcome.
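
Archive note: a minimal sketch, not part of the Flink scripts, of one way to spot the situation discussed in this thread (the JobManager running as root while the TaskManagers run as a regular user). It parses the text printed by `ps -eo user,pid,args`. The function names and the idea of matching the substrings "flink", "jobmanager" and "taskmanager" in the command line are assumptions made for illustration; adjust them to your installation.

```python
import subprocess
from collections import defaultdict

# Assumption: these substrings appear in the command line of the Flink daemons.
DAEMON_MARKERS = ("jobmanager", "taskmanager")


def flink_daemon_owners(ps_output):
    """Map each daemon kind to the set of users running it.

    ps_output is the text printed by `ps -eo user,pid,args`.
    """
    owners = defaultdict(set)
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # user, pid, full command line
        if len(parts) < 3:
            continue
        user, _pid, args = parts
        lowered = args.lower()
        if "flink" not in lowered:
            continue
        for marker in DAEMON_MARKERS:
            if marker in lowered:
                owners[marker].add(user)
    return dict(owners)


def mixed_owners(owners):
    """Return True when the daemons found do not all run as one user."""
    users = set()
    for daemon_users in owners.values():
        users |= daemon_users
    return len(users) > 1


def current_ps_output():
    """Run ps on this host and return its output (not called automatically)."""
    result = subprocess.run(
        ["ps", "-eo", "user,pid,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `mixed_owners(flink_daemon_owners(current_ps_output()))` on the JobManager host would return True in the case above, which is a hint to stop the root-owned process and restart the cluster as the intended user.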


On Mon, Oct 26, 2015 at 3:23 PM, Flavio Pompermaier <pompermaier@okkam.it>
wrote:

> I just stopped the cluster with stop-cluster.sh, but I had to manually kill
> the root process because the script was not able to terminate it.
> Then I restarted the cluster via start-cluster.sh, and now all processes
> run as the user they were supposed to. Probably once in the past I started
> the services with sudo, and after that I believed I was restarting the
> cluster with the start/stop scripts, but the job manager was never actually
> restarted.
> However I didn't get any error about that, I was just reading
> "No jobmanager daemon (pid: XXXX) is running anymore on myhost.test.it"
> Maybe the scripts could be improved to check such a situation?
> Thanks for the support,
> Flavio
> On Mon, Oct 26, 2015 at 3:14 PM, Flavio Pompermaier <pompermaier@okkam.it>
> wrote:
>> Yes, the job manager starts as a root process, while the taskmanagers run
>> as my user. Is that normal?
>> I was convinced that start-cluster.sh was starting all processes with the
>> same user :O
>> On Mon, Oct 26, 2015 at 3:09 PM, Maximilian Michels <mxm@apache.org>
>> wrote:
>>> Hi Flavio,
>>> Are you running your Flink cluster with root permissions? The directory
>>> that holds the output splits is created by the JobManager. So if you run
>>> the JobManager with root permissions, it will create a folder owned by
>>> root. If the task managers are not run with root permissions, this could
>>> be a problem.
>>> Cheers,
>>> Max
>>> On Mon, Oct 26, 2015 at 2:40 PM, Flavio Pompermaier <
>>> pompermaier@okkam.it> wrote:
>>>> Hi to all,
>>>> when I run my job within my hadoop cluster (both from command line and
>>>> from webapp) the output of my job (HDFS) works fine as long as I set the
>>>> write parallelism to 1 (the output file is created with the user running
>>>> the job). If I leave the default parallelism (>1) the job fails because
>>>> it creates an output folder owned by the root user, and the job can no
>>>> longer write my user's files into that folder. Am I doing something
>>>> wrong?
>>>> Best,
>>>> Flavio
