mesos-user mailing list archives

From Qian Zhang <zhq527...@gmail.com>
Subject Re: Volume ownership and permission
Date Sat, 18 Aug 2018 03:41:15 GMT
Thanks James for your comments! Please see my replies inline.

> I assume that this scheme will only be supported on Linux, due to the
> dependencies on the Linux ACLs and supplementary group behaviour?

Yes.

> Rewriting ACLs on volumes at each container launch sounds hugely
> expensive. It's IOP-bound process and there are an effectively unbounded
> number of files in the volume. Would this serialize container cleanup?

My original thinking was that we would set ACLs only on the volume root dir
for each container. Do you think we need to do it for each file and sub-dir
under the volume root dir? If so, then instead of setting ACLs, another
option is: the first time a volume is used by a container, the volume ACL
manager generates a unique GID for the volume rather than for the container,
and does the two steps below for the volume root dir and for each file and
sub-dir under it.

   1. Change the owning group to the allocated GID.
   2. Set the `setgid` bit (only on directories, not on files).
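
To make the two steps concrete, here is a minimal Python sketch of what they
could look like. It is only an illustration, not the proposed Mesos
implementation: the function name `prepare_volume` is hypothetical, and it
assumes the caller is allowed to change the group of every entry in the
volume.

```python
import os
import stat


def prepare_volume(volume_root: str, gid: int) -> None:
    """Hypothetical helper: give a whole volume to `gid` (steps 1 and 2)."""
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(volume_root):
        # Step 1 for the directory: change only the group (-1 keeps the owner).
        os.chown(dirpath, -1, gid)
        # Step 2: set the setgid bit so new entries inherit the group.
        # This is done after chown, since chown may clear the bit.
        mode = stat.S_IMODE(os.stat(dirpath).st_mode)
        os.chmod(dirpath, mode | stat.S_ISGID)
        for name in filenames:
            # Step 1 for files; symlinks themselves are changed, not targets.
            os.chown(os.path.join(dirpath, name), -1, gid,
                     follow_symlinks=False)
```

The cost is one `chown` per entry plus one `chmod` per directory, so it is
still proportional to the size of the volume, but it would be paid only the
first time the volume is used.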

When a second container tries to use the same volume, the volume ACL manager
(we may give it another name) will just return the previously allocated GID
to the volume isolator, with no need to do anything to the volume. The volume
ACL manager needs to maintain a reference count for each volume, tracking
which containers are using it, and deallocate the GID when no container is
using the volume (i.e., reference count == 0). What do you think?
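
A rough Python sketch of the reference counting described above follows. The
class name `VolumeGidManager`, its method names, and the GID range are all
hypothetical and used only for illustration.

```python
class VolumeGidManager:
    """Hypothetical sketch: one GID per volume, shared by its containers."""

    def __init__(self, gid_range):
        self._free = list(gid_range)  # GIDs not assigned to any volume
        self._gids = {}               # volume path -> allocated GID
        self._users = {}              # volume path -> set of container IDs

    def acquire(self, volume, container_id):
        """Return the volume's GID, allocating one on first use."""
        if volume not in self._gids:
            if not self._free:
                raise RuntimeError("no free GID left in the range")
            self._gids[volume] = self._free.pop(0)
            self._users[volume] = set()
        self._users[volume].add(container_id)
        return self._gids[volume]

    def release(self, volume, container_id):
        """Drop one user; free the GID when the last container is gone."""
        users = self._users.get(volume)
        if users is None:
            return
        users.discard(container_id)
        if not users:  # reference count == 0
            self._free.append(self._gids.pop(volume))
            del self._users[volume]
```

Tracking the set of container IDs rather than a bare counter keeps a repeated
`acquire` or `release` for the same container from skewing the count.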

And what did you mean by "serialize container cleanup"?

> It seems like ACL evaluation will mean that this scheme will only mostly
> work. For example, if the container process UID matches a user ACE, then
> access could be denied independently of the volume policy.

Did you mean the case where the supplementary group of the container process
is allowed to write to the volume (e.g., rwx) but the container process UID
is not (e.g., r-x), so the container cannot write to the volume, which is not
what we expect?
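
For reference, here is a simplified Python model of the access check order
described in the Linux acl(5) man page, which shows why this case can happen:
a matching user entry is evaluated before any group entry, so the group
entries are never consulted. The function name and the dictionary layout are
made up for the sketch, and it assumes the ACL has a mask entry.

```python
def acl_allows(uid, gids, acl, want):
    """Simplified model of the acl(5) access check (assumes a mask entry)."""
    want = set(want)
    mask = set(acl["mask"])
    # 1. The file owner is checked against the owner entry only.
    if uid == acl["owner"]:
        return want <= set(acl["owner_perms"])
    # 2. A matching named-user entry decides the result; groups are ignored.
    if uid in acl["users"]:
        return want <= (set(acl["users"][uid]) & mask)
    # 3. Otherwise any matching group entry (owning or named) may grant it.
    matching = [perms for gid, perms in acl["groups"].items() if gid in gids]
    if matching:
        return any(want <= (set(perms) & mask) for perms in matching)
    # 4. No user or group entry matched: fall back to "other".
    return want <= set(acl["other"])


# Hypothetical volume: GID 5000 has rwx, but UID 1000 has a user entry r-x.
acl = {"owner": 0, "owner_perms": "rwx", "users": {1000: "rx"},
       "groups": {5000: "rwx"}, "mask": "rwx", "other": ""}
denied = acl_allows(1000, {5000}, acl, "w")   # user entry wins
granted = acl_allows(1001, {5000}, acl, "w")  # only the group entry matches
```

With these example values, the write by UID 1000 is denied even though its
supplementary group 5000 has rwx, while UID 1001 with the same group is
allowed.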

> Will the VolumeAclManager apply a default ACL on the root of the volume?
> Does this imply that when it updates the ACEs for the container GID, it
> also needs to update the default ACLs on all directories?

Currently I do not think we need to set the default ACL on the volume root
dir.



Regards,
Qian Zhang

On Fri, Aug 17, 2018 at 12:38 AM, James Peach <jpeach@apache.org> wrote:

>
>
> > On Aug 15, 2018, at 6:22 PM, Qian Zhang <zhq527725@gmail.com> wrote:
> >
> > Hi Folks,
> >
> > We found some issues for the solutions of this project and propose a
> > better one, see here
> > <https://docs.google.com/document/d/1QyeDDX4Zr9E-0jKMoPTzsGE-v4KWwjmnCR0l8V4Tq2U/edit#heading=h.tjuy5xk67tuu>
> > for details. Please let me know if you have any comments, thanks!
>
> Some general comments.
>
> I assume that this scheme will only be supported on Linux, due to the
> dependencies on the Linux ACLs and supplementary group behaviour?
>
> Rewriting ACLs on volumes at each container launch sounds hugely
> expensive. It's IOP-bound process and there are an effectively unbounded
> number of files in the volume. Would this serialize container cleanup?
>
> It seems like ACL evaluation will mean that this scheme will only mostly
> work. For example, if the container process UID matches a user ACE, then
> access could be denied independently of the volume policy.
>
> Will the VolumeAclManager apply a default ACL on the root of the volume?
> Does this imply that when it updates the ACEs for the container GID, it
> also needs to update the default ACLs on all directories?
>
> >
> >
> > Regards,
> > Qian Zhang
> >
> > On Sat, Apr 28, 2018 at 7:57 AM, Qian Zhang <zhq527725@gmail.com> wrote:
> >
> >>> The framework launched tasks in a group with different users? Sounds
> >>> like they dug their own hole :)
> >>
> >> So you mean we should actually put a best practice or limitation in doc:
> >> when launching a task group with multiple tasks to share a SANDBOX
> >> volume of PARENT type, all the tasks should be run with the same user,
> >> and that user must be same with the user to launch the executor?
> >> Otherwise the task will not be able to write to the volume.
> >>
> >>> I'd argue that the "rw" on the sandbox path is analogous to the "rw"
> >>> mount option. That is, it is mounted writeable, but says nothing about
> >>> which credentials can write to it.
> >>
> >> Can you please elaborate a bit on this? What would you suggest for the
> >> "rw" volume mode?
> >>
> >>
> >> Regards,
> >> Qian Zhang
> >>
> >> On Fri, Apr 27, 2018 at 12:07 PM, James Peach <jorgar@gmail.com> wrote:
> >>
> >>>
> >>>
> >>>> On Apr 26, 2018, at 7:25 PM, Qian Zhang <zhq527725@gmail.com> wrote:
> >>>>
> >>>> Hi James,
> >>>>
> >>>> Thanks for your comment!
> >>>>
> >>>> I think you are talking about the SANDBOX_PATH volume ownership issue
> >>>> mentioned in the design doc
> >>>> <https://docs.google.com/document/d/1QyeDDX4Zr9E-0jKMoPTzsGE-v4KWwjmnCR0l8V4Tq2U/edit#heading=h.s6f8rmu65g2p>,
> >>>> IIUC, you prefer to leaving it to framework, i.e., framework itself
> >>>> ought to be able to handle such issue. But I am curious how framework
> >>>> can handle it in such situation. If the framework launches a task group
> >>>> with different users and with a SANDBOX_PATH volume of PARENT type, the
> >>>> tasks in the group will definitely fail to write to the volume due to
> >>>> the ownership issue though the volume's mode is set to "rw". So in this
> >>>> case, how should framework handle it?
> >>>
> >>> The framework launched tasks in a group with different users? Sounds
> >>> like they dug their own hole :)
> >>>
> >>> I'd argue that the "rw" on the sandbox path is analogous to the "rw"
> >>> mount option. That is, it is mounted writeable, but says nothing about
> >>> which credentials can write to it.
> >>>
> >>>> And if we want to document it, what is our recommended
> >>>> solution in the doc?
> >>>>
> >>>>
> >>>>
> >>>> Regards,
> >>>> Qian Zhang
> >>>>
> >>>> On Fri, Apr 27, 2018 at 1:16 AM, James Peach <jpeach@apache.org> wrote:
> >>>>
> >>>>> I commented on the doc, but at least some of the issues raised there
> >>>>> I would not regard as issues. Rather, they are about setting
> >>>>> expectations correctly and ensuring that we are documenting (and maybe
> >>>>> enforcing) sensible behavior.
> >>>>>
> >>>>> I'm not that keen on Mesos automatically "fixing" filesystem
> >>>>> permissions and we should proceed down that path with caution,
> >>>>> especially in the ACLs case.
> >>>>>
> >>>>>> On Apr 10, 2018, at 3:15 AM, Qian Zhang <zhq527725@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi Folks,
> >>>>>>
> >>>>>> I am working on MESOS-8767 to improve Mesos volume support regarding
> >>>>>> volume ownership and permission, here is the design doc. Please feel
> >>>>>> free to let me know if you have any comments/feedbacks, you can reply
> >>>>>> this mail or comment on the design doc directly. Thanks!
> >>>>>>
> >>>>>>
> >>>>>> Regards,
> >>>>>> Qian Zhang
> >>>>>
> >>>>>
> >>>
> >>>
> >>
>
>
