Return-Path: Delivered-To: apmail-db-derby-dev-archive@www.apache.org Received: (qmail 29262 invoked from network); 14 May 2009 05:01:02 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 14 May 2009 05:01:02 -0000 Received: (qmail 21072 invoked by uid 500); 14 May 2009 05:01:01 -0000 Delivered-To: apmail-db-derby-dev-archive@db.apache.org Received: (qmail 21002 invoked by uid 500); 14 May 2009 05:01:01 -0000 Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: Delivered-To: mailing list derby-dev@db.apache.org Received: (qmail 20994 invoked by uid 99); 14 May 2009 05:01:01 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 14 May 2009 05:01:01 +0000 X-ASF-Spam-Status: No, hits=2.2 required=10.0 tests=HTML_MESSAGE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of hiranya911@gmail.com designates 74.125.92.26 as permitted sender) Received: from [74.125.92.26] (HELO qw-out-2122.google.com) (74.125.92.26) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 14 May 2009 05:00:53 +0000 Received: by qw-out-2122.google.com with SMTP id 5so870435qwd.13 for ; Wed, 13 May 2009 22:00:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type; bh=wR5SaBeHIYK8H/sdTE7H9GveiOwbPKt+Y8O5zh3euwA=; b=WDL2OVzTs2M5rmBIh3OuRpAfiwPWDQ6XuLiwpuoa06JV1eC/QMfU/A3c0C4KcMcu2d 36hM5OjyTmRDKDd3nfp7ZuJaoz+y18EmuynVeyOoKadwKfIx3PbDLlUdU+LONiu+pAgE JmCdeZ9FtCOOY6J0mGE+XhMUsywwbIRlV1u90= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=o395oEScytezA5dIRQ5Nw4pGqmsIhfb/96lGhLhhRRxTd+7GyefpFLuTddvxSkvbu4 eWIN/FL7OCOKu5lRKDxSPTZwkPrCpF1PPjp+Iht6hGbKObhBe5yhQnpfDsK52QeAUH+Z OKRfyf/YLJ4OHNEtP7dBCkC+t4IoLj9IJIwBo= MIME-Version: 1.0 Received: by 10.229.91.13 with SMTP id k13mr1367348qcm.98.1242277231954; Wed, 13 May 2009 22:00:31 -0700 (PDT) In-Reply-To: References: <558af1840905092155t7c0fb4a3rea54ea713e0080ff@mail.gmail.com> Date: Thu, 14 May 2009 10:30:31 +0530 Message-ID: <558af1840905132200kc1d7fb2y9a9ab75fcbb5d986@mail.gmail.com> Subject: Re: Wiki Page for dblook SQL Authorization Support Project From: Hiranya Jayathilaka To: derby-dev@db.apache.org Content-Type: multipart/alternative; boundary=0016361e81341f8e590469d83541 X-Virus-Checked: Checked by ClamAV on apache.org --0016361e81341f8e590469d83541 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Hi Dag, Thanks a lot for taking time to go through the wiki page and providing your valuable insight. I'll soon update the wiki with the things you have mentioned here. See my comments in-line. On Wed, May 13, 2009 at 9:59 PM, Dag H. Wanvik wrote: > > Hi Hiranya, > > Hiranya Jayathilaka writes: > > > Hi Folks, > > > > I'll be using the wiki page at [1] as the documentation space for dblook > SQL > > Authorization support implementation, during the design phase of the > > project. I've already updated the wiki page with some information that > I've > > discussed with Dag off-line. Feel free to add your own ideas to the wiki > > page regarding this project and also feel free to discuss the matters > > mentioned in the wiki on the mailing list. > > > > Your feedback and comments are most appreciated. > > Thanks for starting a wiki page for this design, Hiranya! This is a > good way to go. I think the approach looks good. > > Here are some answers to your questions and comments. > > > Design > > > > We intend to use a directed graph as the means of tracking and > > representing the dependencies among various persistent objects in a > > database. Let's call it the dblook dependency graph. When dblook is > > fired off against a database, it would create the dependency graph in > > memory and 'walk' the graph step-by-step while producing DDL > > statements required to reconstruct those objects. > > To do this, we have to use several connections, one for each > authorized user that owns database objects, so that they can be > recreated with the correct owner. It could be useful to keep them open > (or you could open them on demand; slower) since object creation for > one user could be interleaved with an other user's objects, if > dependencies go in both directions between users. But this is probably > not so common that it warrants optimizing of connection openings? > > > > > All persistent objects of a database would be vertices in the > > dependency graph. If A and B are two persistent objects in a database > > and if B is a dependent object of A then there will be a directed edge > > from A to B in the dependency graph (A --> B). Information required to > > construct the dependency graph will be fetched from the system tables > > (specially from the SYSDEPENDS table which effectively captures all > > such dependencies among persistent objects). The graph construction > > algorithm should also associate each vertex with a database user as > > the owner. (what is the right way to find the object owner?) > > For example, for a view, the SYSVIEWS table contains a reference to > the owning schema in the column COMPILATIONSCHEMAID. The system table > for schemas, SYSSCHEMAS, contains AUTHORIZATIONID for each > schema. This is the owning user user (since Derby does not yet support > a schema being owned by a role, although the SQL standard does allow > that). So joining COMPILATIONSCHEMAID with SYSSCHEMAS.SCHEMAID gives > the owning user of a view. The case is similar for triggers and > constraints. Thanks. That answers it :) > > > > In addition it should capture all the permissions associated with > > each object and the users involved with those permissions. > > More precisely, all permissions granted for an object, be it to a > user, to a role or to PUBLIC, should be recorded with the object, so > those permissions can be granted once the object has been recreated. > > If a persistent object depends on a role, that should be recorded as > well, so that the needed role can be set before recreating the object. +1 > > > > > > Once we have the full dblook dependency graph in memory we can create > > all the roles required (should be done as the dbo). > > And all the role grants should be performed in this step, that is, > grant to users, to other roles and to PUBLIC. Only dbo can do these > things presently. Note that in order to do this step successfully, you > need to create a GRANT dependency graph of the roles' > interrelationships as well. Yes. Thanks for pointing that out. So I guess we need to add this as a step of our overall execution (may be prior to creating the dependency graph?). I think we can find all the necessary data by just looking at the SYSROLES table to do this. > > > > > When dblook finally walks the graph it will need to produce an > > authentication statement prior to producing the actual DDL statement > > required to create each object. > > I am not sure what you mean by authentication statement here; I think > we need to open a connection per user that owns a persistent object > and use that connection to create that object. Yes. That's what I meant. Basically we need to generate a statement which establishes the necessary database connection, when executed. I'll rephrase this on the wiki so it's more clear to the reader. > So the walk would have > to switch between the different connections as needed? In addition, if > this persistent object ("soon-to-be-created") depends on a role, the > connection should SET that role first, too, so we can make sure we > have the required permissions to create the object. > > > For an example if there is an object O associated with the user U in > > I assume you mean "owned by" user U here. +1 > > > > the graph, dblook will first output an authentication statement for > > user U before producing the DDL statement for O. After producing the > > creation statement necessary grant statements should be created > > related to the object. > > Yes, this seems the right time! > > > Basically all the permissions related to the object should be > > granted to the associated users. > > .. and roles and PUBLIC. > > > > Walking the graph should start from a node which does not have any > > in bound edges (i.e. it does not depend on any other object). As the > > graph walk continues objects are removed from the graph along with > > all edges incident on them. The walk continues until the graph's set > > of vertices is empty. > > Dependency Graph Design > > > > The datastructures used to store the elements of the dependency graph > > should be lightweight. This graph will be held in memory and its size > > will be proportional to the number of persistent objects in the > > database. Therefore larger the database, larger the dependency > > graph. We need to be conservative in memory usage to efficiently deal > > with large databases. > > > > Each vertex in the graph should store a set of in bound edges, set of > > out bound edges, an associated user ID (might also need to store > > authentication credentials of the user) and a set of permissions. > > Yes, and you will need to store if an object depends on a role, too. > The set of permissions is those that should be granted for that object > (vertex; I presume you intend for the vertex to represent a persistent > object here?) Indeed. Each persistent object will be a vertex in the graph. Thanks, Hiranya > > Dag > -- Hiranya Jayathilaka E-mail: hiranya@apache.org; Mobile: +94 77 633 3491 Blog: http://techfeast-hiranya.blogspot.com --0016361e81341f8e590469d83541 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi Dag,

Thanks a lot for taking time to go through the wiki page and= providing your valuable insight. I'll soon update the wiki with the th= ings you have mentioned here. See my comments in-line.

On Wed, May 13, 2009 at 9:59 PM, Dag H. Wanvik <Dag.Wanvik@sun.com> wrote:

Hi Hiranya,

Hiranya Jayathilaka <hiranya911@= gmail.com> writes:

> Hi Folks,
>
> I'll be using the wiki page at [1] as the documentation space for = dblook SQL
> Authorization support implementation, during the design phase of the > project. I've already updated the wiki page with some information = that I've
> discussed with Dag off-line. Feel free to add your own ideas to the wi= ki
> page regarding this project and also feel free to discuss the matters<= br> > mentioned in the wiki on the mailing list.
>
> Your feedback and comments are most appreciated.

Thanks for starting a wiki page for this design, Hiranya! This is a good way to go. =A0I think the approach looks good.

Here are some answers to your questions and comments.

> Design
>
> We intend to use a directed graph as the means of tracking and
> representing the dependencies among various persistent objects in a > database. Let's call it the dblook dependency graph. When dblook i= s
> fired off against a database, it would create the dependency graph in<= br> > memory and 'walk' the graph step-by-step while producing DDL > statements required to reconstruct those objects.

To do this, we have to use several connections, one for each
authorized user that owns database objects, so that they can be
recreated with the correct owner. It could be useful to keep them open
(or you could open them on demand; slower) since object creation for
one user could be interleaved with an other user's objects, if
dependencies go in both directions between users. But this is probably
not so common that it warrants optimizing of connection openings?

>
> All persistent objects of a database would be vertices in the
> dependency graph. If A and B are two persistent objects in a database<= br> > and if B is a dependent object of A then there will be a directed edge=
> from A to B in the dependency graph (A --> B). Information required= to
> construct the dependency graph will be fetched from the system tables<= br> > (specially from the SYSDEPENDS table which effectively captures all > such dependencies among persistent objects). The graph construction > algorithm should also associate each vertex with a database user as > the owner. (what is the right way to find the object owner?)

For example, for a view, the SYSVIEWS table contains a reference to
the owning schema in the column COMPILATIONSCHEMAID. =A0The system table for schemas, SYSSCHEMAS, contains AUTHORIZATIONID for each
schema. This is the owning user user (since Derby does not yet support
a schema being owned by a role, although the SQL standard does allow
that). So joining COMPILATIONSCHEMAID with SYSSCHEMAS.SCHEMAID gives
the owning user of a view. The case is similar for triggers and
constraints.

Thanks. That answers it :)


> In addition it should capture all the permissions associated with
> each object and the users involved with those permissions.

More precisely, all permissions granted for an object, be it to a
user, to a role or to PUBLIC, should be recorded with the object, so
those permissions can be granted once the object has been recreated.

If a persistent object depends on a role, that should be recorded as
well, so that the needed role can be set before recreating the object.

+1
=A0


>
> Once we have the full dblook dependency graph in memory we can create<= br> > all the roles required (should be done as the dbo).

And all the role grants should be performed in this step, that is,
grant to users, to other roles and to PUBLIC. Only dbo can do these
things presently. Note that in order to do this step successfully, you
need to create a GRANT dependency graph of the roles'
interrelationships as well.

Yes. Thanks for pointing t= hat out. So I guess we need to add this as a step of our overall execution = (may be prior to creating the dependency graph?). I think we can find all t= he necessary data by just looking at the SYSROLES table to do this.



>
> When dblook finally walks the graph it will need to produce an
> authentication statement prior to producing the actual DDL statement > required to create each object.

I am not sure what you mean by authentication statement here; I think
we need to open a connection per user that owns a persistent object
and use that connection to create that object.

Yes. T= hat's what I meant. Basically we need to generate a statement which est= ablishes the necessary database connection, when executed. I'll rephras= e this on the wiki so it's more clear to the reader.
=A0
So the w= alk would have
to switch between the different connections as needed? In addition, if
this persistent object ("soon-to-be-created") depends on a role, = the
connection should SET that role first, too, so we can make sure we
have the required permissions to create the object.

> For an example if there is an object O associated with the user U in
I assume you mean "owned by" user U here.

+1=
=A0
=

> the graph, dblook will first output an authentication statement for > user U before producing the DDL statement for O. After producing the > creation statement necessary grant statements should be created
> related to the object.

Yes, this seems the right time!

> Basically all the permissions related to the object should be
> granted to the associated users.

.. and roles and PUBLIC.


> Walking the graph should start from a node which does not have any
> in bound edges (i.e. it does not depend on any other object). As the > graph walk continues objects are removed from the graph along with
> all edges incident on them. The walk continues until the graph's s= et
> of vertices is empty.

Dependency Graph Design
>
> The datastructures used to store the elements of the dependency graph<= br> > should be lightweight. This graph will be held in memory and its size<= br> > will be proportional to the number of persistent objects in the
> database. Therefore larger the database, larger the dependency
> graph. We need to be conservative in memory usage to efficiently deal<= br> > with large databases.
>
> Each vertex in the graph should store a set of in bound edges, set of<= br> > out bound edges, an associated user ID (might also need to store
> authentication credentials of the user) and a set of permissions.

Yes, and you will need to store if an object depends on a role, too.
The set of permissions is those that should be granted for that object
(vertex; I presume you intend for the vertex to represent a persistent
object here?)

Indeed. Each persistent object will be a= vertex in the graph.

Thanks,
Hiranya



Dag



--
Hiranya Jayathil= aka
E-mail: hiranya@apache.org= ; =A0Mobile: +94 77 633 3491
Blog: http://techfeast-hiranya.blogspot.com
--0016361e81341f8e590469d83541--